Closed kimj50 closed 4 years ago
Hello Jun,
Nice catch! What you reported is not the intended output; it will take some time to look into.
For hicAdjustMatrix, I guess you can run it on each chromosome one by one for the time being. For hicPCA, I think you just need to run hicAdjustMatrix on the Pearson matrix. This should be more streamlined in the next patch.
Hi Jun, thanks for using our latest version. May I ask you to plot the first file chromosome by chromosome and see if it still looks the same? I wonder if it has something to do with the plotting. If you use an adjusted matrix to make the Pearson matrix, hicPCA should generate the Pearson matrix on the adjusted coordinates and compute the PCA over the adjusted ones as well. Is that what you did? Could you please try to output and plot the Pearson matrix from hicPCA and see if it looks the same when you input an adjusted matrix? Thanks!
"May I ask you to plot the first file chromosome by chromosome and see if it still looks the same? I wonder if it has something to do with the plotting." - I think it definitely has to do with the plotting, because if I plot each chromosome separately, the plot looks fine.
"If you use an adjusted matrix to make the Pearson matrix, hicPCA should generate the Pearson matrix on the adjusted coordinates and compute the PCA over the adjusted ones as well. Is that what you did?" - Yes. In my first post, the pm is the -pm output from hicPCA.
"Could you please try to output and plot the Pearson matrix from hicPCA and see if it looks the same when you input an adjusted matrix?" - It seems the issue with the -pm output of hicPCA isn't just a plotting problem.
On a side note (not surprisingly), the PCA values of the two cases are the same:
adjustmatrix (chrI) -> adjustmatrix (center) -> hicPCA
I 0 4650000 0.030266865887
I 4650000 4660000 0.036209768885
I 4660000 4670000 0.028621320046
I 4670000 4680000 0.024591755204
I 4680000 4690000 0.019458173186
I 4690000 4700000 0.029596640952
I 4700000 4710000 0.027017999888
I 4710000 4720000 -0.007823169806
I 4720000 4730000 -0.002924065376
I 4730000 4740000 -0.003656820642
adjustmatrix (center of every chromosome) -> hicPCA
I 0 4650000 0.030266865887
I 4650000 4660000 0.036209768885
I 4660000 4670000 0.028621320046
I 4670000 4680000 0.024591755204
I 4680000 4690000 0.019458173186
I 4690000 4700000 0.029596640952
I 4700000 4710000 0.027017999888
I 4710000 4720000 -0.007823169806
I 4720000 4730000 -0.002924065376
I 4730000 4740000 -0.003656820642
Thanks! - Jun
So the above matrices are the Pearson correlation matrices you got as output of hicPCA on an adjusted matrix? I have just used version 3.4.3 and could not reproduce this issue. Could you please send me the commands you used to generate your adjusted matrix and then the PCA values?
Hi, I'm still using 3.4.2. I forgot to mention that the matrix comes from a .hic file, converted with hicConvertMatrix, but normalized with the ICE method in hicexplorer.
hicAdjustMatrix -m matrix.h5 \
  -r regions.txt \
  --action keep \
  -o matrix_regions.h5
hicPCA -m matrix_regions.h5 \
  -noe 1 \
  -f bedgraph \
  --method dist_norm \
  -pm .matrix_regions_norm_pm.h5 \
  --ignoreMaskedBins \
  -o matrix_regions_pca1.bedgraph
hicPlotMatrix -m matrix_regions_norm_pm.h5 -o regions_pm.png
hicPlotMatrix -m matrix_regions.h5 -o regions.png --log1p
First 3 lines of matrix_regions_pca1.bedgraph:
I 0 4650000 -0.022717729954
I 4650000 4700000 -0.041453508754
I 4700000 4750000 0.001597827105
Two things I can think of: 1) Could you please try to plot one chromosome at a time? 2) Am I right that your point is that you did not keep the beginning of chromosome I, but PC values were still assigned to those coordinates?
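One way to check point 2 without plotting is to look for bins in the bedgraph whose width differs from the usual bin size. The snippet below is only a sketch using the Python standard library. `parse_bedgraph` and `oversized_bins` are hypothetical helper names, not part of HiCExplorer, and the code assumes plain four-column bedgraph lines without a track header.

```python
from collections import Counter


def parse_bedgraph(text):
    """Parse four-column bedgraph text into (chrom, start, end, value) tuples."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        chrom, start, end, value = fields
        rows.append((chrom, int(start), int(end), float(value)))
    return rows


def oversized_bins(rows):
    """Return rows whose width differs from the most common bin width."""
    if not rows:
        return []
    widths = Counter(end - start for _, start, end, _ in rows)
    bin_size = widths.most_common(1)[0][0]
    return [row for row in rows if row[2] - row[1] != bin_size]


# First three lines of matrix_regions_pca1.bedgraph as posted in this thread.
example = """\
I 0 4650000 -0.022717729954
I 4650000 4700000 -0.041453508754
I 4700000 4750000 0.001597827105
"""

for row in oversized_bins(parse_bedgraph(example)):
    print(row)
```

On the three posted lines this flags only the first row, the one that starts at 0 and spans 4650000 bp while the other bins span 50000 bp.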
I have tried the following and I cannot reproduce your issue:
hicexplorer --version
hicexplorer 3.4.3
hicAdjustMatrix -m matrix.h5 -r regions2keep.bed --action keep -o matrix_regions.h5
hicPCA -m matrix_regions.h5 -o pc1.bw pc2.bw --method dist_norm --chromosomes 2L 2R 3L 3R X --pearsonMatrix pearson_matrix.h5 --extraTrack h3k27ac.bw --histonMarkType active
hicPlotMatrix -m pearson_matrix.h5 -o pearson_2R.png --region 2R --vMin -1 --vMax 1 --colorMap RdBu_r
hicPlotMatrix -m matrix_regions.h5 -o matrix_2R.png --region 2R --log1p
If I save the PC as bedgraph I see:
2R 5790000 5820000 -0.055295331700
2R 5820000 5850000 -0.050785457847
2R 5850000 5880000 -0.053331785979
2R 5880000 5910000 -0.049209458601
2R 5910000 5940000 -0.057686252053
2R 5940000 5970000 -0.056164342927
so all the coordinates are fine and I cannot reproduce your problem.
Hi,
"I'm still using 3.4.2. I forgot to mention that the matrix comes from .hic file, converted from hicConvertMatrix, but normalized using ICE method using hicexplorer."
This is actually very important information. I recently fixed a bug in the load and store functions for cool files converted from hic. Please make sure you have HiCMatrix version 13 installed, and do the conversion from the hic file again.
Best,
Joachim
Hi, I updated to 3.4.3 and started from the .hic file.
hicexplorer --version
hicexplorer 3.4.3
-rw-r----- 1 kimj50 users 10836 Jan 23 11:45 conda-meta/hicmatrix-11-py_0.json.c~
-rw-r----- 1 kimj50 users 10794 May 22 20:22 conda-meta/hicmatrix-13-py_0.json
My conda-meta directory seems to have both 11 and 13. Could hicexplorer be using the older version?
I 0 10130000 -0.013724358038
I 10130000 10140000 -0.052584451939
I 10140000 10150000 -0.028132542570
I 10150000 10160000 -0.019077970874
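To see which package versions the active Python environment actually resolves, rather than inferring it from files in conda-meta, something like the following could be run with the same interpreter that runs hicPCA. This is a sketch only: `installed_version` is a hypothetical helper, and the distribution names "hicexplorer" and "hicmatrix" are assumed to match how the packages were installed.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string for a distribution, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Distribution names are assumptions; adjust them if your install uses other names.
for name in ("hicexplorer", "hicmatrix"):
    print(name, installed_version(name))
```

If the hicmatrix line does not report 13, the environment is probably not using the fixed load and store code mentioned above.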
Concerning the different versions, please use a new conda environment:
conda create --name hic3.4.3 hicexplorer=3.4.3 hicmatrix=13
and activate it via
conda activate hic3.4.3
We have published version 3.5 with many bug fixes. Please reopen if this bug still exists in that version.
Hi, I am using the latest version 3.7.2. In my case, this problem still exists. I used hicAdjustMatrix to mask the first half of one chromosome.
The bed file (addedauto.bed) used for masking:
OW028702.1 0 18450000
However, in the hicPCA result:
OW028702.1 0 18550000 0.045427123859
OW028702.1 18550000 18600000 0.048286874344
OW028702.1 18600000 18650000 0.062323953915
....
My commands:
hicAdjustMatrix -m corrected.h5 --regions addedauto.bed -a mask -o addedsex.h5
hicPCA -m addedsex.h5 -o addedsex.h5.pc1.bed -we 1 -f bedgraph --ignoreMaskedBins
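The mismatch can be confirmed mechanically by listing the PCA bins that overlap a masked interval. The sketch below hard-codes the values posted in this report. `overlapping_bins` is a hypothetical helper, not a HiCExplorer function, and intervals are treated as half-open (start inclusive, end exclusive), which is the usual BED convention.

```python
def overlapping_bins(bins, masked):
    """Return bins (chrom, start, end, value) that overlap any masked interval."""
    hits = []
    for chrom, start, end, value in bins:
        for m_chrom, m_start, m_end in masked:
            # Half-open intervals overlap when each starts before the other ends.
            if chrom == m_chrom and start < m_end and m_start < end:
                hits.append((chrom, start, end, value))
                break
    return hits


# Masked region from addedauto.bed and the first hicPCA rows from this report.
masked = [("OW028702.1", 0, 18450000)]
bins = [
    ("OW028702.1", 0, 18550000, 0.045427123859),
    ("OW028702.1", 18550000, 18600000, 0.048286874344),
    ("OW028702.1", 18600000, 18650000, 0.062323953915),
]

for hit in overlapping_bins(bins, masked):
    print(hit)
```

Only the first row is reported: it starts at 0 and therefore covers the whole masked interval, even though that region was masked before hicPCA ran.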
Hi, Thank you again for the update : ). I've been playing with the new version (3.4) and I noticed a few things...
First file: hicAdjustMatrix --region (center regions of all chromosomes) --action keep
It seems like 'keep' doesn't completely remove the beginning of the chromosomes after the first chromosome?
Second file: hicAdjustMatrix --chromosome > hicAdjustMatrix --region (center region of the specific chromosome) --action keep
If I do the same on a single chromosome only, it seems to work well.
On the second matrix (center of a single chromosome) > hicPCA, bedgraph output:
I 0 4650000 0.030266865887
I 4650000 4660000 0.036209768885
I 4660000 4670000 0.028621320046
hicPCA seems to always start from 0. I can simply fix this by substituting 0 with 4640000. But is the PCA computed properly? I ask because the Pearson matrix also looks funny, despite the original matrix looking normal:
Thank you! - Jun
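The substitution mentioned above (replacing the leading 0 with 4640000) could be scripted roughly as follows. This is a sketch of the cosmetic workaround only. `trim_leading_bins` is a hypothetical helper, and it assumes the value in the oversized first row belongs to the single bin that ends at that row's end coordinate, which this thread does not confirm. It does not answer whether the PCA itself is computed correctly.

```python
from collections import Counter


def trim_leading_bins(rows):
    """Shrink rows that start at 0 and are wider than the usual bin to one bin width."""
    if not rows:
        return []
    widths = Counter(end - start for _, start, end, _ in rows)
    bin_size = widths.most_common(1)[0][0]
    fixed = []
    for chrom, start, end, value in rows:
        if start == 0 and end - start > bin_size:
            start = end - bin_size  # assumption: the value belongs to the last bin
        fixed.append((chrom, start, end, value))
    return fixed


# Rows taken from the bedgraph output posted above.
rows = [
    ("I", 0, 4650000, 0.030266865887),
    ("I", 4650000, 4660000, 0.036209768885),
    ("I", 4660000, 4670000, 0.028621320046),
]

for row in trim_leading_bins(rows):
    print(row)
```

With the posted rows, the first bin becomes I 4640000 4650000 and the others are left unchanged.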