deeptools / deepTools

Tools to process and analyze deep sequencing data.

bw files "No common chromosomes found" when it's from the same species and assemblies #1319

Closed ShannonTown closed 1 month ago

ShannonTown commented 4 months ago


I have two bigWig files downloaded from GEO, accessions GSM6757703 and GSM6757712.

When I run computeMatrix and plotHeatmap on these two bw files together with some other bw files, I get the error "No common chromosomes found", although I'm sure they are all from the same species and assembly (dm6). If I remove these two bw files from the list, it works. If I run these two bw files alone, it also works to some degree, except that it skips 90% of the regions in my bed file (which works for the other bw files). These regions apparently exist: they look normal in the IGV browser alongside my other bw files.

Would anyone please help me check whether these two bw files are problematic and, if so, how to fix them? Thank you very much in advance!

Error when these two bw files are processed with my other bw files:

No common chromosomes found. Are the bigwig files from the same species and same assemblies?
chromosome      length
           chr4    1348131
          chr3L   28110227
           chrX   23542271
           chrY    3667352
          chr2L   23513712
          chr3R   32079331
          chr2R   25286936

and the following is the list of the unmatched chromosome and chromosome
lengths from file
../result_computeMatrix/bw/GSM6757703_orer_1to15_atac_7.bw
chromosome      length
chrX_CP007103v1_random       33304
chrUn_DS484407v1              2156
chrX_DS485044v1_random        1263
chrUn_DS485820v1              1041
chrX_DS484664v1_random        1690
chrY_DS484175v1_random        2705
chrX_DS485164v1_random        1217
........... (many lines)

Error when these two bw files are processed alone:

The following chromosome names did not match between the bigwig files
chromosome  length
chrY_DS485764v1_random        1052
chrUn_DS483808v1          5046
chrUn_DS485330v1          1161
........ (many lines)
Skipping chrX:107843-109755, due to being absent in the computeMatrix output.
Skipping chrX:114047-116407, due to being absent in the computeMatrix output.
Skipping chr3L:216625-218058, due to being absent in the computeMatrix output.
Skipping chr3L:217932-223069, due to being absent in the computeMatrix output.
Skipping chr4:221425-222508, due to being absent in the computeMatrix output.
Skipping chr4:226318-226986, due to being absent in the computeMatrix output.
Skipping chr3L:247491-250305, due to being absent in the computeMatrix output.
Skipping chr4:268175-276369, due to being absent in the computeMatrix output.
Skipping chr4:310930-314087, due to being absent in the computeMatrix output.
Skipping chrX:318526-323554, due to being absent in the computeMatrix output.
Skipping chr3L:343378-346957, due to being absent in the computeMatrix output.
Skipping chr4:344055-346552, due to being absent in the computeMatrix output.
Skipping chr4:431771-435221, due to being absent in the computeMatrix output.
Skipping chrX:496597-497374, due to being absent in the computeMatrix output.
Skipping chrX:509435-528028, due to being absent in the computeMatrix output.
Skipping chrX:513106-523226, due to being absent in the computeMatrix output.
........ (many lines)

Here are the chromosomes from the two bw files and one of my other bw files:

> bigWigInfo ../result_computeMatrix/bw/GSM6757703_orer_1to15_atac_7.bw -chroms | grep -v _

version: 4
isCompressed: yes
isSwapped: 0
primaryDataSize: 118,893,817
primaryIndexSize: 1,116,460
zoomLevels: 10
chromCount: 1866
        chr2L 0 23513708
        chr2R 1 25286813
        chr3L 2 28110223
        chr3R 3 32079303
        chr4 4 1348051
        chrM 5 19480
        chrX 1219 23542260
        chrY 1666 3667348
basesCovered: 167,659,499
mean: 30.716409
min: 0.000000
max: 13084.000000
std: 88.534428

> bigWigInfo ../result_computeMatrix/bw/GSM6757712_orer_25to3_atac_7.bw -chroms | grep -v _

version: 4
isCompressed: yes
isSwapped: 0
primaryDataSize: 122,280,320
primaryIndexSize: 1,117,468
zoomLevels: 10
chromCount: 1865
        chr2L 0 23513708
        chr2R 1 25286766
        chr3L 2 28110218
        chr3R 3 32079324
        chr4 4 1348048
        chrM 5 15645
        chrX 1218 23542262
        chrY 1665 3667348
basesCovered: 170,229,219
mean: 33.683454
min: 0.000000
max: 5943.000000
std: 61.363906

> bigWigInfo ../result_computeMatrix/bw/ANOTHER.bw -chroms | grep -v _

version: 4
isCompressed: yes
isSwapped: 0
primaryDataSize: 48,036,868
primaryIndexSize: 439,536
zoomLevels: 10
chromCount: 7
        chr2L 0 23513712
        chr2R 1 25286936
        chr3L 2 28110227
        chr3R 3 32079331
        chr4 4 1348131
        chrX 5 23542271
        chrY 6 3667352
basesCovered: 137,547,960
mean: 0.000000
min: -0.498420
max: 459.999908
std: 1.000000

My bed file is also attached. bed.txt
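
The three listings above can be compared side by side. The following is a minimal Python sketch, not deepTools code: it assumes, as the error output suggests, that a chromosome only counts as "common" when both its name and its length agree across files. The lengths are copied from the `bigWigInfo` output in this issue; the function names `common_chromosomes` and `length_mismatches` are hypothetical helpers introduced for illustration.

```python
# Chromosome lengths copied from the bigWigInfo listings in this issue
# (scaffolds containing "_" were already filtered out by `grep -v _`).
GSM6757703 = {
    "chr2L": 23513708, "chr2R": 25286813, "chr3L": 28110223,
    "chr3R": 32079303, "chr4": 1348051, "chrM": 19480,
    "chrX": 23542260, "chrY": 3667348,
}
GSM6757712 = {
    "chr2L": 23513708, "chr2R": 25286766, "chr3L": 28110218,
    "chr3R": 32079324, "chr4": 1348048, "chrM": 15645,
    "chrX": 23542262, "chrY": 3667348,
}
ANOTHER = {
    "chr2L": 23513712, "chr2R": 25286936, "chr3L": 28110227,
    "chr3R": 32079331, "chr4": 1348131, "chrX": 23542271,
    "chrY": 3667352,
}


def common_chromosomes(*files):
    """Return the (name, length) pairs that appear in every file.

    Assumption: a chromosome is shared only if name AND length match.
    """
    return set.intersection(*(set(f.items()) for f in files))


def length_mismatches(reference, other):
    """Map each shared name to (reference_length, other_length) if they differ."""
    shared = sorted(reference.keys() & other.keys())
    return {
        name: (reference[name], other[name])
        for name in shared
        if reference[name] != other[name]
    }


if __name__ == "__main__":
    print(sorted(common_chromosomes(GSM6757703, GSM6757712, ANOTHER)))
    print(sorted(common_chromosomes(GSM6757703, GSM6757712)))
    for name, pair in length_mismatches(ANOTHER, GSM6757703).items():
        print(name, pair)
```

Under that assumption, no main chromosome is shared between the two GEO files and `ANOTHER.bw`, and only chr2L and chrY are shared between the two GEO files themselves. That would be consistent with the chrX, chr3L, and chr4 regions being skipped when the two files are processed alone.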

ShannonTown commented 4 months ago

I realized that the chromosome lengths, not only the names, have to be the same among the bw files. The two problematic bw files had different chromosome lengths, although they were mapped to the same assembly (dm6). So I converted the bw files to bedGraph, adjusted the chromosome lengths to match, and converted the bedGraph back to bw. Now everything works.
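
The middle step of that round trip might look roughly like the Python sketch below. The comment does not say which tools were used; one common choice (an assumption here) is UCSC's `bigWigToBedGraph` for the first step and `bedGraphToBigWig` with a shared chrom.sizes file for the last. The helper `clip_bedgraph` is hypothetical: it keeps only records on chromosomes listed in the target sizes and clips any record that runs past the declared end. It changes declared lengths only and does not verify that the coordinates come from the same reference.

```python
def clip_bedgraph(lines, chrom_sizes):
    """Yield bedGraph records restricted to the chromosomes in chrom_sizes.

    Records on chromosomes missing from chrom_sizes are dropped, and records
    that extend past the declared chromosome end are clipped to that end.
    """
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and optional bedGraph header/comment lines.
        if not line or line.startswith(("track", "browser", "#")):
            continue
        chrom, start, end, value = line.split()[:4]
        if chrom not in chrom_sizes:
            continue
        start = int(start)
        end = min(int(end), chrom_sizes[chrom])
        # Drop records that become empty after clipping.
        if start < end:
            yield f"{chrom}\t{start}\t{end}\t{value}"


if __name__ == "__main__":
    # Target length for chr4 taken from ANOTHER.bw in this issue;
    # the bedGraph records are made up for illustration.
    sizes = {"chr4": 1348131}
    records = [
        "chr4\t0\t100\t1.5\n",
        "chrUn_DS484407v1\t0\t50\t2\n",
        "chr4\t1348100\t1348200\t3\n",
    ]
    for out in clip_bedgraph(records, sizes):
        print(out)
```

Writing the filtered records to a file and rebuilding the bigWig against a single chrom.sizes file gives every track the same declared lengths.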

WardDeb commented 1 month ago

To me this sounds like different subversions of dm6, and I'd be careful about just changing these lengths, as your coordinates won't match anymore. The safest way to make sure these are really the same reference is to align the data yourself. This isn't a deepTools issue, though.