deeptools / deepTools

Tools to process and analyze deep sequencing data.

black bars in plotheatmap #1305

Closed sunta3iouxos closed 3 months ago

sunta3iouxos commented 3 months ago

Hi all, I have the following issue:

[Screenshot 2024-04-08 171628] [Screenshot 2024-04-08 171649]

Both heatmaps contain black bars that contaminate the graph. Do you have any idea why they appear and how I can remove them? The steps I followed to produce these graphs are:

Generate the BigWig files. I am comparing the merged ChIP BAM file vs. the input (no antibody), and I am doing this for 4 distinct conditions:

bamCompare --bamfile1 KO-HG.bam --bamfile2 A006200376_218474_S26_L001.bam -o log2ratio_KO-HG.bw \
--centerReads --binSize 20 --smoothLength 60 --ignoreDuplicates  --scaleFactorsMethod SES --outFileFormat bigwig -p 2 \
--blackListFileName /mnt/c/Users/trotos/Desktop/bwa/mm39.excluderanges.bed  \
--ignoreForNormalization chrX,chrY,chrM,GL456210.1,GL456211.1,GL456212.1,GL456213.1,GL456216.1,GL456219.1,GL456221.1,GL456233.1,GL456239.1,GL456350.1,GL456354.1,GL456359.1,GL456360.1,GL456366.1,GL456367.1,GL456368.1,GL456370.1,GL456372.1,GL456378.1,GL456379.1,GL456381.1,GL456382.1,GL456383.1,GL456385.1,GL456387.1,GL456389.1,GL456390.1,GL456392.1,GL456393.1,GL456394.1,GL456396.1,JH584292.1,JH584293.1,JH584294.1,JH584295.1,JH584296.1,JH584297.1,JH584298.1,JH584299.1,JH584300.1,JH584301.1,JH584302.1,JH584303.1,JH584304.1,chr1_GL456210v1_random,chr1_GL456211v1_random,chr1_GL456212v1_random,chr1_GL456221v1_random,chr1_GL456239v1_random,chr1_MU069434v1_random,chr4_JH584295v1_random,chr5_GL456354v1_random,chr5_JH584296v1_random,chr5_JH584297v1_random,chr5_JH584298v1_random,chr5_JH584299v1_random,chr7_GL456219v1_random,chrM,chrUn_GL456359v1,chrUn_GL456360v1,chrUn_GL456366v1,chrUn_GL456367v1,chrUn_GL456368v1,chrUn_GL456370v1,chrUn_GL456372v1,chrUn_GL456378v1,chrUn_GL456379v1,chrUn_GL456381v1,chrUn_GL456382v1,chrUn_GL456383v1,chrUn_GL456385v1,chrUn_GL456387v1,chrUn_GL456389v1,chrUn_GL456390v1,chrUn_GL456392v1,chrUn_GL456394v1,chrUn_GL456396v1,chrUn_JH584304v1,chrUn_MU069435v1,chrX_GL456233v2_random,chrY_JH584300v1_random,chrY_JH584301v1_random,chrY_JH584302v1_random,chrY_JH584303v1_random 

Create the matrix files for the plots from the generated .bw files. For TSS:

computeMatrix reference-point --regionsFileName /home/tgeorgom/mm39/UCSC_Main_on_Mouse__knownGene_genome.bed  \
--scoreFileName *bw \
--missingDataAsZero --skipZeros \
--referencePoint TSS  \
--upstream 1000 --downstream 1000 \
--outFileName referenceTSS-Zero-mergedBW_allGenes_scaled.gz --outFileNameMatrix referenceTSS-Zero-mergedBW_allGenes_scaled.tab --outFileSortedRegions referenceTSS-Zero-mergedBW_allGenes_scaled.bed \
--blackListFileName bwa/mm39.excluderanges.bed \
--smartLabels -p 16 

for genebody:

computeMatrix scale-regions -R /home/tgeorgom/mm39/UCSC_Main_on_Mouse__knownGene_genome.bed \
-S *bw \
--upstream 1500 --downstream 1500 \
  --regionBodyLength 5000 \
  --startLabel TSS --endLabel TES --skipZeros -p 16 \
  -o scale-mergedBW_allGenes_scaled.gz \
  --outFileNameMatrix scale-mergedBW_allGenes_scaled.tab \
  --outFileSortedRegions scale-mergedBW_allGenes_scaled.bed

and then for plotting, using the generated .gz files:

plotHeatmap -m referenceTSS-Zero-mergedBW_allGenes_scaled.gz  --outFileName referenceTSS-Zero-mergedBW_allGenes_scaled.pdf --colorMap RdYlBu --alpha 0.8
plotHeatmap -m scale-mergedBW_allGenes_scaled.gz --outFileName scale-mergedBW_allGenes_scaled.pdf --colorMap RdYlBu --alpha 0.8

NOTE: the BED files of all genes for mm39 were generated using the UCSC browser as instructed: [Screenshot 2024-04-08 171408]

WardDeb commented 3 months ago

Have a look at --missingDataAsZero (computeMatrix) or --missingDataColor (plotHeatmap)
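As a rough sketch of what those two options could look like when applied to the scale-regions command and the plotting command from the original post: the function names `rebuild_matrix` and `replot` are made up for illustration, the output file name is copied from the post, and `white` is just one possible color choice. The two options are alternatives, so only one of them should be needed.

```shell
# Option 1 (computeMatrix): store missing bins as 0 when building the matrix.
# Hypothetical helper: first argument is the BED file, the rest are bigWig files.
rebuild_matrix() {
  regions=$1
  shift
  computeMatrix scale-regions -R "$regions" -S "$@" \
    --upstream 1500 --downstream 1500 --regionBodyLength 5000 \
    --startLabel TSS --endLabel TES \
    --missingDataAsZero --skipZeros -p 16 \
    -o scale-mergedBW_allGenes_scaled.gz
}

# Option 2 (plotHeatmap): keep the missing values in the matrix, but draw them
# in a color other than the default black.
# Hypothetical helper: first argument is the matrix .gz, second the output file.
replot() {
  plotHeatmap -m "$1" --outFileName "$2" \
    --colorMap RdYlBu --alpha 0.8 \
    --missingDataColor white
}
```

For example, `rebuild_matrix UCSC_Main_on_Mouse__knownGene_genome.bed *bw` followed by the original plotHeatmap call, or `replot scale-mergedBW_allGenes_scaled.gz scale-mergedBW_allGenes_scaled.pdf` on the existing matrix.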

sunta3iouxos commented 3 months ago

Thank you for your comment. So by using those I will avoid the black lines? Why is this not set as the default?

WardDeb commented 3 months ago

Yes. --missingDataAsZero sets the missing values (the black lines) to zero, and --missingDataColor changes the color they are drawn in. I wouldn't make this the default, because you might then get the impression that no data is missing. Feel free to re-open if something is still unclear.
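To get a feeling for how much data is actually missing before hiding it, one could count the affected rows in the tab-separated file written by --outFileNameMatrix. This is only a sketch: `count_nan_rows` is a hypothetical helper, and it assumes that missing values appear in that file as the literal field `nan`.

```shell
# Count the rows of a tab-separated matrix that contain at least one "nan" field.
# Assumption: missing values are written as "nan" (matched case-insensitively).
count_nan_rows() {
  awk -F'\t' '
    {
      for (i = 1; i <= NF; i++) {
        if (tolower($i) == "nan") { n++; break }   # count each row only once
      }
    }
    END { print n + 0 }                            # prints 0 if nothing matched
  ' "$1"
}
```

Usage would be along the lines of `count_nan_rows scale-mergedBW_allGenes_scaled.tab`; a large count suggests checking the input bigWig files before masking the gaps.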