deeptools / deepTools

Tools to process and analyze deep sequencing data.

subset computeMatrix matrix file without computeMatrixOperations #1326

Open BioLaoXu opened 2 months ago

BioLaoXu commented 2 months ago

Thanks to the great tools provided by the deeptools development team.

I generated a matrix file with computeMatrix reference-point --referencePoint TSS, but the TSS result is not ideal: the input sample shows more signal around the TSS than the IP sample. I therefore want to focus only on the TSS distribution of the regions that overlap peaks identified by macs2. From the official documentation, computeMatrixOperations can only subset or filter by sample, strand, or value, so I filtered the matrix myself. I'm not sure whether this is a valid approach. Here is my script; I tested it and it ran without errors:

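## subset matrix
# Decompress; prefix the @ header with # so bedtools treats it as a comment
gzip -dc matrix_TSS_2K.gz | sed -r 's/^@/#@/' > matrix_TSS_2K
# Keep matrix rows that overlap a peak (-u reports each row once, even if it overlaps several peaks)
bedtools intersect -a matrix_TSS_2K -b test_peaks.narrowPeak -u > matrix_TSS_2K.peak
# Count the rows before and after filtering
rawnum=$(tail -n +2 matrix_TSS_2K | wc -l)
peaknum=$(wc -l < matrix_TSS_2K.peak)
# Replace the old row count in the header, restore the @ prefix, and recompress
cat <(head -n 1 matrix_TSS_2K | sed -r "s/$rawnum/$peaknum/" | sed -r 's/^#@/@/') matrix_TSS_2K.peak | gzip > matrix_TSS_2K.peak.gz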
##plot
plotHeatmap -m matrix_TSS_2K.peak.gz -out All_TSS_Heatmap.png --colorMap RdYlBu_r
plotProfile -m matrix_TSS_2K.peak.gz  -out All_TSS_Profile.png

matrix_TSS_2K.gz: generated by computeMatrix reference-point --referencePoint TSS
test_peaks.narrowPeak: generated by macs2
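
One way to sanity-check the subsetting script above is to confirm that the row count declared in the header matches the number of data rows in the new file. The following is only a minimal sketch, not part of deepTools: check_matrix_rows is a hypothetical helper name, and it assumes the header is a single line of compact JSON beginning with @ that contains a "group_boundaries":[0,N] entry for a single region group.

```shell
# Hypothetical helper (not a deepTools command): compare the number of data
# rows in a gzipped matrix file with the last value of "group_boundaries"
# in its header line.
# Assumptions: one header line starting with @, compact JSON, one region group.
check_matrix_rows() {
    local matrix="$1"
    local declared actual
    # Extract the last number inside "group_boundaries":[...] from line 1
    declared=$(gzip -dc "$matrix" |
        sed -n -E '1s/.*"group_boundaries":\[([0-9]+,)*([0-9]+)\].*/\2/p')
    # Count the data rows (everything after the header line)
    actual=$(gzip -dc "$matrix" | tail -n +2 | wc -l | tr -d ' ')
    if [ "$declared" = "$actual" ]; then
        echo "OK: $actual rows"
    else
        echo "MISMATCH: header declares $declared rows, file has $actual"
        return 1
    fi
}
```

Calling check_matrix_rows matrix_TSS_2K.peak.gz after the subsetting step reports whether the edited header and the filtered rows agree. If the sed substitution in the script changed a different number in the header (for example an upstream or downstream value that happens to equal the row count), this check reports a mismatch.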

Tool version

computeMatrix 3.5.4 
Python 3.8.16

Is this approach feasible? Thanks.