liguowang / spiker

Analysis tool for Spike-in ChIP-seq data
MIT License
4 stars 1 forks source link

The number of peak called by spiker.py is less than macs2 #5

Closed liaomeimei closed 1 year ago

liaomeimei commented 1 year ago

Hi,

Thanks for developing a great tool to analyze ChIP-seq data with spike-in. I compared the peak numbers obtained by spiker.py and MACS2, but I found that the number of peaks called by spiker.py is less than MACS2, even though I chose scale factors that were almost equal to 1. However, the lost peaks are real enrichment in IGV. My codes are listed below:

spiker.py -t S2-0_S16.bam -c ../C-Input_S13.bam -g mm --cleanup --spikeIn --csf 1.0001 --tsf 0.9999 -o ../../peaks/after_spike/S2-0_raw 2> ../../peaks/after_spike/log

macs2 callpeak -c ../C-Input_S13.rmdup.bam -t S2-0_S16.rmdup.bam -f BAM -B -g mm -n S2-0_S16 --outdir ../../peaks/ 2> macs2.log

And the peak numbers are:

The lost peaks in igv as example: 560064e8e719d892a794099b72a0d75

Thank you!

liguowang commented 1 year ago

The example you gave is not a good one :). Peak_823 is not highly confident, in my opinion.

However, I am unsurprised to see some inconsistencies between Spiker and MACS2 (even though Spiker internally used the MACS command line tool to do the peak calling). You can consider MACS2 as a pipeline and the command line tool as breakdown steps. I remember some MACS2 users reported similar problems (i.e., cannot exactly reproduce MACS2 results using the command line tools).

At the current stage, I am afraid there is little I can do to address this issue.