al-mcintyre / merip_reanalysis_scripts

Scripts for the paper entitled "Limits in the detection of m6A changes using MeRIP/m6A-seq"

How did you decide the --gsize for human when using macs2 to call m6A peaks? #1

Closed: sunhaifeng123 closed this issue 2 years ago

sunhaifeng123 commented 4 years ago

Hi,

I'm a doctoral student at Nanjing Medical University. I recently read your excellent article, "Limits in the detection of m6A changes using MeRIP/m6A-seq". I tried using macs2 on our data, although exomePeak and MeTPeak are usually my first choice.

Here I have 2 questions:

  1. How did you decide on the --gsize for human? Your code uses 100e6, as follows: macs2 callpeak -t alignments/${COND}_IP_${REP}.star.sorted.bam -c alignments/${COND}_input_${REP}.star.sorted.bam --nomodel --extsize $FRAGLEN -g 100e6 -n macs2_results/${COND}_${REP}.macs2 -f BAM --verbose 3

  2. A recently published article also uses macs2 for m6A peak calling. One parameter in their methods, "--slocal 200", puzzled me. Do you think it is a necessary parameter for m6A?

Thanks in advance for your reply!

Best wishes

Haifeng Sun Nanjing Medical University, China 2020-06-07

al-mcintyre commented 4 years ago

Hi Haifeng,

As we were mainly interested in the detection of peak changes, we did not fully evaluate the parameter space for peak detection, so you may find additional adjustments worthwhile. This paper by Zeng et al. also discusses optimization of this step.

  1. 100e6 is an approximation of the size of the human transcriptome. Many papers that use MACS2 for MeRIP-seq analysis have stuck with the default genome size and still found enrichment of DRAC/RRACH motifs. The main benefit of using MACS2 is the speed. If you have a single smaller data set, MeTPeak worked well for us (exomePeak less so, but that is based on our limited validation with MeRIP-RT-qPCR, and it may be that MeRIP-RT-qPCR is less sensitive).

  2. From the MACS2 documentation: "--slocal, --llocal These two parameters control which two levels of regions will be checked around the peak regions to calculate the maximum lambda as local lambda. By default, MACS considers 1000bp for small local region(--slocal), and 10000bps for large local region(--llocal) which captures the bias from a long-range effect like an open chromatin domain. You can tweak these according to your project. Remember that if the region is set too small, a sharp spike in the input data may kill a significant peak." We did not find it necessary to detect enrichment of the m6A motif, but as with other parameters, it may be worth adjusting for your data.
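
If it helps when comparing these settings, here is a minimal dry-run sketch that only prints macs2 callpeak commands with different -g and --slocal values, without running anything. The helper name build_macs2_cmd and the example values (control, 1, 150) are hypothetical placeholders, not taken from the paper's scripts; the flags themselves are the ones already discussed in this thread, and hs is the MACS2 shortcut for its default human genome size.

```shell
#!/bin/sh
# Dry-run sketch: builds (but does not execute) a macs2 callpeak command line.
# build_macs2_cmd is a hypothetical helper; its arguments are illustrative only.
# Usage: build_macs2_cmd COND REP FRAGLEN GSIZE [SLOCAL]
build_macs2_cmd() {
    m_cond="$1"
    m_rep="$2"
    m_fraglen="$3"
    m_gsize="$4"
    m_slocal="${5:-}"

    m_cmd="macs2 callpeak"
    m_cmd="${m_cmd} -t alignments/${m_cond}_IP_${m_rep}.star.sorted.bam"
    m_cmd="${m_cmd} -c alignments/${m_cond}_input_${m_rep}.star.sorted.bam"
    m_cmd="${m_cmd} --nomodel --extsize ${m_fraglen} -g ${m_gsize}"

    # Add --slocal only when a value is supplied; otherwise MACS2 keeps
    # its default small local region of 1000 bp.
    if [ -n "${m_slocal}" ]; then
        m_cmd="${m_cmd} --slocal ${m_slocal}"
    fi

    m_cmd="${m_cmd} -n macs2_results/${m_cond}_${m_rep}.macs2 -f BAM --verbose 3"
    printf '%s\n' "${m_cmd}"
}

# Transcriptome-size approximation, default local lambda window.
build_macs2_cmd control 1 150 100e6
# Same, with the narrower small local region mentioned in the question.
build_macs2_cmd control 1 150 100e6 200
# Default human genome size shortcut, for comparison.
build_macs2_cmd control 1 150 hs
```

Running each printed command on the same sample and comparing peak counts and motif enrichment would be one way to judge whether either parameter matters for a given data set.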

Good luck and let me know if you have any further questions!