macs3-project / MACS

MACS -- Model-based Analysis of ChIP-Seq
https://macs3-project.github.io/MACS/
BSD 3-Clause "New" or "Revised" License
695 stars 270 forks source link

Q: how to use MACS3 for ATAC seq with "himmratac" option? #639

Open kbarrr opened 3 months ago

kbarrr commented 3 months ago

Use case My data consist of BAM paired-end files already filtered by duplicates, mit reads, etc. (99% alingment)

Describe the problem I tried to use this command in bash

macs3 hmmratac -i M1_dedup_noM_sorted.bam -f BAMPE -g hs -n test -B -q 0.01

and got this error: usage: macs3 [-h] [--version] {callpeak,bdgpeakcall,bdgbroadcall,bdgcmp,bdgopt,cmbreps,bdgdiff,filterdup,predictd,pileup,randsample,refinepeak,callvar,hmmratac} ... macs3: error: unrecognized arguments: -g hs -B -q 0.01

Additional context macs2 works well, but i still want to learn MACS3.

Any help would be useful. Thank you.

taoliu commented 3 months ago

Hello @kbarrr the tutorial on using hmmratac command in the MACS3 (version 3.0) can be found here:

https://macs3-project.github.io/MACS/docs/hmmratac.html

The command line options are different from the callpeak command in MACS3.

Let us know if you find any issues when running hmmratac in MACS3. We are still optimizing it. @philippadoherty is currently working on reducing the memory usage for this command.

kbarrr commented 3 months ago

Thank you for your response.

I followed the instructions:

First i run

macs3 hmmratac -i M1_noM_sorted.bam -f BAMPE --cutoff-analysis-only

to know the upper and lower cut off and got this TSV file:

(macs3-env) $ head tsv score npeaks lpeaks avelpeak 990.00 5 7609 1521.80 980.00 5 7643 1528.60 970.00 5 7650 1530.00 960.00 5 7654 1530.80 950.00 5 7659 1531.80 940.00 5 7662 1532.40 930.00 5 7665 1533.00 920.00 5 7673 1534.60 910.00 5 7679 1535.80 (macs3-env) $ tail tsv 90.00 171 313505 1833.36 80.00 201 353408 1758.25 70.00 263 415032 1578.07 60.00 385 523102 1358.71 50.00 615 695166 1130.35 40.00 1107 995113 898.93 30.00 2509 1774048 707.07 20.00 5513 3736145 677.70 10.00 14149 11493688 812.33 0.00 169628 2841257445 16749.93

i am expecting ~15-30 K peaks because they are CD4 cells.

would you help me choose my upper and lower cut off please? i am lost

taoliu commented 2 months ago

@kbarrr Thanks for the information! However, the resolution of the cutoff analysis report is not good enough for me to judge where the cutoff should be -- each step is 10 according to the first column. I would guess the settings of '-u 20 -l 10 -c 2' (or -c 5) should work. We will release a new version soon to address 1) the resolution of cutoff analysis -- to make it with a step of 1 by default #642; 2) to reduce memory usage by a magnitude for large datasets #640 ; 3) to provide a simpler HMM model with Poisson Emission #635. If you can wait, please try the next release.