macs3-project / MACS

MACS -- Model-based Analysis of ChIP-Seq
https://macs3-project.github.io/MACS/
BSD 3-Clause "New" or "Revised" License
697 stars 268 forks source link

Q: How to use HMMRATAC to give us a narrow peak file and have signalValue, p and q-value ? #648

Open MinaMyr opened 2 months ago

MinaMyr commented 2 months ago

Use case Hello I want to do peak calling on my ATAC seq data, and for that I use, callpeak and hmmratac.

Describe the problem My problem here is with hmmratac;

I use this command macs3 hmmratac -f BAM -i RES/atac/try_1/04-NobiasedRegions/ATAC20_D0_rep1_hg38_sort_dedup_biasedRegions.bam -u 15 -l 5 -c 100 -n RES/atac/try_1/06-PeakCalling/hmmratac_narrow/ATAC20_D0_rep1 I read in the doc that we should have a narrow peak file format but I have an *_accessible_regions.gappedPeak format which is not what I want. For the rest of my analysis I need a narrow peak format.

Describe the solution you tried So I tried to create my narrow peak file from the gappedPeak, by hand, but the trouble I am now having is that for the signal value, p-value and q-value are equals to 0 (I think like this comment said it is intentional https://github.com/macs3-project/MACS/issues/592#issuecomment-1981383873 ). Here is what my file look like;

track name="peak" description="peak" type=gappedPeak nextItemButton=on
20  208210  220320  peak_1  0   .   208450  220210  0   3   1240,8820,1550  240,1520,10450  0   0   0
20  220410  267710  peak_2  0   .   220490  267640  0   1   47150   80  0   0   0
20  279100  302420  peak_3  0   .   279110  302390  0   8   890,9080,6700,310,240,3640,1240,660 10,1010,10140,17070,17420,17720,21380,22630 0   0   0
20  302430  314800  peak_4  0   .   302470  314780  0   4   8730,450,2840,130   40,8880,9350,12220  0   0   0
20  314890  342820  peak_5  0   .   314920  342760  0   7   7230,4790,1170,3260,2190,4780,4170  30,7400,12210,13400,16670,18880,23700   0   0   0
20  342840  365480  peak_6  0   .   342880  365440  0   6   5620,10270,120,1640,2240,2450   40,5690,15990,16150,17830,20150 0   0   0
20  368100  402600  peak_7  0   .   368140  402540  0   4   2600,3260,230,28060 40,2650,5950,6380   0   0   0
20  443320  447340  peak_8  0   .   443330  447170  0   2   2860,910    10,2940 0   0   0
20  447420  479680  peak_9  0   .   447440  479630  0   2   15710,16460 20,15750    0   0   0
20  479740  480080  peak_10 0   .   479850  480010  0   1   160 110 0   0   0

I don't really know what to do now, since signal Value p-value and q-value are important for the rest of my analysis. Thank you in advance !

nickgladman commented 2 months ago

Hello and thanks for making this resource. I am commenting here to track this since I'm having the same issue: no narrowPeak output from the hmmratac using mapped .bam files with the following code: macs3 hmmratac \ -i D*rep1_paired.sorted.bam \ *rep2_paired.sorted.bam \ --outdir directory \ -n output_name \ -u 20 \ -l 10 \ -c 2

I previously ran the with the --cutoff-analysis-only flag prior to entering the -u -l -c params. I'm using version 3.0.1 in a conda environment with a SGE scheduler. There are no errors in the output. Here are the last lines of the run output:

INFO @ 14 Jun 2024 17:06:54: [7795 MB] # Write the output... INFO @ 14 Jun 2024 17:09:30: [7795 MB] # Write accessible regions in a gappedPeak file: *accessible_regions.gappedPeak