macs3-project / MACS

MACS -- Model-based Analysis of ChIP-Seq
https://macs3-project.github.io/MACS/
BSD 3-Clause "New" or "Revised" License
713 stars 268 forks source link

Q: mac2 callpeak divide by zero error for redundant rate of treatment calculation #488

Open crowja opened 3 years ago

crowja commented 3 years ago

Use case

I am trying to run macs2 callpeak 2.2.7.1 using a ChIP-seq alignment produced by bwa mem 0.7.17. The alignment has been sorted with samtools sort. As far as the alignment goes, samtools stats reports:

SN      raw total sequences:    1385028 # excluding supplementary and secondary reads
SN      filtered sequences:     0
SN      sequences:      1385028
SN      is sorted:      1
SN      1st fragments:  692514
SN      last fragments: 692514
SN      reads mapped:   866791
SN      reads mapped and paired:        749976  # paired-end technology bit set + both mates mapped
SN      reads unmapped: 518237
SN      reads properly paired:  0       # proper-pair bit set
SN      reads paired:   1385028 # paired-end technology bit set
SN      reads duplicated:       0       # PCR or optical duplicate bit set
SN      reads MQ0:      460530  # mapped and MQ=0
SN      reads QC failed:        0
SN      non-primary alignments: 0

Describe the problem

I'm getting a Divide by zero error in macs2 callpeak related to the redundant rate of treatment.

Describe the solution you tried

From the macs2 callpeak output:

INFO  @ Mon, 01 Nov 2021 14:42:38:
# Command line: callpeak --format=BAM --nomodel --extsize=300 --treatment=/opt/data/chipseq/aligns-with- 
bwa/AN3740008-paired-sorted.bam --gsize=2134628047 --outdir=results --name=AN3740008-paired-sorted
# ARGUMENTS LIST:
# name = AN3740008-paired-sorted
# format = BAM
# ChIP-seq file = ['/opt/data/chipseq/aligns-with-bwa/AN3740008-paired-sorted.bam']
# control file = None
# effective genome size = 2.13e+09
# band width = 300
# model fold = [5, 50]
# qvalue cutoff = 5.00e-02
# The maximum gap between significant sites is assigned as the read length/tag size.
# The minimum length of peaks is assigned as the predicted fragment length "d".
# Larger dataset will be scaled towards smaller dataset.
# Range for calculating regional lambda is: 10000 bps
# Broad region calling is off
# Paired-End mode is off

INFO  @ Mon, 01 Nov 2021 14:42:38: #1 read tag files...
INFO  @ Mon, 01 Nov 2021 14:42:38: #1 read treatment tags...
INFO  @ Mon, 01 Nov 2021 14:42:41: 0 reads have been read.
INFO  @ Mon, 01 Nov 2021 14:42:41: #1 tag size is determined as 137 bps
INFO  @ Mon, 01 Nov 2021 14:42:41: #1 tag size = 137.0
INFO  @ Mon, 01 Nov 2021 14:42:41: #1  total tags in treatment: 0
INFO  @ Mon, 01 Nov 2021 14:42:41: #1 user defined the maximum tags...
INFO  @ Mon, 01 Nov 2021 14:42:41: #1 filter out redundant tags at the same location and the same strand by allowing at most 1 tag(s)
INFO  @ Mon, 01 Nov 2021 14:42:41: #1  tags after filtering in treatment: 0
Traceback (most recent call last):
  File "/home/ubuntu/miniconda3/envs/bio-hichip-env/bin/macs2", line 653, in <module>
  main()
  File "/home/ubuntu/miniconda3/envs/bio-hichip-env/bin/macs2", line 51, in main
  run( args )
  File "/home/ubuntu/miniconda3/envs/bio-hichip-env/lib/python3.9/site-packages/MACS2/callpeak_cmd.py", line 
  104, in run
  info("#1  Redundant rate of treatment: %.2f", float(t0 - t1) / t0)
  ZeroDivisionError: float division by zero

Additional context

This error is reported when using bwa mem alignments. When using bowtie2 alignments, no problem.

taoliu commented 3 years ago

@crowja MACS will check the 'properly paired' bit. It seems to be zero:

reads properly paired:  0