genome / bam-readcount

Count bases in BAM/CRAM files
MIT License

read depth incorrect #103

Closed: bioscienceresearch closed this issue 2 years ago

bioscienceresearch commented 2 years ago

I am using bam-readcount with the following command:

bam-readcount -w1 -f NC_045512_2.fa aligned_data.bam NC_045512.2:1-29903

Also:

bam-readcount -w1 -f NC_045512_2.fa aligned_data.bam NC_045512.2:8782-8782

The read counts reported by bam-readcount (e.g. 5-12) are two to three orders of magnitude lower than the number of reads mapping to specific locations (c. 2000 to 8782). I tested this on multiple BAM files.

For BAM files with low read coverage, the counts are correct.

version: 1.0.1-unstable-8-c7c76e6 (commit c7c76e6)

bioscienceresearch commented 2 years ago

I found the issue: the reads had been marked as PCR duplicates (FLAG 1024/1040). I have taken out the duplicate-marking step, and bam-readcount now gives the correct depth.
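For readers who hit the same symptom, the two FLAG values mentioned above can be decoded with a few lines of Python. This is a minimal sketch based on the bit definitions in the SAM specification. The helper names `decode_flag` and `is_duplicate` are hypothetical and are not part of bam-readcount or any library.

```python
# SAM FLAG bits as defined in the SAM specification.
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value (hypothetical helper)."""
    return [name for bit, name in SAM_FLAG_BITS.items() if flag & bit]


def is_duplicate(flag):
    """True if the PCR/optical duplicate bit (0x400 = 1024) is set."""
    return bool(flag & 0x400)


print(decode_flag(1024))  # ['duplicate']
print(decode_flag(1040))  # ['reverse_strand', 'duplicate']
```

So 1024 is a forward-strand read marked as a duplicate, and 1040 (1024 + 16) is a reverse-strand read marked as a duplicate. On deep amplicon-style data, duplicate marking can flag most reads at a position, which would be consistent with the very low counts reported here. One way to check is to compare against a duplicate-excluding count such as `samtools view -c -F 1024` on the same region. It should come out close to the bam-readcount depth, though not necessarily identical, because other filters may also apply.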

chrisamiller commented 2 years ago

Glad you figured it out!