genome / bam-readcount

Count bases in BAM/CRAM files

Max depth 8000 #69

Closed LauraVP1994 closed 2 years ago

LauraVP1994 commented 3 years ago

Dear,

Although this would be a great tool, I ran into a significant problem when running bam-readcount on my data. Like samtools depth, it does not seem to handle high coverage well. My coverage ranges from 5,000X to 90,000X, and bam-readcount seems to cap the reported depth at 8,000. I found a few posts that tried to work around this with the -d option, but that does not seem to work.

An update or a clearly documented workaround would help people who work with high-coverage data, such as viral data.

Kind regards,
Laura

cjfields commented 3 years ago

I was able to get this to work up to 10,000,000x depth (which I think is the default maximum) using the command below. This may require both the reference FASTA (-f) and the region given at the end of the command (chrID here).

# filter bases at Q30 (-b 30), limit warnings (-w 1), pass the reference FASTA (-f)
bam-readcount -b 30 -w 1 -f "$REFERENCE" my.sorted.bam chrID

EDIT: I highly recommend the -w 1 option to limit warnings; otherwise you can get a flood of them. It does take a while to plow through ultra-high-coverage data (my guess is that this comes from handling the huge pileup).
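
One way to check whether the 8,000 cap is still being hit is to look at the largest depth value in the output. The sketch below is only an illustration under a few assumptions: that the output is tab-separated with the depth in the fourth column (chromosome, position, reference base, depth, then per-base fields), that -d is the maximum-depth option mentioned in the original post, and that `max_depth` is a hypothetical helper name, not part of bam-readcount.

```shell
# Hypothetical helper (not part of bam-readcount): read tab-separated
# lines on stdin and print the largest value found in column 4, which
# is assumed to be the depth column. Prints 0 for empty input.
max_depth() {
    awk -F'\t' '$4 + 0 > max { max = $4 + 0 } END { print max + 0 }'
}

# Assumed usage, with the placeholder file names and region from the
# command above; -d is set explicitly to the value mentioned as the default:
#   bam-readcount -b 30 -w 1 -d 10000000 -f "$REFERENCE" my.sorted.bam chrID | max_depth
```

If the printed value is well above 8,000 on data known to have deeper coverage, the cap is probably not being applied. If it sits at the same ceiling for every deep position, the limit is likely still in effect.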