Closed juyanmei closed 3 years ago
Hi,
That is rather confusing. Would you be able to transfer the BAM file to me (it will include the reads)?
Also, just to check you are using CoverM 0.6.1?
Thanks.
Hi, Thanks for your replay. I am using both 0.6.1 and 0.6.0, it's basically equal. I'll email you the BAM file.
Thanks for sending the files.
Looks like a simple case of min-covered-fraction kicking in, in particular RDPYD19080571_A-RDPYD19080571_A.metaspades.maxbin2_dastools.bin.001_sub
has a coverage of 29.412193 but only 0.022369059 (ie. 2.2%) of its genome covered. Lots more mappings are observed when you rerun genome with --min-covered-fraction 0
.
The cached BAM file is the cache of all the mappings observed - mappings effectively classed as 'spurious' like those above (and those that fail --min-read-aligned-percent
etc.) are not removed from the BAM file after the mapping has taken place. This is helpful so rerunning with different thresholds is fast, but admittedly somewhat confusing. I've added some detail about this is the help in response.
Make sense?
Sorry for the late reply. I got it. The difference between unmapped reads of abundance output and mapped reads of samtools flagstat because of the defalut threshold --min-covered-fraction.
Hi, I used this code to calculate abundance of genome "coverm genome -p bwa-mem --coupled demo.1.fq.gz demo.2.fq.gz --genome-fasta-directory ./dereplicated_genomes/ --bam-file-cache-directory ./ -o output.tsv", the final unmapped reads abundance is 85%, but this code "samtools flagstat coverm-genome.demo.1.fq.gz.bam" showed mapped reads is 92%, I am really confused.