wwood / CoverM

Read coverage calculator for metagenomics
GNU General Public License v3.0

NaN in the relative abundance output file #132

Closed B-1991-ing closed 1 year ago

B-1991-ing commented 1 year ago

Hi Ben,

I am using CoverM to calculate the relative abundance of some downloaded MAGs in my metagenome fastq reads, but the output reports 100% unmapped reads and the relative abundance of every MAG is "NaN". In theory, the relative abundance can't be NaN.

I checked the MAG fasta files: some MAGs have uppercase sequences (e.g. CGTTCCACTGGGGATACCGG) and some have lowercase sequences (e.g. tctcttcgcttcacgaaaccttcagcggcgcc). Could that be the problem?

Coverm command line: coverm genome -m relative_abundance --coupled 1.fq.gz 2.fq.gz --genome-fasta-directory ${genomes_dir} -x fna -o ${out_tsv} --min-read-percent-identity 0.95 --min-read-aligned-percent 0.70 --threads 20

Error file coverm_Uzun_MTB_38_batch.err.txt

Could you give me some suggestions?

Best,

Bing

wwood commented 1 year ago

Hi Bing,

This is an odd one. I wouldn't have thought upper vs lower case would make any difference. If you want to check, you could simulate some reads from the MAGs, e.g. with https://github.com/wwood/bbbin/blob/main/sim_reads.py, and confirm that mapping those works.

You could also try mapping without the cutoffs, or mapping the reads with minimap2 alone against one of the genomes you expect to be there.

My feeling is that something nefarious is going on. If the above fails, could you provide some sequences so I can poke around?

Thanks. ben

B-1991-ing commented 1 year ago

Hi Ben,

Thank you for your reply.

I first calculated the relative abundance in my metagenomes for only the 38 MAGs downloaded from another paper, and every value in the output file was NaN. But when I ran the downloaded MAGs together with my own MAGs against my metagenomes, everything was fine.

So it is not related to upper or lower case sequences in the MAG files.

Best,

Bing

wwood commented 1 year ago

Ah ok, I guess the originally tried MAGs must have failed the 10% covered_bases threshold then. Makes sense.
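
For readers wondering how a coverage threshold can end up as NaN, here is a toy sketch of the arithmetic. It is not CoverM's actual code: the function name, its parameters and the example numbers are all made up for illustration. The assumption is that any genome below the covered-fraction threshold is treated as having zero coverage, and that relative abundance is each genome's coverage divided by the total across genomes. If every genome is zeroed, the total is zero and the result is 0/0.

```python
import math


def relative_abundances(mean_coverages, covered_fractions, min_covered_fraction=0.10):
    """Toy model (hypothetical, not CoverM's implementation).

    Genomes whose covered fraction is below the threshold are treated
    as having zero coverage; the rest are normalised to percentages.
    """
    kept = [
        cov if frac >= min_covered_fraction else 0.0
        for cov, frac in zip(mean_coverages, covered_fractions)
    ]
    total = sum(kept)
    if total == 0.0:
        # 0/0 in IEEE floating point is NaN; Python raises instead,
        # so the NaN is produced explicitly here.
        return [float("nan")] * len(kept)
    return [100.0 * cov / total for cov in kept]


# Every genome is below the 10% threshold: all values are NaN.
only_rare = relative_abundances([0.5, 0.2], [0.03, 0.01])

# Adding one well-covered genome makes the total non-zero again.
with_own_mag = relative_abundances([0.5, 0.2, 30.0], [0.03, 0.01, 0.9])
```

This would also be consistent with the NaNs disappearing once the downloaded MAGs were run together with MAGs that are well covered in the sample. If the threshold is the cause, rerunning with a lower `--min-covered-fraction` should give non-NaN values for the 38 MAGs alone, though that was not tested in this thread.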