arpcard / rgi

Resistance Gene Identifier (RGI). Software to predict resistomes from protein or nucleotide data, including metagenomics data, based on homology and SNP models.

How to filter out resistance genes in metagenomics? #148

Closed (mhmism closed 1 year ago)

mhmism commented 3 years ago

Hello,

Thank you for this amazing tool and your continuous maintenance and support.

I ran rgi bwt on some human shotgun metagenomics samples. I would like to know how to filter the gene output file to avoid false positives. For example, should I keep only the genes with 100% Average Percent Coverage and remove the rest? If only a few samples show one or two genes with 100% Average Percent Coverage and the rest show none, can I relax this threshold a bit to include more genes in more samples?

Are there any other parameters or cut-off values that I should keep in mind before doing the downstream analyses on my samples?

Many thanks in advance! Any help or guidance will be appreciated.

Kind regards, Mahmoud

raphenya commented 1 year ago

@mhmism Setting a cutoff depends on your data. For example, you would not want to rely on 100% Average Percent Coverage if many of the reads cover only part of a particular gene. Hope that helps. Cheers.
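
For readers who want to experiment with different cutoffs, here is a minimal Python sketch of one way to filter a tab-separated gene mapping table. It is not part of RGI and not a recommended procedure. The column names `ARO Term` and `All Mapped Reads`, the function name `filter_genes`, the thresholds, and the example rows are all assumptions made for illustration; only `Average Percent Coverage` is taken from the question above. Check the names against the header of your own output file before using anything like this.

```python
import csv
import io

# Column names are assumptions about the tab-separated gene mapping file;
# verify them against the header of your own output.
COVERAGE_COL = "Average Percent Coverage"
READS_COL = "All Mapped Reads"
GENE_COL = "ARO Term"


def filter_genes(handle, min_coverage=80.0, min_reads=10):
    """Return rows whose coverage and read count meet both thresholds.

    The default thresholds are placeholders, not recommendations.
    """
    kept = []
    for row in csv.DictReader(handle, delimiter="\t"):
        try:
            coverage = float(row[COVERAGE_COL])
            reads = float(row[READS_COL])
        except (TypeError, ValueError):
            continue  # skip rows with missing or non-numeric values
        if coverage >= min_coverage and reads >= min_reads:
            kept.append(row)
    return kept


# Made-up example data, for illustration only.
example = (
    "ARO Term\tAll Mapped Reads\tAverage Percent Coverage\n"
    "geneA\t250\t100.00\n"
    "geneB\t40\t85.50\n"
    "geneC\t3\t100.00\n"
    "geneD\t120\t22.10\n"
)

# Compare a strict 100% cutoff with a relaxed one on the same table.
strict = filter_genes(io.StringIO(example), min_coverage=100.0, min_reads=10)
relaxed = filter_genes(io.StringIO(example), min_coverage=80.0, min_reads=10)
print([r[GENE_COL] for r in strict])
print([r[GENE_COL] for r in relaxed])
```

Combining a coverage threshold with a minimum read count is one possible design choice: in the made-up data, `geneC` reaches 100% coverage with only three reads and is dropped by both calls, while `geneB` is kept only when the coverage cutoff is relaxed. Which thresholds are appropriate still depends on your data, as noted above.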