arpcard / rgi

Resistance Gene Identifier (RGI). Software to predict resistomes from protein or nucleotide data, including metagenomics data, based on homology and SNP models.

Inconsistent results of 'rgi main' and 'rgi bwt' on the same sequencing data #259

Closed JemmaSun closed 6 months ago

JemmaSun commented 10 months ago

Hi!

I recently ran 'rgi bwt' on metagenomic sequences (FASTQ reads, default settings), which gave me over 700 ARO terms. Next, I ran 'rgi main' on the metagenome assemblies of the same sequences (reads assembled with megahit; FASTA of Prodigal-called genes; default settings with -t protein) to see how many of the read-level AROs were represented at the contig level. However, almost none of the AROs identified from the assemblies match the ARO terms from the reads. This happens with most of my soil samples: [image]

I then collected the reads that mapped to my contigs and ran 'rgi bwt' on them. Unsurprisingly, the AROs detected from the mapped reads did not match the assembly AROs. I understand that 'rgi bwt' and 'rgi main' query AROs in different ways, but I am still wondering why the results are so different and whether there is a way to improve the consistency.

Thank you!
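For anyone wanting to quantify the mismatch described above, here is a minimal sketch that compares the ARO terms in two tab-separated result tables. It makes several assumptions: the column names `ARO Term` (read-level table) and `Best_Hit_ARO` (contig-level table) should be checked against the headers of your own RGI output files, the helper names `read_aro_terms` and `compare_aro_sets` are hypothetical, and the two inline tables are toy placeholders rather than real results.

```python
import csv
import io


def read_aro_terms(handle, column):
    """Collect the distinct, non-empty values of one column from a tab-separated table."""
    reader = csv.DictReader(handle, delimiter="\t")
    terms = set()
    for row in reader:
        value = (row.get(column) or "").strip()
        if value:
            terms.add(value)
    return terms


def compare_aro_sets(read_terms, contig_terms):
    """Summarise the overlap between read-level and contig-level ARO terms."""
    shared = read_terms & contig_terms
    union = read_terms | contig_terms
    return {
        "shared": sorted(shared),
        "reads_only": sorted(read_terms - contig_terms),
        "contigs_only": sorted(contig_terms - read_terms),
        # Jaccard index: shared terms divided by all distinct terms.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Toy placeholder tables (assumed column names, invented gene labels).
BWT_TABLE = "ARO Term\tAll Mapped Reads\ngeneA\t10\ngeneB\t3\ngeneC\t1\n"
MAIN_TABLE = "ORF_ID\tBest_Hit_ARO\norf_1\tgeneB\norf_2\tgeneD\n"

read_terms = read_aro_terms(io.StringIO(BWT_TABLE), "ARO Term")
contig_terms = read_aro_terms(io.StringIO(MAIN_TABLE), "Best_Hit_ARO")
summary = compare_aro_sets(read_terms, contig_terms)
print(summary)
```

With real files, replace the `io.StringIO(...)` objects with `open(path, newline="")` handles. Comparing on exact term strings is strict; grouping by gene family instead would likely show more overlap, since reads and contigs can hit different members of the same family.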

raphenya commented 10 months ago

@JemmaSun I think AMR sequences in soil samples will differ from those in CARD, which are mostly clinical sequences. We have a separate dataset called CARD Variants that might help here. It contains variants of the canonical AMR sequences we have in CARD.

Are the soil samples linked to human activity?

With rgi bwt, kma produces better alignments than Bowtie2. Which aligner were you using?

raphenya commented 10 months ago

@JemmaSun If you can, please share with me samples S94 (high mapped reads) and S99 (low mapped reads)

JemmaSun commented 10 months ago

Sure! May I have your email address please?

raphenya commented 10 months ago

@JemmaSun please send to card@mcmaster.ca or Google share will also work. Cheers.

JemmaSun commented 9 months ago

Hi! I've sent the reads via email. Please let me know if you cannot access them. Cheers.

raphenya commented 9 months ago

@JemmaSun Ok, I'm downloading the samples and I will test. Cheers.

raphenya commented 8 months ago

@JemmaSun what is the command you used for megahit assembly?

JemmaSun commented 8 months ago

Hi, I used the default settings of megahit:

megahit -1 "%$%_FORWARD_READ_FP_%$%" -2 "%$%_REVERSE_READ_FP_%$%" -t %$%_NUM_CPUS_%$% -o $OUTPUT_DIR

github-actions[bot] commented 6 months ago

Issue is stale and will be closed in 7 days unless there is new activity