ncbi / amr

AMRFinderPlus - Identify AMR genes and point mutations, and virulence and stress resistance genes in assembled bacterial nucleotide and protein sequence.
https://www.ncbi.nlm.nih.gov/pathogens/antimicrobial-resistance/AMRFinder/

Empty NUC_FASTA_OUT #115

Closed GomathiNayagam closed 1 year ago

GomathiNayagam commented 1 year ago

Hi, I am trying to find AMR genes in NUC_FASTA. Although AMRFinder identifies AMR genes in the input, it writes an empty NUC_FASTA_OUT (the --nucleotide_output file). I am probably making a silly error here. Could you help me out?

amrfinder --nucleotide $NUC_FASTA --ident_min 0.9 --coverage_min 0.89 --nucleotide_output out.fasta --threads 6 --output amrout.tsv

evolarjun commented 1 year ago

Hi @GomathiNayagam ,

I don't see a silly error, and I'm having trouble reproducing your issue. With the test data distributed with AMRFinderPlus I get FASTA output:

$ NUC_FASTA=test_dna.fa
$ amrfinder --nucleotide $NUC_FASTA  --ident_min 0.9 --coverage_min 0.89 --nucleotide_output out.fasta --threads 6 --output amrout.tsv
Running: amrfinder --nucleotide test_dna.fa --ident_min 0.9 --coverage_min 0.89 --nucleotide_output out.fasta --threads 6 --output amrout.tsv
Software directory: '/panfs/pan1.be-md.ncbi.nlm.nih.gov/bacterial_pathogens/backup/packages/AMRFinderPlus_v3.11.4/'
Software version: 3.11.4
Database directory: '/panfs/pan1.be-md.ncbi.nlm.nih.gov/bacterial_pathogens/backup/packages/AMRFinderPlusData/2023-02-23.1'
Database version: 2023-02-23.1
AMRFinder translated nucleotide search
  - include -O ORGANISM, --organism ORGANISM option to add mutation searches and suppress common proteins
Running blastx ...
Making report ...
AMRFinder took 9 seconds to complete

The results look as I would expect:

$ wc -l amrout.tsv
6 amrout.tsv
$ fgrep -c '>' out.fasta
5
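The same sanity check can be sketched in Python for anyone who wants to script it. This is only an illustration: the function names are hypothetical, and it assumes the first non-blank line of the report is a column header, which is consistent with the 6 lines versus 5 records shown above.

```python
def count_hits(report_lines):
    """Count data rows in a tab-delimited report.

    Blank lines are skipped. Assumption: the first non-blank line is a
    column header, so it is not counted as a hit.
    """
    rows = [line for line in report_lines if line.strip()]
    return max(len(rows) - 1, 0)


def count_fasta_records(fasta_lines):
    """Count FASTA records by counting lines that start with '>'."""
    return sum(1 for line in fasta_lines if line.startswith(">"))


def fasta_output_missing(report_lines, fasta_lines):
    """Return True for the symptom in this issue: hits reported, no sequences written."""
    return count_hits(report_lines) > 0 and count_fasta_records(fasta_lines) == 0
```

To use it on real files, read amrout.tsv and out.fasta into lists of lines (for example with open(...).read().splitlines()) and pass them in. Note that fgrep -c '>' counts any line containing '>', while this sketch only counts lines that begin with it.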

Could you paste in what AMRFinderPlus prints to the screen and attach your $NUC_FASTA file? That might help me reproduce your issue.

Thanks, Arjun

GomathiNayagam commented 1 year ago

Hi, I tried the test data too, and it does produce the nucleotide output. But it doesn't work for my file, so I think the problem is in my file. Please find my FASTA file attached.

This is what AMRFinder prints out:

Software directory: '/home/gomathinayagam/miniconda3/envs/amrfinder/bin/'
Software version: 3.11.4
Database directory: '/home/gomathinayagam/miniconda3/envs/amrfinder/share/amrfinderplus/data/2023-02-23.1'
Database version: 2023-02-23.1
AMRFinder translated nucleotide search
  - include -O ORGANISM, --organism ORGANISM option to add mutation searches and suppress common proteins
Running tblastn ...
Making report ...
AMRFinder took 16 seconds to complete

test.gz

vbrover commented 1 year ago

Thank you for reporting this! It is a bug in amrfinder: a leading underscore is trimmed from the contig name in the report. This will be fixed in version 3.11.5.

GomathiNayagam commented 1 year ago

Yes, it is the 'unusual' FASTA header. It works after editing the headers. Thank you very much!
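For anyone hitting the same problem before upgrading, the header edit can be sketched in Python. This is a minimal illustration, not the exact edit used here: it assumes the only problematic headers are those whose contig name starts with an underscore, and the function name and the "ctg" prefix are made up for the example.

```python
import re

# Matches a FASTA header: '>' + contig name (no whitespace) + optional description.
HEADER_RE = re.compile(r">(\S*)(.*)")


def sanitize_headers(lines, prefix="ctg"):
    """Prefix contig names that start with '_' so the name no longer begins with it.

    Returns (new_lines, renamed), where renamed maps each new contig name
    back to its original name so results can be translated afterwards.
    """
    new_lines = []
    renamed = {}
    for line in lines:
        line = line.rstrip("\n")
        match = HEADER_RE.match(line) if line.startswith(">") else None
        if match and match.group(1).startswith("_"):
            old_name, rest = match.group(1), match.group(2)
            new_name = prefix + old_name
            renamed[new_name] = old_name
            line = ">" + new_name + rest
        new_lines.append(line)
    return new_lines, renamed
```

Prefixing rather than stripping the underscore avoids accidental name collisions in the common case (for example, "_contig1" and "contig1" in the same file), and the returned mapping lets you restore the original names in the report.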

evolarjun commented 1 year ago

@GomathiNayagam I'm glad to hear it's working for you.

I'm going to reopen this just for our tracking, because we consider this a bug and plan to release a fix with the next AMRFinderPlus software release.

Thanks for reporting!

evolarjun commented 1 year ago

The bug itself was fixed in release 3.11.8. Additionally, as of release 3.11.14 we changed the behavior so that no empty -o file is created if there is an error.
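To check which side of these releases an installed version falls on, a small comparison like the following may help. It is a sketch with hypothetical function names, and it assumes the version is a plain dotted string of integers such as the "3.11.4" printed in the logs above.

```python
HEADER_FIX = (3, 11, 8)        # release that fixed the leading-underscore bug
NO_EMPTY_OUTPUT = (3, 11, 14)  # release that stopped creating an empty -o file on error


def parse_version(text):
    """Turn a dotted version string like '3.11.4' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def has_header_fix(version_text):
    """True if the version is at or after the release with the bug fix."""
    return parse_version(version_text) >= HEADER_FIX


def skips_empty_output(version_text):
    """True if the version is at or after the release with the -o behavior change."""
    return parse_version(version_text) >= NO_EMPTY_OUTPUT
```

Comparing integer tuples matters here: as plain strings, "3.11.14" sorts before "3.11.8", which would give the wrong answer.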