ncbi / amr

AMRFinderPlus - Identify AMR genes and point mutations, and virulence and stress resistance genes in assembled bacterial nucleotide and protein sequence.
https://www.ncbi.nlm.nih.gov/pathogens/antimicrobial-resistance/AMRFinder/
Other
256 stars 34 forks source link

amrfinder-3.11.2 test_amrfinder.sh :: failure #110

Closed EricDeveaud closed 1 year ago

EricDeveaud commented 1 year ago

fresh install of amrfinder-3.11.2 from tagged release

test fail see:

rpm_maker:amrfinder/amrfinder-3.11.2 > ./test_amrfinder.sh 
Testing ./amrfinder
Running: ./amrfinder --plus -p test_prot.fa -g test_prot.gff -O Escherichia
Software directory: '/opt/gensoft/src/amrfinder/amrfinder-3.11.2/'
Software version: 3.11.2
Database directory: '/opt/gensoft/data/amrfinder/3.11.2/2022-12-19.1'
Database version: 2022-12-19.1
AMRFinder protein-only and mutation search
  - include -n NUC_FASTA, --nucleotide NUC_FASTA and -g GFF_FILE, --gff GFF_FILE options to add translated searches
Running blastp...
Running hmmsearch...
Making report...
AMRFinder took 2 seconds to complete
Files test_prot.expected and test_prot.got differ
Test failed: 
  ./amrfinder --plus -p test_prot.fa -g test_prot.gff -O Escherichia > test_prot.got
  diff test_prot.expected test_prot.got 
13c13
< nimIJ_hmm     contigX 1       501     +       nimIJ   NimIJ family nitroimidazole resistance protein  core    AMRAMR     NITROIMIDAZOLE  NITROIMIDAZOLE  HMM     166     165     98.18   76.54   162     WP_005812825.1  NimIJ family nitroimidazole resistance protein     NF000262.1      NimIJ family nitroimidazole resistance protein
---
> nimIJ_hmm     contigX 1       501     +       nimIJ   NimIJ family 5-nitroimidazole reductase core    AMR     AMRNITROIMIDAZOLE  NITROIMIDAZOLE  HMM     166     165     98.18   76.54   162     WP_005812825.1  NimIJ family 5-nitroimidazole reductase    NF000262.1      NimIJ family 5-nitroimidazole reductase

guess DB changed since exptected results were generated and reference result was not regenerated since.

regards

EricDeveaud commented 1 year ago

NB test with dna input works as expected

evolarjun commented 1 year ago

Hi Eric,

I think you're right as to the cause.

The good thing is your install of AMRFinderPlus looks like it's working ok. The new, improved, name for genes identified with the nimIJ HMM is NimIJ family 5-nitroimidazole reductase. See https://www.ncbi.nlm.nih.gov/pathogens/genehierarchy/#nimIJ

We should revise instructions and/or the script to download the latest test results from github before running the script so that people don't run into this issue.

Please try re-downloading the test data and expected results and re-running the test, and let us know if you still have problems. E.g.,:

          BASE_URL="https://raw.githubusercontent.com/ncbi/amr/master"
          curl --silent -L \
             -O ${BASE_URL}/test_dna.fa \
             -O ${BASE_URL}/test_prot.fa \
             -O ${BASE_URL}/test_prot.gff \
             -O ${BASE_URL}/test_both.expected \
             -O ${BASE_URL}/test_dna.expected \
             -O ${BASE_URL}/test_dna_mut_all.expected \
             -O ${BASE_URL}/test_prot.expected \
             -O ${BASE_URL}/test_amrfinder.sh
          bash -x ./test_amrfinder.sh

Thanks for bringing this up.

Arjun

evolarjun commented 1 year ago

Hi Eric,

Thanks for your help, one other question occurred to me. You said "fresh install of amrfinder-3.11.2 from tagged release". Where did you get the source/binaries/distribution from?

I replaced the binary tarball attached to release 3.11.2 with one that contains the correct expected test output for the latest database, and the expected test output on the master branch in git was also updated. Did you get it from bioconda? I'm just trying to figure out the simplest and best way prevent this kind of false error report from our tests.

I'd prefer not to require another download for every test (which would fail for people running AMRFinderPlus from disconnected devices).

Thanks again for your report, Arjun

EricDeveaud commented 1 year ago

Yes I should have mentionned the name swithc to 5-nitroimidazole reductase, sorry to forgot.

I will download the new exoected results and run the test again. thanks for your reactivity.

by fresh install I mean

wget https://github.com/ncbi/amr/archive/refs/tags/amrfinder_v3.11.2.tar.gz

and then build the tool

we always tend to build our own tools for our cluster installation. use of binary is the last solution for us ;-)

evolarjun commented 1 year ago

Thanks, I forgot that the tagged version is completely static. Our installation tests don't use that, but check out whatever is on the master branch since we only alter the source code of that branch at release time (that way our automated public dev regression and prod installation tests are identical). We have a much more extensive regression suite that we run internally, but trigger manually since it's large, slow, requires extensive approvals for basically any change, and includes test data not intended for public consumption.

I didn't want to do a new software version for a change in the test result data because of a database update, but maybe I should next time something like that happens. It's not super common that database updates change results included in the test data. Sorry this bit you.

EricDeveaud commented 1 year ago

no problem is now documented. ;-) maybee a make test rule in the. main makefile to check to be run before a new tag//release will helpto avoid this corner case.

anyway it is now sorted and amrfinder is working as expected.

thanks again

Eric

evolarjun commented 1 year ago

@EricDeveaud We just finished releasing a new database version (2023-02-23.1) that again changes the test output. Re-downloading as above should fix the tests.

In the next release of AMRFinderPlus software a new test_database.sh will download fresh test data if possible prior to running the test, so this problem will be less of one. The other option is to distribute the test data with the database, but again software versions and database versions could both affect test output, so it would still be complicated.

Thanks again for bringing this to our attention.

EricDeveaud commented 1 year ago

thanks for the work put in this issue. I really appreciate.

regards

Eric