ncbi / amr

AMRFinderPlus - Identify AMR genes and point mutations, and virulence and stress resistance genes in assembled bacterial nucleotide and protein sequence.
https://www.ncbi.nlm.nih.gov/pathogens/antimicrobial-resistance/AMRFinder/

Allow download of specific database versions #139

Open marchoeppner opened 4 months ago

marchoeppner commented 4 months ago

Hi,

New AMRFinderPlus user here - very useful tool, thanks!

I may be missing something, but it appears that the only way to automatically download and build the database is through the amrfinder_update executable. As I read the code, it always downloads the latest release of the database. While this is mostly fine, it does not fit well with the concept of versioned workflows: I can guarantee that all the data processing up to the AMRFinderPlus stage is always identical, but the database may differ depending on when the user installed the workflow. This is not a very good situation imho.

Any chance that the database installation function could be changed to accept specific release tags or similar? (I guess a workaround may be to just download the un-indexed database files "manually" and figure out the indexing commands, but it would be nice to have that be part of the default installation procedure.)

Cheers Marc

vbrover commented 4 months ago

> but the database may differ depending on when the user installed the workflow. This is not a very good situation imho.

The code of AMRFinderPlus itself also depends on when the user installed the software.

> Any chance that the database installation function could be changed to accept specific release tags or similar?

We do not test this functionality, but theoretically it is possible: any previous release of AMRFinderPlus is available, and any previous release of the database can be downloaded from https://ftp.ncbi.nlm.nih.gov/pathogen/Antimicrobial_resistance/AMRFinderPlus/database/. Not every software-database combination will work, though.

> I guess a workaround may be to just download the un-indexed database files "manually" and figure out the indexing commands,

"Indexing" is only running makeblastdb and hmmpress.
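
For readers who want to see what that amounts to, here is a rough sketch that only builds the two kinds of command lines and does not run them. The file names in the usage lines are hypothetical placeholders, not the real names inside an AMRFinderPlus database release, and the function name is made up for illustration. The `-in` and `-dbtype` options of makeblastdb and the `-f` (overwrite) option of hmmpress are standard options of those tools.

```python
from pathlib import Path
from typing import Iterable, List


def indexing_commands(
    protein_fasta: Path,
    nucleotide_fastas: Iterable[Path],
    hmm_library: Path,
) -> List[List[str]]:
    """Build (but do not run) makeblastdb and hmmpress command lines.

    ASSUMPTION: which files need which index is up to the caller; this
    helper does not know the layout of a real AMRFinderPlus database.
    """
    commands = [["makeblastdb", "-in", str(protein_fasta), "-dbtype", "prot"]]
    for fasta in nucleotide_fastas:
        commands.append(["makeblastdb", "-in", str(fasta), "-dbtype", "nucl"])
    # -f lets hmmpress overwrite index files left over from an earlier run.
    commands.append(["hmmpress", "-f", str(hmm_library)])
    return commands


# Hypothetical file names, for illustration only.
for cmd in indexing_commands(
    Path("db/proteins.fa"), [Path("db/cds.fa")], Path("db/models.hmm")
):
    print(" ".join(cmd))
```

Each list could be passed to `subprocess.run(cmd, check=True)`, although the supported route is probably the tool's own indexing command rather than calling these by hand.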

evolarjun commented 4 months ago

To add to what Slava said above, you can download the database files from our FTP site and run amrfinder_index <database_dir> to (re)index the database you downloaded (see also the documentation).
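
A minimal sketch of how a workflow might pin this, assuming the FTP site organises releases as `<format_version>/<db_version>/` under the database URL given above (worth checking against the actual directory listing). The version strings in the usage lines are illustrative and the helper names are made up. Only `amrfinder_index <database_dir>` comes from this thread.

```python
import subprocess
from pathlib import Path
from typing import List

DATABASE_ROOT = (
    "https://ftp.ncbi.nlm.nih.gov/pathogen/Antimicrobial_resistance/"
    "AMRFinderPlus/database/"
)


def pinned_database_url(format_version: str, db_version: str) -> str:
    """Return the URL of one specific database release.

    ASSUMPTION: releases live at <root>/<format_version>/<db_version>/.
    """
    return f"{DATABASE_ROOT}{format_version}/{db_version}/"


def index_command(database_dir: Path) -> List[str]:
    """Command line that (re)indexes an already downloaded database."""
    return ["amrfinder_index", str(database_dir)]


def run_index(database_dir: Path) -> None:
    """Run the indexing step; raises CalledProcessError if it fails."""
    subprocess.run(index_command(database_dir), check=True)


# Illustrative versions only; pick the release your workflow should pin.
print(pinned_database_url("3.10", "2021-08-11.1"))
print(" ".join(index_command(Path("amrfinder_db/2021-08-11.1"))))
```

The download itself is left to whatever tool the workflow already uses; the point is that the version is an explicit input instead of "whatever is latest".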

We also publish Docker containers on Docker Hub that freeze a specific combination of software and database. One is built on every database and/or software release, starting with 3.10.14-2021-08-11.1 (see https://hub.docker.com/r/ncbi/amr/tags).
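
As a small illustration of pinning via these images, the sketch below assumes the tags follow a `<software_version>-<database_version>` pattern, which is inferred from the example tag above and not a documented guarantee. The helper names are made up.

```python
from typing import Tuple


def split_image_tag(tag: str) -> Tuple[str, str]:
    """Split a tag such as '3.10.14-2021-08-11.1' into (software, database).

    ASSUMPTION: the software version contains no '-', so the first '-'
    separates it from the database version.
    """
    software, _, database = tag.partition("-")
    if not software or not database:
        raise ValueError(f"unexpected tag format: {tag!r}")
    return software, database


def image_reference(software: str, database: str, repo: str = "ncbi/amr") -> str:
    """Build a fully pinned image reference for a workflow definition."""
    return f"{repo}:{software}-{database}"


software, database = split_image_tag("3.10.14-2021-08-11.1")
print(software, database)
print(image_reference(software, database))
```

Referencing the image by an explicit tag like this, and never by `latest`, keeps both the software and the database fixed for a given workflow release.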

marchoeppner commented 4 months ago

Thanks, I will look into the download and re-indexing option.

My workflow uses bioconda and biocontainers (depending on the user preference), so the version of AMRFinderPlus is basically "locked". The next step would be to also lock the database version so everything is fully version-controlled.