Closed tseemann closed 5 years ago
We generally link via the protein accession (WP_001028144.1 in your example), but I can see we're inconsistent with the nucleotide accessions: the NG_ records don't have a version, but the GenBank accessions do. For the purposes of our database, protein sequences are primary, so we usually link via protein sequence/accession. We'll make things consistent in a future release.
Actually, the NG_nnnnnnn.v accessions do have version numbers, including v=2 and v=3. You have 31 of them?
% cut -f11 ReferenceGeneCatalog.txt | grep '\.[2-9]$' | uniq
NG_052583.3
NG_047295.2
NG_047307.2
NG_056002.2
NG_048749.2
NG_048791.2
NG_048905.2
NG_049041.2
NG_049089.2
NG_049323.2
NG_062218.2
NG_057591.2
NG_057597.2
NG_049984.2
NG_050235.2
NG_050242.2
NG_055993.2
NG_047699.2
NG_047784.2
NG_055651.2
NG_055784.2
NG_060581.2
NG_052176.2
NG_050472.2
NG_050504.2
NG_048128.2
NG_048275.2
NG_050504.2
NG_048128.2
NG_048275.2
NG_048525.2
NG_048542.2
NC_000913.3
NC_003197.2
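For anyone who wants to repeat this check outside the shell, here is a rough Python sketch of the same pipeline. One thing to be aware of: `uniq` only collapses adjacent duplicate lines, which is why NG_050504.2, NG_048128.2 and NG_048275.2 each show up twice in the output above. The sketch deduplicates across the whole file instead. It assumes, as the `cut -f11` call does, that the file is tab-separated with the nucleotide accession in the 11th column; the function name is made up for illustration.

```python
import re

# Mirrors grep '\.[2-9]$': the value must end in a dot followed by
# a single digit from 2 to 9.
VERSION_2_PLUS = re.compile(r"\.[2-9]$")


def accessions_with_later_versions(lines, column=11):
    """Return unique values from a 1-based tab-separated column that
    end in .2 through .9, in order of first appearance."""
    seen = set()
    result = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < column:
            continue  # no such column on this line
        accession = fields[column - 1]
        if VERSION_2_PLUS.search(accession) and accession not in seen:
            seen.add(accession)
            result.append(accession)
    return result


# Example use (assumes the catalog file is in the working directory):
#     with open("ReferenceGeneCatalog.txt") as handle:
#         for accession in accessions_with_later_versions(handle):
#             print(accession)
```

Passing an open file handle works because the function only iterates over lines.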
Yes, that's what I meant when I said "inconsistent". I had missed this previously because our internal use of AMR_CDS is limited. We'll get this fixed so the '.version' is included for our next database release.
Thanks @evolarjun
@tseemann This should be fixed in the latest release of the database (2019-10-30.1). Note that the FTP site paths have changed so that the software will not break by updating to a backwards-incompatible database version. The new site is at https://ftp.ncbi.nlm.nih.gov/pathogen/Antimicrobial_resistance/AMRFinderPlus/database/latest/
I hadn't mentioned before that the Reference Gene Catalog is made for external consumption, and should have the data you need. It also has a Web interface. New documentation for ReferenceGeneCatalog.txt is here.
We have a new version of AMRFinderPlus (3.2.1) compatible with this database that I encourage you to try out and let us know what issues you find. Your feedback is (almost) always appreciated. ;-)
Thanks again.
I look forward to annoy^H^H^H^H^Hhelping you in the future.
In AMR_CDS and some other files it is just NG_047055
But in ReferenceGeneCatalog.txt it is NG_047055.1
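Until the files agree, one way to match accessions across them is to split off the optional version suffix before comparing. A minimal sketch of that idea; `split_accession` and `same_record` are hypothetical helper names, not part of AMRFinderPlus or its database.

```python
def split_accession(accession):
    """Split 'NG_047055.1' into ('NG_047055', 1).

    An accession with no version suffix, such as 'NG_047055',
    yields ('NG_047055', None).
    """
    text = accession.strip()
    base, dot, version = text.rpartition(".")
    if dot and version.isdigit():
        return base, int(version)
    return text, None


def same_record(a, b):
    """True when two accessions share a base, ignoring any version."""
    return split_accession(a)[0] == split_accession(b)[0]
```

With these helpers, `same_record("NG_047055", "NG_047055.1")` is true, so the unversioned form in AMR_CDS can be matched against the versioned form in ReferenceGeneCatalog.txt.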