phac-nml / staramr

Scans genome contigs against the ResFinder, PlasmidFinder, and PointFinder databases.
Apache License 2.0

Unknown mutations in PointFinder? #121

Closed ireneortega closed 1 year ago

ireneortega commented 4 years ago

Why are unknown mutations not reported when using the PointFinder database? I would be very grateful if you could include this option too, as it is the main reason I would like to use staramr.

Thanks.

apetkau commented 4 years ago

Hello @ireneortega,

Which unknown mutations are these?

One possible reason is that the PointFinder database has been updated away from the default version. You can check the PointFinder database version by running staramr db info or by looking at the settings.txt output file. The default database versions should match these https://github.com/phac-nml/staramr/blob/master/staramr/databases/AMRDatabasesManager.py#L14.

The file that links mutations to the resistance phenotype (drug class) is maintained independently of the ResFinder/PointFinder databases, so an updated database can include many mutations that do not match the files maintained by staramr.
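As a rough illustration of the settings.txt check mentioned above, the following Python sketch reads "key = value" lines and prints the PointFinder database entries. The key names (pointfinder_db_commit, pointfinder_db_date) and the "key = value" layout are assumptions about the file format, and the values are placeholders, so compare against a real settings.txt before relying on it.

```python
import io


def read_settings(handle):
    """Parse 'key = value' lines into a dict, skipping lines without '='."""
    settings = {}
    for line in handle:
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


# Hypothetical excerpt: key names are assumed and values are placeholders.
sample = io.StringIO(
    "version = 0.5.1\n"
    "pointfinder_db_commit = <commit hash>\n"
    "pointfinder_db_date = <commit date>\n"
)

settings = read_settings(sample)
for key in sorted(settings):
    if key.startswith("pointfinder_db"):
        print(key, "->", settings[key])
```

With a real run, open("settings.txt") from the staramr output directory would take the place of the in-memory sample.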

Aaron

ireneortega commented 4 years ago

When running the PointFinder software, mutations are sorted into known and unknown mutations. Here is an example:

Mutation        Nucleotide change   Amino acid change   Resistance                      PMID
gyrA p.T86I     ACA -> ATA          T -> I              Nalidixic acid, Ciprofloxacin   11266291
gyrA p.R285K    AGG -> AAG          R -> K              Unknown                         -
23S r.296C>G    C -> G              RNA mutations       Unknown                         -
23S r.298G>A    G -> A              RNA mutations       Unknown                         -
23S r.327G>A    G -> A              RNA mutations       Unknown                         -
23S r.364G>C    G -> C              RNA mutations       Unknown                         -
23S r.554A>C    A -> C              RNA mutations       Unknown                         -
23S r.571T>G    T -> G              RNA mutations       Unknown                         -
23S r.1027A>G   A -> G              RNA mutations       Unknown                         -
23S r.1752T>C   T -> C              RNA mutations       Unknown                         -
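To make the known/unknown distinction in this output concrete, here is a minimal Python sketch that splits a few of the rows by their Resistance column. It assumes the PointFinder results are tab-separated and that "Unknown" and "-" belong to the Resistance and PMID columns respectively; the function name split_known_unknown is made up for this example.

```python
import csv
import io

# A few rows copied from the output above. The tab layout, including the
# split of "Unknown" and "-" into two columns, is an assumption.
POINTFINDER_TSV = (
    "Mutation\tNucleotide change\tAmino acid change\tResistance\tPMID\n"
    "gyrA p.T86I\tACA -> ATA\tT -> I\tNalidixic acid, Ciprofloxacin\t11266291\n"
    "gyrA p.R285K\tAGG -> AAG\tR -> K\tUnknown\t-\n"
    "23S r.296C>G\tC -> G\tRNA mutations\tUnknown\t-\n"
    "23S r.1752T>C\tT -> C\tRNA mutations\tUnknown\t-\n"
)


def split_known_unknown(tsv_text):
    """Return (known, unknown) lists of mutation names."""
    known, unknown = [], []
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        is_unknown = row["Resistance"].strip().lower() == "unknown"
        (unknown if is_unknown else known).append(row["Mutation"])
    return known, unknown


known, unknown = split_known_unknown(POINTFINDER_TSV)
print("known:  ", known)
print("unknown:", unknown)
```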

After updating the PointFinder database today, staramr finds only the first one, which corresponds to a known mutation:

Isolate ID  Gene         Predicted Phenotype                Type   Position  Mutation             %Identity  %Overlap  HSP Length/Total Length  Contig             Start  End
A           gyrA (T86I)  ciprofloxacin I/R, nalidixic acid  codon  86        ACA -> ATA (T -> I)  99.31      100.00    2592/2592                C_1_length_377804  38013  35422
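One way to see exactly which PointFinder mutations are absent from the staramr table is to reduce both notations to a (gene, change) pair and compare them. The sketch below does this for a few of the mutations quoted above. The helper names and the assumption that the two tools differ only in the "p."/"r." prefix and the parentheses are illustrative, not something confirmed by either tool's documentation.

```python
import re


def pointfinder_key(mutation):
    """'gyrA p.T86I' -> ('gyrA', 'T86I'); '23S r.296C>G' -> ('23S', '296C>G')."""
    gene, change = mutation.split(None, 1)
    return gene, re.sub(r"^[pr]\.", "", change)


def staramr_key(gene_field):
    """'gyrA (T86I)' -> ('gyrA', 'T86I')."""
    match = re.fullmatch(r"(\S+)\s+\((.+)\)", gene_field.strip())
    if match is None:
        raise ValueError(f"unexpected Gene field: {gene_field!r}")
    return match.group(1), match.group(2)


# Values taken from the two tables in this thread (subset of the 23S rows).
pointfinder_mutations = ["gyrA p.T86I", "gyrA p.R285K", "23S r.296C>G"]
staramr_genes = ["gyrA (T86I)"]

reported = {staramr_key(gene) for gene in staramr_genes}
missing = [m for m in pointfinder_mutations if pointfinder_key(m) not in reported]
print("not reported by staramr:", missing)
```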

I would like to get the same results as above. Would that be possible?

Although users can update the databases as explained, I encourage you to update the default version as well to keep staramr current. The ResFinder/PointFinder/PlasmidFinder databases haven't been updated since 2018, and users may be unaware of that.

apetkau commented 4 years ago

Thanks. Which organism and which version of staramr?

At the moment it's likely not possible to reproduce those same results from the PointFinder software with staramr, as something has likely changed in either the PointFinder software or its database to produce those results.

We will keep in mind updating the databases used by staramr; however, at the moment we do not have many resources to dedicate to this project. I will keep this issue open, though, until we can work on it again.

ireneortega commented 4 years ago

Campylobacter and staramr 0.5.1. I couldn't install the latest version with conda install -c bioconda staramr==0.7.1, as it gave me an error.

Thanks for your collaboration in developing staramr!

apetkau commented 4 years ago

Hmmm... Which error was it? Was it this one (https://github.com/phac-nml/staramr/issues/115)?

No problem. Thank you for reporting these issues.

ireneortega commented 4 years ago

No, I had in fact already solved that problem. It was something related to installing staramr with conda. The only command that worked for me was:

conda install -c bioconda staramr pandas==0.25.3

If I specify staramr==0.7.1, I get a conda error that I can't remember well. I think it was due to conflicts with other packages.
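When reporting an install problem like this, it can help to state which versions actually ended up in the environment. The following small Python sketch (Python 3.8 or later) queries the installed versions of staramr and pandas with the standard library. The helper name installed_version is made up for this example, and it only reports what is installed; it does not diagnose the conda conflict itself.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


for name in ("staramr", "pandas"):
    print(name, installed_version(name) or "not installed")
```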

apetkau commented 4 years ago

Thanks for reporting the info about the install error, @ireneortega. If you do end up encountering it again, let me know. I tested it out myself but did not encounter any error (maybe it was related to some conda package that has since been fixed).

apetkau commented 1 year ago

I am closing this issue.