phac-nml / staramr

Scans genome contigs against the ResFinder, PlasmidFinder, and PointFinder databases.
Apache License 2.0

staramr fails with "KeyError: 'Predicted Phenotype'" #115

Closed (apetkau closed this issue 3 years ago)

apetkau commented 4 years ago

When running staramr on any genome, I get the following error:

2020-02-12 16:04:35,752 INFO: Scheduling blasts for SRR1952908.fasta
2020-02-12 16:04:36,591 ERROR: 'Predicted Phenotype'
...
  File "pandas/_libs/index.pyx", line 90, in pandas._libs.index.IndexEngine.get_value
  File "pandas/_libs/index.pyx", line 133, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 157, in pandas._libs.index.IndexEngine._get_loc_duplicates
KeyError: 'Predicted Phenotype'

This appears to be caused by recent versions of the pandas library. Downgrading pandas to 0.25.3 works around it (e.g., with conda install pandas==0.25.3).

The staramr code should likely be updated to support more recent pandas versions.

kapsakcj commented 4 years ago

Hey @apetkau - just FYI, I hit this issue as well with my Docker image for staramr 0.7.1. Running pip3 install staramr==0.7.1 installs pandas 1.0.3.

When I tested staramr, I hit an error similar to the one you described above:

...
2020-05-04 20:39:48 INFO: Scheduling blasts and MLST for contigs.fasta
2020-05-04 20:39:51 WARNING: No drug found for organism=salmonella, gene=parC, position=57
2020-05-04 20:39:51 ERROR: 'Predicted Phenotype'
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/pandas/core/indexes/base.py", line 4410, in get_value
    return libindex.get_value_at(s, key)
  File "pandas/_libs/index.pyx", line 44, in pandas._libs.index.get_value_at
  File "pandas/_libs/index.pyx", line 45, in pandas._libs.index.get_value_at
  File "pandas/_libs/util.pxd", line 98, in pandas._libs.util.get_value_at
  File "pandas/_libs/util.pxd", line 83, in pandas._libs.util.validate_indexer
TypeError: 'str' object cannot be interpreted as an integer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
...

The problem was resolved when I downgraded to pandas 0.25.3 inside my container.

I also tested the biocontainer available on quay.io (quay.io/biocontainers/staramr:0.7.1--py_1), which worked fine with no errors. That container has pandas 0.25.3.

Hope this helps, and thanks for your help earlier today with the PyPI package!

apetkau commented 4 years ago

Thanks for that. Yes, this is still an issue that I haven't had a chance to fix yet.

kapsakcj commented 4 years ago

No worries. Just wanted to make you aware in case it would help future development.

It might be worth adding something to the dependencies list and/or install instructions in the README to specify the compatible pandas versions.
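As a rough illustration of what such a compatibility note could look like in code, here is a minimal sketch of a version check. The function names and the cutoff are hypothetical and not part of staramr. The cutoff of 1.0.0 is an assumption based only on the reports in this thread (0.25.3 works, 1.0.3 fails), not a tested boundary.

```python
import re

# Assumption: treat every pandas release from 1.0.0 onward as unverified.
# This thread only confirms that 0.25.3 works and 1.0.3 fails.
FIRST_UNVERIFIED = (1, 0, 0)


def parse_version(version):
    """Return the leading numeric part of a version string as a 3-tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    parts = tuple(int(part) for part in match.group(0).split("."))
    # Pad short versions such as "1.0" so tuple comparison behaves as expected.
    return parts + (0,) * max(0, 3 - len(parts))


def pandas_version_is_known_good(version):
    """True if the version is below the (assumed) first unverified release."""
    return parse_version(version) < FIRST_UNVERIFIED
```

In practice the string passed in would be pandas.__version__, and a False result could trigger a warning that points users at the downgrade workaround.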

apetkau commented 4 years ago

Regarding this issue, note that I have updated the bioconda packages for versions 0.7.1 and 0.4.0 so that the correct version of pandas gets installed. This would still be a problem when installing via pip, and I have not yet had a chance to fix the code so that it works with newer versions of pandas.

javiertognarelli commented 3 years ago

Hi @apetkau, I just got stuck on this issue. While debugging the error, I found that the following changes fixed it for me (using pandas 1.0.5):

  1. file AMRDetectionSummaryResistance.py line 25: flattened_phenotype_list = [y.strip() for x in dataframe.get('Predicted Phenotype').tolist() for y in x.split(self.SEPARATOR)]

  2. file AMRDetectionSummary.py line 45: lambda x: {'Gene': (self.SEPARATOR + ' ').join(x.get('Gene'))})

It seems pandas 1.0.5 sometimes doesn't accept df['key'], whereas df.get('key') works.
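For readers unfamiliar with the two access styles, here is a small sketch of their general difference on a missing column. It does not reproduce the staramr failure, whose root cause is not established in this thread. The DataFrame contents and the helper name are made up for illustration.

```python
import pandas as pd

# Illustrative data only; the column name mirrors the one in the error message.
df = pd.DataFrame({"Gene": ["parC", "gyrA"]})


def bracket_lookup(frame, column):
    """Look a column up with [] and report a KeyError as None."""
    try:
        return frame[column]
    except KeyError:
        return None


# frame[column] raises KeyError when the column is absent ...
bracket_result = bracket_lookup(df, "Predicted Phenotype")

# ... while frame.get(column) returns a default (None) instead of raising.
get_result = df.get("Predicted Phenotype")

# For a column that exists, .get returns the same Series as [].
present_values = df.get("Gene").tolist()
```

Note that .get only avoids the KeyError itself. Code that goes on to call a method such as .tolist() on the result still needs the column to be present.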

Best regards,

apetkau commented 3 years ago

That's awesome. Thanks so much @javiertognarelli 😄. We can incorporate this fix and release an update (or you can submit a pull request with the fixed code if you want).

javiertognarelli commented 3 years ago

I'm glad you like it. I'm still new to GitHub and I'm using staramr from conda, so I guess it'd be better if you do it. Cheers.

apetkau commented 3 years ago

@javiertognarelli sounds great. Thanks so much 😄

apetkau commented 3 years ago

Fixed in https://github.com/phac-nml/staramr/pull/126

ValentinCledassou commented 2 years ago

Hello, I have the same problem with staramr on Galaxy.org (Galaxy Version 0.5.1) and Galaxy.eu (Galaxy Version 0.7.2+galaxy0): "KeyError: 'Predicted Phenotype'".