phac-nml / staramr

Scans genome contigs against the ResFinder, PlasmidFinder, and PointFinder databases.
Apache License 2.0

Coerce PMID column into string not integer #212

Open sgsutcliffe opened 4 months ago

sgsutcliffe commented 4 months ago

In older versions of pandas (e.g. 2.1.0), the PMID column of the PointFinder database was being read as an integer rather than the expected string. For example, with Klebsiella this caused the following error:

File "lib/python3.11/site-packages/staramr/blast/pointfinder/PointfinderBlastDatabase.py", line 128, in get_cge_pmid
    return self._pointfinder_info.get_value(gene, mutation, "PMID")
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/lib/python3.11/site-packages/staramr/blast/pointfinder/PointfinderDatabaseInfo.py", line 206, in get_value
    results = ';'.join(matches[attribute])
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: sequence item 0: expected str instance, int found

sgsutcliffe commented 4 months ago

We will fix this by coercing the PMID column to a string.
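
To illustrate the idea, here is a minimal sketch of the coercion in plain pandas. It is not the actual staramr patch: the table contents, the gene and mutation labels, the PMID values, and the helper `join_pmids` are all made up for the example, and the tab-separated layout is an assumption. It only shows that a column pandas infers as integer cannot be passed to `';'.join(...)`, while the same column read or cast as `str` can.

```python
import io

import pandas as pd

# Made-up, tab-separated stand-in for a PointFinder-style table.
# The gene names, mutations, and PMIDs are placeholders, not real data.
TABLE = (
    "gene\tmutation\tPMID\n"
    "geneA\tmut1\t11111111\n"
    "geneB\tmut2\t22222222\n"
)


def join_pmids(frame: pd.DataFrame) -> str:
    """Join the PMID column with ';' (hypothetical helper for this sketch)."""
    return ";".join(frame["PMID"])


# Default type inference: an all-numeric PMID column becomes an integer column.
inferred = pd.read_csv(io.StringIO(TABLE), sep="\t")

try:
    join_pmids(inferred)
    join_failed = False
except TypeError:
    # str.join only accepts strings, so integer PMIDs raise TypeError.
    join_failed = True

# Option 1: ask pandas to read the column as strings up front.
coerced = pd.read_csv(io.StringIO(TABLE), sep="\t", dtype={"PMID": str})

# Option 2: cast an already-loaded column to strings.
cast = inferred.assign(PMID=inferred["PMID"].astype(str))

print(join_failed)
print(join_pmids(coerced))
print(join_pmids(cast))
```

Passing `dtype` at read time keeps the values exactly as written in the file, whereas `astype(str)` converts whatever pandas already inferred. Which of the two fits staramr better depends on how the database is loaded there, which this sketch does not try to reproduce.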