Closed vappiah closed 1 year ago
I realized the problem was pandas. Version 2.0.1 was giving issues, but when I downgraded to 1.5.3 it worked. Only a warning message was displayed:
/home/bioinfocoach/apps/mamba/envs/staramr/lib/python3.11/site-packages/staramr/subcommand/Search.py:544: FutureWarning: DataFrame.set_axis 'inplace' keyword is deprecated and will be removed in a future version. Use obj = obj.set_axis(..., copy=False) instead
settings_dataframe.set_axis(['Value'], axis='columns', inplace=True)
/home/bioinfocoach/apps/mamba/envs/staramr/lib/python3.11/site-packages/staramr/subcommand/Search.py:200: FutureWarning: save is not part of the public API, usage can give unexpected results and will be removed in a future version
writer.save()
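The first warning already names its replacement: reassign the result of `set_axis` instead of passing `inplace=True`. A minimal sketch of that form, assuming a single-column DataFrame like the `settings_dataframe` in the warning (the sample rows below are hypothetical, not staramr's real settings):

```python
import pandas as pd

# Hypothetical stand-in for the settings table: one unnamed column.
settings_dataframe = pd.DataFrame(
    {0: ["0.9.1", "98.0"]}, index=["version", "pid_threshold"]
)

# set_axis returns a new DataFrame, so reassign rather than use inplace=True.
settings_dataframe = settings_dataframe.set_axis(["Value"], axis="columns")

print(list(settings_dataframe.columns))
```

For the second warning, the pandas documentation recommends using `pd.ExcelWriter` as a context manager, or calling `writer.close()`, rather than `writer.save()`.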
Thanks for reporting this issue. We will have to fix this for our next release to make sure it's compatible with pandas >= 2.
I'm not convinced that this particular error is directly related to the version of Pandas. The original error is raised in a method in the ARGDrugTable:
def _drug_string_to_correct_separators(self, drug):
    """
    Converts a drug string (separated by commas) to use correct separators/spacing.
    :param drug: The drug string.
    :return: The drug string with correct separators/spacing.
    """
    return ', '.join(drug.split(','))
Basically, the method takes the drug (phenotype) and replaces "," characters with ", " (if they exist). It fails in the original error message because drug isn't a string but a float, which suggests that there was a problem with one of the entries in the ARG drug table file. Maybe one of the entries in the drug column is a float, or something in loading the table didn't work correctly.
I do see that some entries are None, so maybe something differs between versions of Pandas when loading them. I'm going to add a bit of safety checking soon to hopefully prevent similar errors in the future.
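The kind of safety check described here could look roughly like the sketch below. This is an illustration only, not the change made in staramr: the standalone function name `drug_string_to_correct_separators`, the choice to return `None` for missing values, and the extra whitespace stripping are all assumptions.

```python
import math


def drug_string_to_correct_separators(drug):
    """Normalize a comma-separated drug string; return None for missing values."""
    # Missing entries can arrive as None or as a float NaN; neither has .split().
    if drug is None or (isinstance(drug, float) and math.isnan(drug)):
        return None
    if not isinstance(drug, str):
        raise TypeError(f"Expected a drug string, got {type(drug).__name__}")
    # Strip each piece so that both "a,b" and "a, b" become "a, b".
    return ", ".join(part.strip() for part in drug.split(","))


print(drug_string_to_correct_separators("ciprofloxacin,nalidixic acid"))
print(drug_string_to_correct_separators(float("nan")))
```

With this guard, a missing entry is passed through as `None` instead of raising `AttributeError: 'float' object has no attribute 'split'`, and any other unexpected type fails with a more descriptive error.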
It looks like the specific issue is that by default in pandas<2, "None" (which appears in the ARG drug table) is loaded as a string. However, by default in pandas>=2, "None" is loaded as a pd.NA value.
This causes problems when trying to parse the strings in the function I mentioned previously, because one is a string and the other is not. I'm going to try to resolve the issue by always loading "None" entries as pd.NA values for this particular table.
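One way to make the loading behaviour explicit rather than version-dependent is the `na_values` parameter of `pandas.read_csv`. A minimal sketch, assuming a tab-separated table; the column names and rows are made up for illustration and are not the real ARG drug table, and this is not necessarily how the fix was implemented:

```python
import io

import pandas as pd

# Hypothetical two-row excerpt; the second row has the literal text "None" as its drug.
table_text = "gene\tdrug\ngyrA\tciprofloxacin,nalidixic acid\nparC\tNone\n"

# Listing "None" in na_values marks it as missing regardless of the pandas default.
table = pd.read_csv(io.StringIO(table_text), sep="\t", na_values=["None"])

# Downstream code can then check for missing values before calling string methods.
for gene, drug in zip(table["gene"], table["drug"]):
    label = "no drug listed" if pd.isna(drug) else ", ".join(drug.split(","))
    print(gene, "->", label)
```

The opposite choice, keeping "None" as a plain string on every version, would instead need `keep_default_na=False` together with an explicit `na_values` list.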
Fixed in #194
Dear Developers,
I installed staramr (0.9.1) on an Ubuntu 20.04 system using mamba. When I try to run staramr (I am following the tutorial on your GitHub page), I get this error message. Please advise.
No --plasmidfinder-database-type specified. Will search the entire PlasmidFinder database
2023-04-24 21:56:59 INFO: --output-dir set. All files will be output to [output]
2023-04-24 21:56:59 INFO: Will exclude ResFinder/PointFinder genes listed in [/home/bioinfocoach/apps/mamba/envs/staramr/lib/python3.11/site-packages/staramr/databases/exclude/data/genes_to_exclude.tsv]. Use --no-exclude-genes to disable
2023-04-24 21:56:59 INFO: Making BLAST databases for input files
2023-04-24 21:57:00 INFO: Scheduling blasts and MLST for isolate1.fasta
2023-04-24 21:57:00 INFO: Scheduling blasts and MLST for isolate2.fasta
2023-04-24 21:57:20 ERROR: 'float' object has no attribute 'split'
Traceback (most recent call last):
File "/home/bioinfocoach/apps/mamba/envs/staramr/bin/staramr", line 68, in
args.run_command(args)
File "/home/bioinfocoach/apps/mamba/envs/staramr/lib/python3.11/site-packages/staramr/subcommand/Search.py", line 467, in run
results = self._generate_results(database_repos=database_repos,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/bioinfocoach/apps/mamba/envs/staramr/lib/python3.11/site-packages/staramr/subcommand/Search.py", line 287, in _generate_results
amr_detection.run_amr_detection(files,pid_threshold, plength_threshold_resfinder,
File "/home/bioinfocoach/apps/mamba/envs/staramr/lib/python3.11/site-packages/staramr/detection/AMRDetection.py", line 194, in run_amr_detection
self._pointfinder_dataframe = self._create_pointfinder_dataframe(pointfinder_blast_map, pid_threshold,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/bioinfocoach/apps/mamba/envs/staramr/lib/python3.11/site-packages/staramr/detection/AMRDetectionResistance.py", line 56, in _create_pointfinder_dataframe
return pointfinder_parser.parse_results()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/bioinfocoach/apps/mamba/envs/staramr/lib/python3.11/site-packages/staramr/blast/results/BlastResultsParser.py", line 67, in parse_results
self._handle_blast_hit(file, database_name, blast_out, results, hit_seq_records)
File "/home/bioinfocoach/apps/mamba/envs/staramr/lib/python3.11/site-packages/staramr/blast/results/BlastResultsParser.py", line 109, in _handle_blast_hit
blast_results = self._get_result_rows(hit, database_name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/bioinfocoach/apps/mamba/envs/staramr/lib/python3.11/site-packages/staramr/blast/results/pointfinder/BlastResultsParserPointfinder.py", line 98, in _get_result_rows
results.append(self._get_result(hit, db_mutation))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/bioinfocoach/apps/mamba/envs/staramr/lib/python3.11/site-packages/staramr/blast/results/pointfinder/BlastResultsParserPointfinderResistance.py", line 55, in _get_result
drug = self._arg_drug_table.get_drug(self._blast_database.get_organism(), hit.get_amr_gene_id(),
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/bioinfocoach/apps/mamba/envs/staramr/lib/python3.11/site-packages/staramr/databases/resistance/pointfinder/ARGDrugTablePointfinder.py", line 40, in get_drug
return self._drug_string_to_correct_separators(drug.iloc[0])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/bioinfocoach/apps/mamba/envs/staramr/lib/python3.11/site-packages/staramr/databases/resistance/ARGDrugTable.py", line 44, in _drug_string_to_correct_separators
return ', '.join(drug.split(','))
^^^^^^^^^^
AttributeError: 'float' object has no attribute 'split'