bowmanjeffs / paprica

paprica - PAthway PRediction by phylogenetIC plAcement

No output of Pathway and SUM_pathway file #101

Closed angelwyc closed 1 week ago

angelwyc commented 5 months ago

Dear Jeff,

Good day to you. I am writing about a problem with the PAPRICA package: for some samples I do not get the output csv files such as bacteria.ec, edge_data, pathway, sum_pathway, and unique_seq. PAPRICA runs well for most of my samples, but some of them do not produce the files I mentioned. This is the message that I received for the samples that are missing those files:

Traceback (most recent call last):
  File "/ibm/gpfs/home/rsadeq/angel/paprica/paprica/./paprica-tally_pathways.py", line 390, in <module>
    unique_csv.loc[unique, 'abundance_corrected'] = unique_csv.loc[unique, 'abundance'] / float(n16S)
                                                                                          ^^^^^^^^^^^
  File "/home/rsadeq/.conda/envs/paprica_new/lib/python3.12/site-packages/pandas/core/series.py", line 230, in wrapper
    raise TypeError(f"cannot convert the series to {converter}")
TypeError: cannot convert the series to <class 'float'>

I would greatly appreciate it if you could advise on how I should troubleshoot or proceed with this. Thank you. :)

bowmanjeffs commented 5 months ago

Can you send me one of the files that's failing and I'll take a look?

angelwyc commented 1 week ago

Dear Jeff,

I am so sorry, I've missed your message! Somehow I have managed to fix it by updating pandas to the latest version.

However, I somehow can't compile the data using the line of code from your tutorial: paprica-combine_results.py -domain bacteria -o 20240626out

This is the error code that I have received:

Traceback (most recent call last):
  File "/home/rsadeq/.local/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3805, in get_loc
    return self._engine.get_loc(casted_key)
  File "index.pyx", line 167, in pandas._libs.index.IndexEngine.get_loc
  File "index.pyx", line 196, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 7081, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 7089, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'nedge_corrected'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ibm/gpfs/home/rsadeq/angel/new/paprica/./paprica-combine_results.py", line 186, in <module>
    edge_data.loc[name, param + '.mean'], edge_data.loc[name, param + '.sd'] = fill_edge_data(param, name, temp_edge)
  File "/ibm/gpfs/home/rsadeq/angel/new/paprica/./paprica-combine_results.py", line 140, in fill_edge_data
    n = list(range(int(math.ceil(df_in.loc[index, 'nedge_corrected']))))
  File "/home/rsadeq/.local/lib/python3.9/site-packages/pandas/core/indexing.py", line 1183, in __getitem__
    return self.obj._get_value(*key, takeable=self._takeable)
  File "/home/rsadeq/.local/lib/python3.9/site-packages/pandas/core/frame.py", line 4214, in _get_value
    series = self._get_item_cache(col)
  File "/home/rsadeq/.local/lib/python3.9/site-packages/pandas/core/frame.py", line 4638, in _get_item_cache
    loc = self.columns.get_loc(item)
  File "/home/rsadeq/.local/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3812, in get_loc
    raise KeyError(key) from err
KeyError: 'nedge_corrected'

Any advice? I have updated pandas to the latest version, 2.2.2, and it still does not work. Have you encountered this error before?

Thank you! :)

bowmanjeffs commented 1 week ago

I need to fix that! It's trying to include an old output file in the aggregation of results and fails because the columns are different. Purge any old output files and it should work.
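
One way to spot the old output files before purging them is to check which edge_data files lack the `nedge_corrected` column named in the KeyError. The following is only a minimal sketch, not part of paprica: the function name `find_stale_edge_files` and the `*edge_data.csv` filename pattern are assumptions made for illustration, so adjust the pattern to match how your output files are actually named. It only lists candidate files and deletes nothing.

```python
import csv
from pathlib import Path


def find_stale_edge_files(directory, required_column="nedge_corrected",
                          pattern="*edge_data.csv"):
    """List CSV files matching `pattern` whose header row lacks `required_column`.

    Hypothetical helper: `pattern` is an assumed naming convention,
    not something confirmed by the paprica documentation.
    """
    stale = []
    for path in sorted(Path(directory).glob(pattern)):
        with path.open(newline="") as handle:
            # Read only the header row; an empty file yields an empty header.
            header = next(csv.reader(handle), [])
        if required_column not in header:
            stale.append(path)
    return stale


if __name__ == "__main__":
    # Print candidates in the current directory for manual review.
    for path in find_stale_edge_files("."):
        print(path)
```

Review the list by hand before removing anything, since a previously combined results file may match the same pattern.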

angelwyc commented 1 week ago

Oh! Thank you so much, it is working now :D