DorresteinLaboratory / NAP_ProteoSAFe

Issue in Formatting a Custom Database for NAP - Webserver and Jupyter Notebook #7

Open andrekind17 opened 9 months ago

Hello, I am having problems formatting a custom database for a NAP job (the database is attached).

I get the same error both on the webserver at http://seriema.fcfrp.usp.br:5002/upload and when using the Jupyter notebook as described at https://github.com/DorresteinLaboratory/NAP_ProteoSAFe/?tab=readme-ov-file#jupyter-notebook-for-developers

Specifically, the error I get from the webserver says: Your task appear to have failed here is the error: "None of [Index(['inchikey', 'kingdom', 'superclass', 'class', 'subclass'], dtype='object')] are in the [columns]"

In the Jupyter notebook, I can run the first cells, but then I get an error in the following cell:

classy = classy[['inchikey', 'kingdom', 'superclass', 'class', 'subclass']]
classy.columns = ['inchikey', 'kingdom_name', 'superclass_name', 'class_name', 'subclass_name']

I think there might be a problem in the script itself.

The error log is the following:


KeyError                                  Traceback (most recent call last)
Cell In[35], line 1
----> 1 classy = classy[['inchikey', 'kingdom', 'superclass', 'class', 'subclass']]
      2 classy.columns = ['inchikey', 'kingdom_name', 'superclass_name', 'class_name', 'subclass_name']

File ~/mambaforge/envs/NAP_formatdb/lib/python3.12/site-packages/pandas/core/frame.py:3899, in DataFrame.__getitem__(self, key)
   3897     if is_iterator(key):
   3898         key = list(key)
-> 3899     indexer = self.columns._get_indexer_strict(key, "columns")[1]
   3901 # take() does not accept boolean indexers
   3902 if getattr(indexer, "dtype", None) == bool:

File ~/mambaforge/envs/NAP_formatdb/lib/python3.12/site-packages/pandas/core/indexes/base.py:6115, in Index._get_indexer_strict(self, key, axis_name)
   6112 else:
   6113     keyarr, indexer, new_indexer = self._reindex_non_unique(keyarr)
-> 6115 self._raise_if_missing(keyarr, indexer, axis_name)
   6117 keyarr = self.take(indexer)
   6118 if isinstance(key, Index):
   6119     # GH 42790 - Preserve name from an Index

File ~/mambaforge/envs/NAP_formatdb/lib/python3.12/site-packages/pandas/core/indexes/base.py:6176, in Index._raise_if_missing(self, key, indexer, axis_name)
   6174     if use_interval_msg:
   6175         key = list(key)
-> 6176     raise KeyError(f"None of [{key}] are in the [{axis_name}]")
   6178 not_found = list(ensure_index(key)[missing_mask.nonzero()[0]].unique())
   6179 raise KeyError(f"{not_found} not in index")

KeyError: "None of [Index(['inchikey', 'kingdom', 'superclass', 'class', 'subclass'], dtype='object')] are in the [columns]"
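As far as I understand, this message means that none of the five requested labels exist among the columns of `classy` when the cell runs. A minimal sketch of a check that could be run just before the failing cell is below. The names `EXPECTED` and `missing_columns` are hypothetical helpers for illustration and are not part of the NAP notebook.

```python
# Column names the failing notebook cell tries to select from `classy`.
EXPECTED = ["inchikey", "kingdom", "superclass", "class", "subclass"]


def missing_columns(columns, expected=EXPECTED):
    """Return the expected names that are absent from `columns`, in order.

    `columns` can be any iterable of column names, e.g. a DataFrame's
    `.columns` attribute or a plain list.
    """
    present = set(columns)
    return [name for name in expected if name not in present]


# Stand-in inputs, not real data: an empty table, and one that has every column.
print(missing_columns([]))
print(missing_columns(["inchikey", "kingdom", "superclass", "class", "subclass"]))
```

In the notebook this would be called as `missing_columns(classy.columns)`, together with `print(list(classy.columns))` to see which columns the table actually contains at that point.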

Can you help me with this?

Db_Test.txt