tibbdc / ECMpy

MIT License

get_reaction_kcat_mw() resulting in KeyError: 'kcat_MW' #9

Open mheydasch opened 2 months ago

mheydasch commented 2 months ago

I'm running the scripts unmodified and get stuck at step 9 of the notebook 01.get_reaction_kcat_using_AutoPACMEN.ipynb, when calling get_reaction_kcat_mw(model, autopacmen_folder, project_name, reaction_gap_fill, gene_subnum_path, reaction_kcat_mw_path):

{
    "name": "KeyError",
    "message": "'kcat_MW'",
    "stack": "---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
~/miniconda3/envs/ECMpy2/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   3360             try:
-> 3361                 return self._engine.get_loc(casted_key)
   3362             except KeyError as err:

~/miniconda3/envs/ECMpy2/lib/python3.7/site-packages/pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

~/miniconda3/envs/ECMpy2/lib/python3.7/site-packages/pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'kcat_MW'

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
/tmp/ipykernel_2323/2930141894.py in <module>
    202     #display(reaction_kcat_mw_df_T)
    203 
--> 204 get_reaction_kcat_mw_debug(model,autopacmen_folder, project_name, reaction_gap_fill,gene_subnum_path,reaction_kcat_mw_path)
    205 
    206 

/tmp/ipykernel_2323/2930141894.py in get_reaction_kcat_mw_debug(model, project_folder, project_name, type_of_default_kcat_selection, enzyme_unit_number_file, json_output_file)
    183     print('model_reaction_id:', model_reaction_id)
    184     print('reaction_mw: ',reaction_mw)
--> 185     reaction_kcat_mw_df_T_select=reaction_kcat_mw_df_T[abs(reaction_kcat_mw_df_T.loc['kcat_MW'])>0]
    186     reaction_kcat_mw_df_T_select.to_csv(json_output_file)
    187 

~/miniconda3/envs/ECMpy2/lib/python3.7/site-packages/pandas/core/indexing.py in __getitem__(self, key)
    929 
    930             maybe_callable = com.apply_if_callable(key, self.obj)
--> 931             return self._getitem_axis(maybe_callable, axis=axis)
    932 
    933     def _is_scalar_access(self, key: tuple):

~/miniconda3/envs/ECMpy2/lib/python3.7/site-packages/pandas/core/indexing.py in _getitem_axis(self, key, axis)
   1162         # fall thru to straight lookup
   1163         self._validate_key(key, axis)
-> 1164         return self._get_label(key, axis=axis)
   1165 
   1166     def _get_slice_axis(self, slice_obj: slice, axis: int):

~/miniconda3/envs/ECMpy2/lib/python3.7/site-packages/pandas/core/indexing.py in _get_label(self, label, axis)
   1111     def _get_label(self, label, axis: int):
   1112         # GH#5667 this will fail if the label is not present in the axis.
-> 1113         return self.obj.xs(label, axis=axis)
   1114 
   1115     def _handle_lowerdim_multi_index_axis0(self, tup: tuple):

~/miniconda3/envs/ECMpy2/lib/python3.7/site-packages/pandas/core/generic.py in xs(self, key, axis, level, drop_level)
   3774                 raise TypeError(f\"Expected label or tuple of labels, got {key}\") from e
   3775         else:
-> 3776             loc = index.get_loc(key)
   3777 
   3778             if isinstance(loc, np.ndarray):

~/miniconda3/envs/ECMpy2/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   3361                 return self._engine.get_loc(casted_key)
   3362             except KeyError as err:
-> 3363                 raise KeyError(key) from err
   3364 
   3365         if is_scalar(key) and isna(key) and not self.hasnans:

KeyError: 'kcat_MW'"
}

To troubleshoot the issue, I added the following lines to the function and ran it separately:

    print('eachgene: ', eachgene)
    print('and in gene reaction rule: ', re.search(' and ',reaction.gene_reaction_rule))
    print('eachgene in protein mapping keys: ', eachgene in protein_id_mass_mapping.keys())
    print('ID in keys: ', model_reaction_id in reaction_mw.keys())
    print('reaction_kcat:  ', reaction_kcat)
    print('model_reaction_id:', model_reaction_id)
    print('reaction_mw: ',reaction_mw)
    reaction_kcat_mw_df_T_select=reaction_kcat_mw_df_T[abs(reaction_kcat_mw_df_T.loc['kcat_MW'])>0]
    reaction_kcat_mw_df_T_select.to_csv(json_output_file)

The output was the following:


Default kcat is: 4425.788565013427
eachgene:  b1377
and in gene reaction rule:  None
eachgene in protein mapping keys:  False
ID in keys:  False
reaction_kcat:   4425.788565013427
model_reaction_id: AI2tex_reverse_num4
reaction_mw:  {}
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
~/miniconda3/envs/ECMpy2/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   3360             try:
-> 3361                 return self._engine.get_loc(casted_key)
   3362             except KeyError as err:

maozhitao commented 3 weeks ago

The AutoPACMEN process depends on your internet connection. The error you encountered seems to stem from a failure to retrieve the molecular weight (MW) of the protein from UniProt because of network issues, so kcat_MW could not be calculated for the reaction associated with that protein.
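
This matches the debug output above, where b1377 is absent from protein_id_mass_mapping and reaction_mw is empty. One way to catch this before the kcat_MW lookup fails is to check the mass mapping first. The following is only a minimal sketch, not part of ECMpy or AutoPACMEN: find_genes_missing_mass is a hypothetical helper, the mapping is assumed to be a plain dict of gene ID to molecular weight, and the example values are made up for illustration.

```python
def find_genes_missing_mass(gene_ids, protein_id_mass_mapping):
    """Return the gene IDs that have no usable molecular weight entry.

    Hypothetical helper, not an ECMpy function. Assumes the mapping is a
    plain dict of gene ID -> molecular weight.
    """
    missing = []
    for gene_id in gene_ids:
        mass = protein_id_mass_mapping.get(gene_id)
        # Treat absent, None, or non-positive masses as "not retrieved".
        if mass is None or mass <= 0:
            missing.append(gene_id)
    return missing


# Illustrative data only: "gene_a" and its mass are invented for the example;
# b1377 is the gene reported as missing in the debug output above.
example_mapping = {"gene_a": 50000.0}
missing = find_genes_missing_mass(["gene_a", "b1377"], example_mapping)
if missing:
    print("No molecular weight retrieved for:", missing)
```

If the returned list is not empty, rerunning the UniProt retrieval step on a stable connection before calling get_reaction_kcat_mw is probably the simplest thing to try.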