compomics / DeepLC

DeepLC: Retention time prediction for (modified) peptides using Deep Learning.
https://iomics.ugent.be/deeplc
Apache License 2.0

Error during calibration using modified peptides #56

Closed steffenlem closed 1 year ago

steffenlem commented 1 year ago

Hi, we tried to calibrate DeepLC using a list of peptides, some of which had modifications.

The modification "Label:13C(5)15N(1)" caused the crash. It is defined in Unimod as well as in your unimod_to_formula.csv.

Traceback (most recent call last):
  File "/home/user/deeplc/deeplc_cli.py", line 333, in <module>
    sys.exit(main())
  File "/home-link/user/mambaforge/envs/deeplc/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home-link/user/mambaforge/envs/deeplc/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home-link/user/mambaforge/envs/deeplc/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home-link/user/mambaforge/envs/deeplc/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/user/deeplc/deeplc_cli.py", line 304, in main
    df_deeplc_output = run_deeplc(df_deeplc_input, calibration_df)
  File "/home/user/deeplc/deeplc_cli.py", line 189, in run_deeplc
    dlc.calibrate_preds(seq_df=calibration_df)
  File "/home-link/user/mambaforge/envs/deeplc/lib/python3.10/site-packages/deeplc/deeplc.py", line 986, in calibrate_preds
    calibrate_output = self.calibrate_preds_func_pygam(
  File "/home-link/user/mambaforge/envs/deeplc/lib/python3.10/site-packages/deeplc/deeplc.py", line 697, in calibrate_preds_func_pygam
    predicted_tr = self.make_preds(
  File "/home-link/user/mambaforge/envs/deeplc/lib/python3.10/site-packages/deeplc/deeplc.py", line 633, in make_preds
    X = self.do_f_extraction_psm_list_parallel(psm_list)
  File "/home-link/user/mambaforge/envs/deeplc/lib/python3.10/site-packages/deeplc/deeplc.py", line 442, in do_f_extraction_psm_list_parallel
    all_feats = self.do_f_extraction_psm_list(psm_list)
  File "/home-link/user/mambaforge/envs/deeplc/lib/python3.10/site-packages/deeplc/deeplc.py", line 408, in do_f_extraction_psm_list
    return self.f_extractor.full_feat_extract(psm_list)
  File "/home-link/user/mambaforge/envs/deeplc/lib/python3.10/site-packages/deeplc/feat_extractor.py", line 664, in full_feat_extract
    X_cnn = self.encode_atoms( # X_sum, X_cnn_pos, X_cnn_count, X_hc
  File "/home-link/user/mambaforge/envs/deeplc/lib/python3.10/site-packages/deeplc/feat_extractor.py", line 549, in encode_atoms
    matrix[i, dict_index[atom_position_composition]] += atom_change
KeyError: 'C[13]'
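
For readers following the traceback: the failing line looks up each atom of a modification's composition in an index dictionary, and the key 'C[13]' (apparently the 13C isotope from the label) has no entry. The sketch below reproduces that kind of lookup and shows one possible way around it, folding isotope-labelled atoms onto their base element. The dictionary contents, the helper names, and the folding strategy are all assumptions for illustration; the thread does not say how DeepLC actually resolved this.

```python
import re

# Hypothetical atom-to-column index, standing in for the lookup that failed
# in the traceback. The real DeepLC dictionary may contain other keys.
dict_index = {"C": 0, "H": 1, "N": 2, "O": 3, "S": 4, "P": 5}

# Matches a trailing isotope marker such as "[13]" in "C[13]".
ISOTOPE_SUFFIX = re.compile(r"\[\d+\]$")


def base_element(atom: str) -> str:
    """Strip an isotope suffix, e.g. 'C[13]' -> 'C' (assumed notation)."""
    return ISOTOPE_SUFFIX.sub("", atom)


def add_composition(row: list, composition: dict) -> list:
    """Add atom count changes to a feature row, folding isotopes onto base elements."""
    for atom, change in composition.items():
        row[dict_index[base_element(atom)]] += change
    return row


# Composition of Label:13C(5)15N(1) as listed in Unimod: five 12C replaced by 13C,
# one 14N replaced by 15N.
label_composition = {"C": -5, "C[13]": 5, "N": -1, "N[15]": 1}

row = [10, 20, 3, 4, 0, 0]
add_composition(row, label_composition)
```

Under this assumption the label leaves the atom counts unchanged, which matches the fact that an isotope label alters mass but not elemental composition. A direct `dict_index["C[13]"]` lookup, by contrast, raises the `KeyError` shown above.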

This is the data frame we used for the calibration:

seq         modifications           tr
VVKKHIKEL                           592.3
KLSEVNKRL                           893.4
FHHGLGHSL                           999.0
HIKTHELHL                           1109.4
HRTEFYRNL                           1344.6
KLQEKIQEL                           1519.0
LEHEHLIKL                           1624.1
DRHSFLKAL                           1780.7
IEQEQKLAL                           1852.3
HHSLIRISL                           1963.5
TETVHIFKL                           2200.0
IESSDVIRL                           2255.5
TGLIRPVAL                           2392.4
IGDGYVIHL                           2527.9
LPQELKLTL                           2664.9
KLLQFYPSL                           2806.5
DGTVRLWSL                           2967.6
LMLGEFLKL   2|Oxidation             3212.2
SLLSSVFKL                           3263.0
LPQLPLAAL                           3473.7
SYLEDVRLI   6|Label:13C(5)15N(1)    3617.3
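
To make the modifications column concrete, here is a minimal sketch that parses entries of the form `position|name` as they appear in the table above and flags rows carrying an isotope label. The function names are hypothetical, and the handling of several modifications on one peptide (alternating `position|name|position|name`) is an assumption, since the table only shows single modifications.

```python
def parse_modifications(field: str) -> list:
    """Parse '2|Oxidation' into [(2, 'Oxidation')]; an empty field gives []."""
    if not field.strip():
        return []
    parts = field.split("|")
    if len(parts) % 2 != 0:
        raise ValueError(f"expected position|name pairs, got {field!r}")
    # Assumed layout: positions and names alternate within the field.
    return [(int(parts[i]), parts[i + 1]) for i in range(0, len(parts), 2)]


def has_isotope_label(field: str) -> bool:
    """Return True if any modification name starts with 'Label:'."""
    return any(name.startswith("Label:") for _, name in parse_modifications(field))


# Three rows taken from the calibration table above: (seq, modifications, tr).
rows = [
    ("LMLGEFLKL", "2|Oxidation", 3212.2),
    ("SLLSSVFKL", "", 3263.0),
    ("SYLEDVRLI", "6|Label:13C(5)15N(1)", 3617.3),
]

labelled = [seq for seq, mods, _ in rows if has_isotope_label(mods)]
```

On an affected version, a check like this could be used to spot which calibration peptides are likely to trigger the error before running the calibration.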

Best, Steffen

RobbinBouwmeester commented 1 year ago

Hi Steffen,

What version of DeepLC are you running? I think this should not be an issue with the psm_utils integration, but who knows :).

Kind regards,

Robbin

steffenlem commented 1 year ago

This was with version 2.2.0

RobbinBouwmeester commented 1 year ago

OK, I will have a look tomorrow. I have a fix in mind, so I also hope to release a new version with the fix tomorrow.

RobbinBouwmeester commented 1 year ago

This should be fixed as of v2.2.2, so I will close this issue for now. Feel free to reopen it if needed.
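
For anyone landing here later, a small sketch for checking whether an installed DeepLC is at or above the version mentioned in this thread. The helper names are hypothetical, and the distribution name `deeplc` is assumed to match the installed package.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Version named in this thread as containing the fix.
FIXED_IN = (2, 2, 2)


def parse_version(text: str) -> tuple:
    """Extract (major, minor, patch) from strings like '2.2.0' or 'v2.2.2'."""
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def has_fix(text: str) -> bool:
    """Return True if the given version is at least FIXED_IN."""
    return parse_version(text) >= FIXED_IN


def installed_deeplc_has_fix():
    """Check the installed package; returns None if it is not installed."""
    try:
        return has_fix(version("deeplc"))
    except PackageNotFoundError:
        return None
```

With this, `has_fix("2.2.0")` is false for the version reported above, and `has_fix("v2.2.2")` is true.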