kusterlab / prosit

Prosit offers high-quality predicted MS2 spectra for any organism and protease, as well as iRT prediction. If Prosit is helpful for your research, please cite "Gessulat, Schmidt et al. 2019", DOI 10.1038/s41592-019-0426-7
https://www.proteomicsdb.org/prosit/
Apache License 2.0

TMT model files not compatible with Prosit code provided here #100

Open cctsou opened 1 year ago

cctsou commented 1 year ago

I was able to build the server with the TMT model files, but I then encountered the error below when running a small peptide list supplied as a CSV file:

    modified_sequence,collision_energy,precursor_charge,fragmentation
    ALNNLPALQAM(ox)TLALNR,35,2,HCD
    EAAALLDDCIFNM(ox)VLLK,35,3,CID
    DPLSSYNIIAWDWNGPK,35,2,HCD
    KTDCCILSALLFQGLLR,35,3,CID

Error message:

    [2023-02-22 18:35:16,739] ERROR in app: Exception on /predict/msp [POST]
    Traceback (most recent call last):
      File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 2447, in wsgi_app
        response = self.full_dispatch_request()
      File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 1952, in full_dispatch_request
        rv = self.handle_user_exception(e)
      File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 1821, in handle_user_exception
        reraise(exc_type, exc_value, tb)
      File "/usr/local/lib/python3.5/dist-packages/flask/_compat.py", line 39, in reraise
        raise value
      File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 1950, in full_dispatch_request
        rv = self.dispatch_request()
      File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 1936, in dispatch_request
        return self.view_functions[rule.endpoint](**req.view_args)
      File "/root/prosit/server.py", line 51, in return_msp
        result = predict(flask.request.files["peptides"])
      File "/root/prosit/server.py", line 29, in predict
        data = prediction.predict(data, d_spectra)
      File "/root/prosit/prediction.py", line 13, in predict
        x = io_local.get_array(data, d_model["config"]["x"])
      File "/root/prosit/io_local.py", line 5, in get_array
        utils.check_mandatory_keys(tensor, keys)
      File "/root/prosit/utils.py", line 7, in check_mandatory_keys
        raise KeyError("key {} is missing".format(key))
    KeyError: 'key fragmentation is missing'
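The last two frames of the traceback suggest that `utils.check_mandatory_keys` walks over the input names declared in the model config and raises as soon as one of them is absent from the parsed data. A minimal sketch of that behaviour, reconstructed only from the traceback: the function body, the `required` list, and the `parsed` dictionary are assumptions for illustration, not the repository's code.

```python
def check_mandatory_keys(dictionary, keys):
    """Raise KeyError for the first required key that is absent from `dictionary`."""
    for key in keys:
        if key not in dictionary:
            raise KeyError("key {} is missing".format(key))
    return True


# Assumed input names for a TMT model; the real list comes from the model's config.
required = [
    "sequence_integer",
    "precursor_charge_onehot",
    "collision_energy_aligned_normed",
    "fragmentation",
]

# Stand-in for what a parser without fragmentation support returns.
parsed = {
    "sequence_integer": [],
    "precursor_charge_onehot": [],
    "collision_energy_aligned_normed": [],
}

try:
    check_mandatory_keys(parsed, required)
    message = None
except KeyError as err:
    message = str(err)

print(message)
```

If this reading is right, the check only passes once the parser puts a `fragmentation` entry into the dictionary it returns.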

I believe the error occurs because "fragmentation" is never parsed from the input data frame. I tried adding "fragmentation" to the CSV parsing function below, but I do not know how fragmentation is encoded. Could you please help? Could you provide the Prosit code that is fully compatible with the TMT models you provided?

    def csv(df):
        df.reset_index(drop=True, inplace=True)
        assert "modified_sequence" in df.columns
        assert "collision_energy" in df.columns
        assert "precursor_charge" in df.columns
        data = {
            "collision_energy_aligned_normed": get_numbers(df.collision_energy) / 100.0,
            "sequence_integer": get_sequence_integer(df.modified_sequence),
            "fragmentation": df.fragmentation,
            "precursor_charge_onehot": get_precursor_charge_onehot(df.precursor_charge),
            "masses_pred": get_mz_applied(df),
        }
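For illustration, here is one way the "fragmentation" column could be converted to a numeric array instead of being passed through as raw strings. The helper name `get_fragmentation_codes`, the integer codes, and the one-value-per-peptide shape are all hypothetical. The encoding the TMT models actually expect is exactly what this issue asks about and would need to be confirmed by the maintainers.

```python
# Hypothetical mapping: the codes the TMT models were trained with are not known here.
FRAGMENTATION_CODES = {"CID": 1, "HCD": 2}


def get_fragmentation_codes(methods):
    """Map fragmentation method names to integer codes, one single-element row per peptide."""
    codes = []
    for method in methods:
        name = str(method).strip().upper()
        if name not in FRAGMENTATION_CODES:
            raise ValueError("unknown fragmentation method: {!r}".format(method))
        codes.append([FRAGMENTATION_CODES[name]])
    return codes


# The four fragmentation values from the example peptide list.
print(get_fragmentation_codes(["HCD", "CID", "HCD", "CID"]))
```

Under these assumptions, the result of `get_fragmentation_codes(df.fragmentation)` would take the place of the raw `df.fragmentation` entry in `data`. Rejecting unknown method names early avoids silently feeding the model an unsupported value.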

Originally posted by @cctsou in https://github.com/kusterlab/prosit/issues/84#issuecomment-1440602995