compomics / ms2rescore

Modular and user-friendly platform for AI-assisted rescoring of peptide identifications
https://ms2rescore.readthedocs.io
Apache License 2.0

Peprec modification error: Multiple modifications per site not supported in Peptide Record format. #108

Open NicolasProvencher opened 7 months ago

NicolasProvencher commented 7 months ago

Hi, I am trying to use MS2Rescore to rescore the output I get from a SearchGUI / PeptideShaker analysis. Here is the MS2Rescore log in debug mode:

Running DeepLC for PSMs from run (1/1): `586APMS_FLAG_alt1`...
Calibrating DeepLC...
Using 5135 PSMs for calibration
Multiple modifications per site not supported in Peptide Record format.
Traceback (most recent call last):
  File "ms2rescore\gui\function2ctk.py", line 301, in run
    self.fn(*self.fn_args, **self.fn_kwargs)
  File "ms2rescore\gui\app.py", line 637, in function
    rescore(configuration=config)
  File "ms2rescore\core.py", line 76, in rescore
    fgen.add_features(psm_list)
  File "ms2rescore\feature_generators\deeplc.py", line 163, in add_features
    seq_df=self._psm_list_to_deeplc_peprec(psm_list_calibration)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "ms2rescore\feature_generators\deeplc.py", line 210, in _psm_list_to_deeplc_peprec
    peprec = peptide_record.to_dataframe(psm_list)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "psm_utils\io\peptide_record.py", line 505, in to_dataframe
    return pd.DataFrame([PeptideRecordWriter._psm_to_entry(psm) for psm in psm_list])
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "psm_utils\io\peptide_record.py", line 505, in <listcomp>
    return pd.DataFrame([PeptideRecordWriter._psm_to_entry(psm) for psm in psm_list])
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "psm_utils\io\peptide_record.py", line 285, in _psm_to_entry
    sequence, modifications, charge = proforma_to_peprec(psm.peptidoform)
                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "psm_utils\io\peptide_record.py", line 443, in proforma_to_peprec
    ms2pip_mods.append(_mod_to_ms2pip(mod, i + 1))
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "psm_utils\io\peptide_record.py", line 433, in _mod_to_ms2pip
    raise InvalidPeprecModificationError(
psm_utils.io.peptide_record.InvalidPeprecModificationError: Multiple modifications per site not supported in Peptide Record format.

In SearchGUI, I used ThermoRawFileParser to parse my raw file, with X!Tandem and MS-GF+ as search algorithms, followed by PeptideShaker. In PeptideShaker, I used Export -> PeptideShaker Project As -> mzIdentML (including protein sequences) to produce the mzid file I am using in MS2Rescore.

The modifications used are Carbamidomethyl of C as fixed, and Oxidation and Acetylation as variable.

Percolator was selected as the rescoring engine, but I don't think the run got to that step.

I parsed the peptides in the mzid file to make sure that, within the same peptide, every modification has a different location.
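A check like the one described above can be sketched with the Python standard library alone. This is a minimal sketch, not the poster's actual script: `find_multi_mod_sites` and the `SAMPLE` document are hypothetical, and the code only assumes the mzIdentML layout in which each `Peptide` element holds `Modification` children with a `location` attribute.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def local_name(tag):
    """Strip the XML namespace, e.g. '{ns}Peptide' -> 'Peptide'."""
    return tag.rsplit("}", 1)[-1]


def find_multi_mod_sites(mzid_source):
    """Return {peptide id: [locations carrying more than one Modification]}."""
    conflicts = {}
    for _, elem in ET.iterparse(mzid_source, events=("end",)):
        if local_name(elem.tag) != "Peptide":
            continue
        counts = Counter(
            child.get("location", "?")
            for child in elem
            if local_name(child.tag) == "Modification"
        )
        duplicated = sorted(loc for loc, n in counts.items() if n > 1)
        if duplicated:
            conflicts[elem.get("id")] = duplicated
        elem.clear()  # free memory; real mzid files can be large
    return conflicts


# Hypothetical two-peptide document, for illustration only.
SAMPLE = b"""<MzIdentML xmlns="http://psidev.info/psi/pi/mzIdentML/1.1">
  <SequenceCollection>
    <Peptide id="pep_ok">
      <PeptideSequence>ACDM</PeptideSequence>
      <Modification location="2"/>
      <Modification location="4"/>
    </Peptide>
    <Peptide id="pep_bad">
      <PeptideSequence>ACDM</PeptideSequence>
      <Modification location="2"/>
      <Modification location="2"/>
    </Peptide>
  </SequenceCollection>
</MzIdentML>"""

print(find_multi_mod_sites(io.BytesIO(SAMPLE)))
```

Passing a file path instead of `io.BytesIO(SAMPLE)` scans a file on disk. Note that a check on the file alone only sees modifications written in the mzid; it would not see any that are added later in the workflow.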

Here is a link to a Google Drive folder containing the mzid, the mzML, and the FASTA file I used: https://drive.google.com/drive/folders/1DUUn7fyeJR2rgIze2dQQ3r4ikcyfwiLI?usp=sharing

Here is the config.json file generated by MS2Rescore:

`{
    "$schema": "./config_schema.json",
    "ms2rescore": {
        "feature_generators": {
            "basic": {},
            "ms2pip": {
                "model": "HCD2021",
                "ms2_tolerance": 0.02
            },
            "deeplc": {
                "deeplc_retrain": false,
                "n_epochs": 20,
                "calibration_set_size": 0.15
            }
        },
        "rescoring_engine": {
            "percolator": {
                "write_weights": true,
                "write_txt": true,
                "write_flashlfq": false,
                "protein_kwargs": {}
            }
        },
        "config_file": null,
        "psm_file": [
            "C:/Users/pron2107/Desktop/mspipeline/test1/report/test1.mzid"
        ],
        "psm_file_type": "mzid",
        "psm_reader_kwargs": {},
        "spectrum_path": "C:/Users/pron2107/Desktop/mspipeline/test1/586APMS_FLAG_alt1.mzML",
        "output_path": "C:/Users/pron2107/Desktop/mspipeline/test1/ms2rescore/test1",
        "log_level": "info",
        "id_decoy_pattern": "_REVERSED",
        "psm_id_pattern": ".*scan=(\\d+)$",
        "spectrum_id_pattern": ".*scan=(\\d+)$",
        "lower_score_is_better": false,
        "modification_mapping": {
            "Oxidation": "U:Oxidation",
            "Acetyl": "U:Acetylation"
        },
        "fixed_modifications": {
            "U:Carbamidomethyl": [
                "C"
            ]
        },
        "processes": 12,
        "rename_to_usi": false,
        "fasta_file": "C:/Users/pron2107/Desktop/mspipeline/test1/human-openprot-2_0-refprots+altprots+isoforms-min_2_pep-uniprot2022_06_011_concatenated_target_decoy.fasta",
        "write_report": true
    }
}`
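The `psm_id_pattern` and `spectrum_id_pattern` entries in the configuration above are regular expressions with a single capture group. The following is a minimal sketch of what `.*scan=(\d+)$` captures, using only Python's `json` and `re` modules. The identifier strings are hypothetical examples, not values taken from the attached files, and `extract_scan` is an illustrative helper rather than an MS2Rescore function.

```python
import json
import re

# Same value as in the configuration; JSON decodes "\\d" to the regex token \d.
config_fragment = r'{"psm_id_pattern": ".*scan=(\\d+)$"}'
pattern = re.compile(json.loads(config_fragment)["psm_id_pattern"])


def extract_scan(identifier):
    """Return the captured scan number as a string, or None if nothing matches."""
    match = pattern.match(identifier)
    return match.group(1) if match else None


# Hypothetical identifiers for illustration.
print(extract_scan("controllerType=0 controllerNumber=1 scan=1234"))
print(extract_scan("index=17"))
```

If the same pattern is applied to both the PSM file and the spectrum file, the captured groups have to agree for identifiers to line up, so trying the pattern on a few real identifiers from each file is a cheap sanity check.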

After all that, I am wondering whether I am doing anything wrong with my settings.

Thanks in advance

Nicolas

RalfG commented 7 months ago

Hi Nicolas,

Apologies for the delay in getting back to you. Your issue seems to be a combination of two problems:

Let me know if it works or not.

Best, Ralf

NicolasProvencher commented 7 months ago

It works, but I get another error:

Extracting CNN features
Time to calculate all features: 2.843510389328003 seconds
got feature extraction results
Creating converter from 3 to 5
'NoneType' object has no attribute 'write'
Traceback (most recent call last):
  File "ms2rescore\gui\function2ctk.py", line 301, in run
    self.fn(*self.fn_args, **self.fn_kwargs)
  File "ms2rescore\gui\app.py", line 637, in function
    rescore(configuration=config)
  File "ms2rescore\core.py", line 76, in rescore
    fgen.add_features(psm_list)
  File "ms2rescore\feature_generators\deeplc.py", line 162, in add_features
    self.deeplc_predictor.calibrate_preds(
  File "deeplc\deeplc.py", line 1005, in calibrate_preds
    calibrate_output = self.calibrate_preds_func_pygam(
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "deeplc\deeplc.py", line 712, in calibrate_preds_func_pygam
    predicted_tr = self.make_preds(
                   ^^^^^^^^^^^^^^^^
  File "deeplc\deeplc.py", line 667, in make_preds
    ret_preds = self.make_preds_core(X=X,
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "deeplc\deeplc.py", line 578, in make_preds_core
    ret_preds = mod.predict(
                ^^^^^^^^^^^^
  File "keras\utils\traceback_utils.py", line 70, in error_handler
  File "keras\utils\traceback_utils.py", line 67, in error_handler
  File "keras\engine\training.py", line 2403, in predict
  File "keras\callbacks.py", line 519, in on_predict_batch_end
  File "keras\callbacks.py", line 322, in _call_batch_hook
  File "keras\callbacks.py", line 345, in _call_batch_end_hook
  File "keras\callbacks.py", line 393, in _call_batch_hook_helper
  File "keras\callbacks.py", line 1101, in on_predict_batch_end
  File "keras\callbacks.py", line 1170, in _batch_update_progbar
  File "keras\utils\generic_utils.py", line 296, in update
  File "keras\utils\io_utils.py", line 79, in print_msg
AttributeError: 'NoneType' object has no attribute 'write'

This time I don't have any clue where to start looking.
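The thread does not establish the cause of this error. One possibility worth checking, stated here as an assumption: a GUI executable built without a console can run with `sys.stdout` set to `None`, and the traceback ends in Keras code that prints a progress bar. The sketch below reproduces the same message with a `None` stream and shows a generic guard. `write_progress` and `ensure_std_streams` are hypothetical helpers, not part of MS2Rescore, DeepLC, or Keras.

```python
import io
import sys


def write_progress(stream, message="1/1 [====] - 0s\n"):
    """Write a progress line; return 'ok' or the AttributeError message."""
    try:
        stream.write(message)
        return "ok"
    except AttributeError as exc:
        return str(exc)


def ensure_std_streams():
    """Replace missing stdout/stderr with in-memory sinks; return the names replaced."""
    replaced = []
    for name in ("stdout", "stderr"):
        if getattr(sys, name, None) is None:
            setattr(sys, name, io.StringIO())
            replaced.append(name)
    return replaced


# A None stream fails with the same message as in the traceback above.
print(write_progress(None))
# A real stream-like object accepts the write.
print(write_progress(io.StringIO()))
# In an ordinary console session nothing needs replacing.
print(ensure_std_streams())
```

If this assumption holds, calling a guard like `ensure_std_streams()` before prediction, or silencing the progress output, would be two directions to test. Neither is confirmed as a fix here.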

RalfG commented 2 months ago

Picking up on this issue again. Apologies for the long delay.

@RobbinBouwmeester, any ideas?

@NicolasProvencher, could you check whether updating DeepLC fixes the issue (`pip install -U deeplc`)? Thanks!

NicolasProvencher commented 2 months ago

Hi Ralf, hope you are doing well.

First, since I was testing your Windows installer to write a protocol for GUI users, tools like pip are not used.

Now, I just tried reinstalling the .exe for version 3.0.3, and the program won't open after installation, so I can't retest everything and give you more information. Here is the error I encounter right now:

[image: screenshot of the error]

If I scroll to the end, I see:

Traceback (most recent call last):
  File "ms2rescore\gui\__main__.py", line 7, in <module>
    from ms2rescore.gui.app import app
  File "PyInstaller\loader\pyimod02_importers.py", line 419, in exec_module
  File "ms2rescore\__init__.py", line 16, in <module>
    from ms2rescore.core import rescore  # noqa: F401 E402
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "PyInstaller\loader\pyimod02_importers.py", line 419, in exec_module
  File "ms2rescore\core.py", line 9, in <module>
    from ms2rescore.feature_generators import FEATURE_GENERATORS
  File "PyInstaller\loader\pyimod02_importers.py", line 419, in exec_module
  File "ms2rescore\feature_generators\__init__.py", line 9, in <module>
    from ms2rescore.feature_generators.ms2pip import MS2PIPFeatureGenerator
  File "PyInstaller\loader\pyimod02_importers.py", line 419, in exec_module
  File "ms2rescore\feature_generators\ms2pip.py", line 34, in <module>
    from ms2pip import correlate
  File "PyInstaller\loader\pyimod02_importers.py", line 419, in exec_module
  File "ms2pip\__init__.py", line 13, in <module>
    from ms2pip.core import (  # noqa: F401 E402
  File "PyInstaller\loader\pyimod02_importers.py", line 419, in exec_module
  File "ms2pip\core.py", line 25, in <module>
    from ms2pip._utils.xgb_models import get_predictions_xgb, validate_requested_xgb_model
  File "PyInstaller\loader\pyimod02_importers.py", line 419, in exec_module
  File "ms2pip\_utils\xgb_models.py", line 10, in <module>
    import xgboost as xgb
  File "PyInstaller\loader\pyimod02_importers.py", line 419, in exec_module
  File "xgboost\__init__.py", line 7, in <module>
    from . import collective, dask, rabit
  File "PyInstaller\loader\pyimod02_importers.py", line 419, in exec_module
  File "xgboost\dask.py", line 77, in <module>
    from .sklearn import (
  File "PyInstaller\loader\pyimod02_importers.py", line 419, in exec_module
  File "xgboost\sklearn.py", line 22, in <module>
    from scipy.special import softmax
  File "PyInstaller\loader\pyimod02_importers.py", line 419, in exec_module
  File "scipy\special\__init__.py", line 777, in <module>
  File "scipy\\special\\_ufuncs.pyx", line 1, in init scipy.special._ufuncs
ModuleNotFoundError: No module named 'scipy.special._cdflib'
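The last line of this traceback names a module that the packaged application could not find. As a rough sketch, the helper below reports which modules from a list cannot be located in the environment where it runs. `missing_modules` is a hypothetical name, and the result in an ordinary Python environment may differ from what a frozen executable sees, so this only helps compare the two.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of module names that cannot be located."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package is absent or a spec is malformed.
            found = False
        if not found:
            missing.append(name)
    return missing


# 'json' is in the standard library; the second name is the module
# reported as missing in the traceback above.
print(missing_modules(["json", "scipy.special._cdflib"]))
```

If the module is found in a regular environment with the same SciPy version but not inside the installed application, that would point to the packaging step rather than to the user's data or settings.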

If you'd like me to open a separate issue for this particular problem, just ask.

If testing and fixing the Windows GUI version isn't a priority right now, I am OK to wait until it is.

Thanks a lot

RalfG commented 2 months ago

Hi @NicolasProvencher,

Thanks for reporting! I opened a new issue here: #145

Best, Ralf

NicolasProvencher commented 1 month ago

Hi, I am following up on the error we get while trying to run DeepLC through the Windows interface. We redid a run to see whether the updates released since the original post fixed the problem. It seems they did not; I have included the traceback we get when it crashes.

```
Calibrating DeepLC...
Using 6036 PSMs for calibration
Start to calibrate predictions ...
Ready to find the best model out of: ['C:\\Users\\jacj2401\\AppData\\Local\\Programs\\MS2Rescore\\_internal\\deeplc\\mods/full_hc_PXD005573_mcp_1fd8363d9af9dcad3be7553c39396960.hdf5']
Trying out the following model: C:\Users\jacj2401\AppData\Local\Programs\MS2Rescore\_internal\deeplc\mods/full_hc_PXD005573_mcp_1fd8363d9af9dcad3be7553c39396960.hdf5
Extracting features for the CNN model ...
prepare feature extraction
start feature extraction
wait for feature extraction
get feature extraction results
got feature extraction results
Creating converter from 3 to 5
'NoneType' object has no attribute 'write'
Traceback (most recent call last):
  File "ms2rescore\gui\function2ctk.py", line 301, in run
    self.fn(*self.fn_args, **self.fn_kwargs)
  File "ms2rescore\gui\app.py", line 638, in function
    rescore(configuration=config)
  File "ms2rescore\core.py", line 80, in rescore
    fgen.add_features(psm_list)
  File "ms2rescore\feature_generators\deeplc.py", line 162, in add_features
    self.deeplc_predictor.calibrate_preds(
  File "deeplc\deeplc.py", line 1173, in calibrate_preds
    calibrate_output = self.calibrate_preds_func_pygam(
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "deeplc\deeplc.py", line 830, in calibrate_preds_func_pygam
    predicted_tr = self.make_preds(psm_list, calibrate=False, mod_name=mod_name)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "deeplc\deeplc.py", line 750, in make_preds
    ret_preds = self.make_preds_core(
                ^^^^^^^^^^^^^^^^^^^^^
  File "deeplc\deeplc.py", line 621, in make_preds_core
    ret_preds = mod.predict(
                ^^^^^^^^^^^^
  File "keras\utils\traceback_utils.py", line 70, in error_handler
  File "keras\utils\traceback_utils.py", line 67, in error_handler
  File "keras\engine\training.py", line 2403, in predict
  File "keras\callbacks.py", line 519, in on_predict_batch_end
  File "keras\callbacks.py", line 322, in _call_batch_hook
  File "keras\callbacks.py", line 345, in _call_batch_end_hook
  File "keras\callbacks.py", line 393, in _call_batch_hook_helper
  File "keras\callbacks.py", line 1101, in on_predict_batch_end
  File "keras\callbacks.py", line 1170, in _batch_update_progbar
  File "keras\utils\generic_utils.py", line 296, in update
  File "keras\utils\io_utils.py", line 79, in print_msg
AttributeError: 'NoneType' object has no attribute 'write'
```

If the Windows version of MS2Rescore isn't a priority right now, feel free to lower the priority of this issue, since we are not actively trying to use it at the moment.

Best, Nicolas