compomics / ms2rescore

Modular and user-friendly platform for AI-assisted rescoring of peptide identifications
https://ms2rescore.readthedocs.io
Apache License 2.0

Cannot resolve mass for modification #107

Closed MatteoLacki closed 3 months ago

MatteoLacki commented 7 months ago

Hello,

likely it is something minor, but I simply don't know how to solve it:

I get this error:

MS²Rescore (v3.0.0b5)
Developed at CompOmics, VIB / Ghent University, Belgium.
Please cite: Declercq et al. MCP (2022)

2023-11-10 14:01:30 DEBUG    ms2rescore.core // Using 16 of 16 available CPUs.

                    INFO     ms2rescore.parse_psms // Reading PSMs from file...

                    INFO     ms2rescore.parse_psms // Reading PSMs from PSM file (1/1):

                             `partial/G8027/G8045/MS1@tims@29f95c3131e@default@fast@default@MS2@tims@29f95c3131e@default@fast@d
efault/matcher
                             @prtree@narrow/rough@default/1stSearch@sage@95c2993@p12f15nd/fasta@3/results.sage.filtered@default
.mapback@defau
                             lt/mz_recalibrated_distributions@xgboost@sageConfig@p12f15nd.ppms@c2_c98.fasta@3/results.sage.tsv`
...
2023-11-10 14:01:33 DEBUG    ms2rescore.parse_psms // Finding decoys...

                    INFO     ms2rescore.parse_psms // Found 19479 PSMs, of which 37.81% are decoys.

                    DEBUG    ms2rescore.parse_psms // Parsing modifications...

                    DEBUG    ms2rescore.parse_psms // Found modifications: {'+57.0216', '+15.9949', '+42'}

2023-11-10 14:01:34 INFO     ms2rescore.core // Found 4240 identified PSMs at 1% FDR before rescoring.

                    DEBUG    ms2rescore.core // PSMs already contain the following rescoring features: {'peptide_len', 'hypersc
ore',
                             'delta_rt_model', 'expmass', 'predicted_rt', 'missed_cleavages', 'aligned_rt', 'isotope_error', 'm
s1_intensity',
                             'matched_peaks', 'delta_next', 'matched_intensity_pct', 'fragment_ppm', 'longest_y', 'longest_y_pc
t', 'poisson',
                             'calcmass', 'delta_best', 'precursor_ppm', 'ms2_intensity', 'longest_b', 'scored_candidates'}

                    INFO     ms2rescore.feature_generators.basic // Adding basic features to PSMs.

2023-11-10 14:01:35 ERROR    ms2rescore.__main__ // Cannot resolve mass for modification 4.

                             ╭───────────────────────────────────── Traceback (most recent call last) ──────────────────────────────────────╮
                             │ /home/matteo/Projects/midia/pipelines/dev_paths/midia_pipe/software/ms2rescore/venv_ms2rescore/lib/python3.1 │
                             │ 1/site-packages/psm_utils/peptidoform.py:307 in sequential_theoretical_mass                                  │
                             │                                                                                                              │
                             │   304 │   │   │   if tags:                                                                                   │
                             │   305 │   │   │   │   for tag in tags:                                                                       │
                             │   306 │   │   │   │   │   try:                                                                               │
                             │ ❱ 307 │   │   │   │   │   │   position_mass += tag.mass                                                      │
                             │   308 │   │   │   │   │   except (AttributeError, KeyError) as e:                                            │
                             │   309 │   │   │   │   │   │   raise ModificationException(                                                   │
                             │   310 │   │   │   │   │   │   │   "Cannot resolve mass for modification " f"{tag.value}."                    │
                             │                                                                                                              │
                             │ /home/matteo/Projects/midia/pipelines/dev_paths/midia_pipe/software/ms2rescore/venv_ms2rescore/lib/python3.1 │
                             │ 1/site-packages/pyteomics/proforma.py:710 in mass                                                            │
                             │                                                                                                              │
                             │    707 │   │   Returns                                                                                       │
                             │    708 │   │   -------float                                                                                  │
                             │    709 │   │   '''                                                                                           │
                             │ ❱  710 │   │   return self.definition['mass']                                                                │
                             │    711 │                                                                                                     │
                             │    712 │   @property                                                                                         │
                             │    713 │   def composition(self):                                                                            │
                             │                                                                                                              │
                             │ /home/matteo/Projects/midia/pipelines/dev_paths/midia_pipe/software/ms2rescore/venv_ms2rescore/lib/python3.1 │
                             │ 1/site-packages/pyteomics/proforma.py:700 in definition                                                      │
                             │                                                                                                              │
                             │    697 │   │   dict                                                                                          │
                             │    698 │   │   '''                                                                                           │
                             │    699 │   │   if self._definition is None:                                                                  │
                             │ ❱  700 │   │   │   self._definition = self.resolve()                                                         │
                             │    701 │   │   return self._definition                                                                       │
                             │    702 │                                                                                                     │
                             │    703 │   @property                                                                                         │
                             │                                                                                                              │
                             │ /home/matteo/Projects/midia/pipelines/dev_paths/midia_pipe/software/ms2rescore/venv_ms2rescore/lib/python3.1 │
                             │ 1/site-packages/pyteomics/proforma.py:758 in resolve                                                         │
                             │                                                                                                              │
                             │    755 │   │   '''Find the term and return it's properties                                                   │
                             │    756 │   │   '''                                                                                           │
                             │    757 │   │   keys = self.resolver.parse_identifier(self.value)                                             │
                             │ ❱  758 │   │   return self.resolver(*keys)                                                                   │
                             │    759                                                                                                       │
                             │    760                                                                                                       │
                             │    761 class FormulaModification(ModificationBase):                                                          │
                             │                                                                                                              │
                             │ /home/matteo/Projects/midia/pipelines/dev_paths/midia_pipe/software/ms2rescore/venv_ms2rescore/lib/python3.1 │
                             │ 1/site-packages/pyteomics/proforma.py:344 in __call__                                                        │
                             │                                                                                                              │
                             │    341 │   │   raise NotImplementedError()                                                                   │
                             │    342 │                                                                                                     │
                             │    343 │   def __call__(self, name=None, id=None, **kwargs):                                                 │
                             │ ❱  344 │   │   return self.resolve(name, id, **kwargs)                                                       │
                             │    345 │                                                                                                     │
                             │    346 │   def __eq__(self, other):                                                                          │
                             │    347 │   │   return self.name == other.name                                                                │
                             │                                                                                                              │
                             │ /home/matteo/Projects/midia/pipelines/dev_paths/midia_pipe/software/ms2rescore/venv_ms2rescore/lib/python3.1 │
                             │ 1/site-packages/pyteomics/proforma.py:386 in resolve                                                         │
                             │                                                                                                              │
                             │    383 │   │   │   if not defn:                                                                              │
                             │    384 │   │   │   │   raise KeyError(name)                                                                  │
                             │    385 │   │   elif id is not None:                                                                          │
                             │ ❱  386 │   │   │   defn = self.database.by_id(id)                                                            │
                             │    387 │   │   │   if not defn:                                                                              │
                             │    388 │   │   │   │   raise KeyError(id)                                                                    │
                             │    389 │   │   else:                                                                                         │
                             ╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
                             AttributeError: 'Unimod' object has no attribute 'by_id'

                             The above exception was the direct cause of the following exception:

                             ╭───────────────────────────────────── Traceback (most recent call last) ──────────────────────────────────────╮
                             │ /home/matteo/Projects/midia/pipelines/dev_paths/midia_pipe/software/ms2rescore/venv_ms2rescore/lib/python3.1 │
                             │ 1/site-packages/ms2rescore/__main__.py:178 in main                                                           │
                             │                                                                                                              │
                             │   175 │                                                                                                      │
                             │   176 │   # Run MS²Rescore                                                                                   │
                             │   177 │   try:                                                                                               │
                             │ ❱ 178 │   │   rescore(configuration=config)                                                                  │
                             │   179 │   except Exception as e:                                                                             │
                             │   180 │   │   LOGGER.exception(e)                                                                            │
                             │   181 │   │   sys.exit(1)                                                                                    │
                             │                                                                                                              │
                             │ /home/matteo/Projects/midia/pipelines/dev_paths/midia_pipe/software/ms2rescore/venv_ms2rescore/lib/python3.1 │
                             │ 1/site-packages/ms2rescore/core.py:76 in rescore                                                             │
                             │                                                                                                              │
                             │    73 │   │   conf = config.copy()                                                                           │
                             │    74 │   │   conf.update(fgen_config)                                                                       │
                             │    75 │   │   fgen = FEATURE_GENERATORS[fgen_name](**conf)                                                   │
                             │ ❱  76 │   │   fgen.add_features(psm_list)                                                                    │
                             │    77 │   │   logger.debug(f"Adding features from {fgen_name}: {set(fgen.feature_names)}")                   │
                             │    78 │   │   feature_names[fgen_name] = set(fgen.feature_names)                                             │
                             │    79                                                                                                        │
                             │                                                                                                              │
                             │ /home/matteo/Projects/midia/pipelines/dev_paths/midia_pipe/software/ms2rescore/venv_ms2rescore/lib/python3.1 │
                             │ 1/site-packages/ms2rescore/feature_generators/basic.py:70 in add_features                                    │
                             │                                                                                                              │
                             │    67 │   │   │   self._feature_names.extend(["charge_n"] + one_hot_names)                                   │
                             │    68 │   │                                                                                                  │
                             │    69 │   │   if has_mz:  # Charge also required for theoretical m/z                                         │
                             │ ❱  70 │   │   │   theo_mz = np.array([psm.peptidoform.theoretical_mz for psm in psm_list])                   │
                             │    71 │   │   │   abs_ms1_error_ppm = np.abs((precursor_mzs - theo_mz) / theo_mz * 10**6)                    │
                             │    72 │   │   │   self._feature_names.append("abs_ms1_error_ppm")                                            │
                             │    73                                                                                                        │
                             │                                                                                                              │
                             │ /home/matteo/Projects/midia/pipelines/dev_paths/midia_pipe/software/ms2rescore/venv_ms2rescore/lib/python3.1 │
                             │ 1/site-packages/ms2rescore/feature_generators/basic.py:70 in <listcomp>                                      │
                             │                                                                                                              │
                             │    67 │   │   │   self._feature_names.extend(["charge_n"] + one_hot_names)                                   │
                             │    68 │   │                                                                                                  │
                             │    69 │   │   if has_mz:  # Charge also required for theoretical m/z                                         │
                             │ ❱  70 │   │   │   theo_mz = np.array([psm.peptidoform.theoretical_mz for psm in psm_list])                   │
                             │    71 │   │   │   abs_ms1_error_ppm = np.abs((precursor_mzs - theo_mz) / theo_mz * 10**6)                    │
                             │    72 │   │   │   self._feature_names.append("abs_ms1_error_ppm")                                            │
                             │    73                                                                                                        │
                             │                                                                                                              │
                             │ /home/matteo/Projects/midia/pipelines/dev_paths/midia_pipe/software/ms2rescore/venv_ms2rescore/lib/python3.1 │
                             │ 1/site-packages/psm_utils/peptidoform.py:374 in theoretical_mz                                               │
                             │                                                                                                              │
                             │   371 │   │                                                                                                  │
                             │   372 │   │   """                                                                                            │
                             │   373 │   │   if self.precursor_charge:                                                                      │
                             │ ❱ 374 │   │   │   return mass_to_mz(self.theoretical_mass, self.precursor_charge)                            │
                             │   375 │   │   else:                                                                                          │
                             │   376 │   │   │   return None                                                                                │
                             │   377                                                                                                        │
                             │                                                                                                              │
                             │ /home/matteo/Projects/midia/pipelines/dev_paths/midia_pipe/software/ms2rescore/venv_ms2rescore/lib/python3.1 │
                             │ 1/site-packages/psm_utils/peptidoform.py:340 in theoretical_mass                                             │
                             │                                                                                                              │

with the following config:

{
    "$schema": "./config_schema.json",
    "ms2rescore": {
        "feature_generators": {
            "basic": {},
            "ms2pip": {
                "model": "timsTOF",
                "ms2_tolerance": 0.01
            },
            "deeplc": {
                "deeplc_retrain": true
            },
            "maxquant": {}
        },
        "rescoring_engine": {
            "mokapot": {
                "write_weights": true,
                "write_txt": true,
                "write_flashlfq": true
            }
        },
        "psm_reader_kwargs":{
        },
        "config_file": null,
        "psm_file": null,
        "psm_file_type": "infer",
        "spectrum_path": null,
        "output_path": null,
        "log_level": "info",
        "id_decoy_pattern": null,
        "psm_id_pattern": null,
        "spectrum_id_pattern": null,
        "lower_score_is_better": false,
        "modification_mapping": {
            "+15.9949":"U:35",
            "+57.0216":"U:4",
            "+42":"U:1"
        },
        "fixed_modifications": {},
        "processes": -1,
        "rename_to_usi": false,
        "fasta_file": null,
        "write_report": true
    }
}

This is version 3.0.0b5, fresh from GitHub.

Any ideas?

Best wishes,

Matteo

MatteoLacki commented 7 months ago

Just a follow-up: the same error occurs on v3.0.0-b1.

MatteoLacki commented 7 months ago

The matter was resolved by using precisely the m/z entries from unimod, i.e. by replacing:

"modification_mapping": { "+15.9949":"U:35", "+57.0216":"U:4", "+42":"U:1" },

with:

"modification_mapping": { "+15.994915":"U:35", "+57.021464":"U:4", "+42.010565":"U:1" },

MatteoLacki commented 7 months ago

I mean, it resolved my problem, but it sure still looks fishy to me.

MatteoLacki commented 7 months ago

Although I got results, I was getting a lot of errors in the log that I had not noticed. The issue is still valid.

I am trying to parse Sage output. We had a pair-coding session with @theGreatHerrLebert in which we modified results.sage.tsv to directly contain proper Unimod codes, but this did not work either.

I think we need some more documentation on how to deal with the issue.

Best wishes,

RalfG commented 7 months ago

Hi Matteo,

I think (just a hunch, I would have to check) that your second attempt, using the exact Unimod masses, simply does not rename the modifications, so they remain unresolvable to DeepLC.

In terms of the actual issue, it might have something to do with recent changes in pyteomics.proforma (https://github.com/levitsky/pyteomics/pull/129), but I would have to verify. Could you perhaps send me (a section of) the PSM file you are using?

Thanks! Ralf

MatteoLacki commented 7 months ago

Hello Ralf,

if you mean the Sage output, sure: here are the first 99 entries plus the header.

Best!

RalfG commented 7 months ago

"The matter was resolved by using precisely the m/z entries from unimod"

That's definitely strange. The keys in the mapping should exactly match what is in the PSM file, so in this case what is reported by Sage.
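To illustrate the exact-match point, here is a minimal sketch in plain Python. It is not the ms2rescore or psm_utils implementation. The helper names, the example peptide strings, and the assumption that labels appear in square brackets inside the peptide string are all hypothetical. It collects the labels present in the data and reports those that have no exactly matching key in a `modification_mapping`.

```python
import re

# Hypothetical helpers for illustration only; not part of ms2rescore or psm_utils.
# Assumption: modification labels appear in square brackets in the peptide string.
BRACKET_LABEL = re.compile(r"\[([^\]]+)\]")


def find_labels(peptides):
    """Collect every bracketed modification label found in the peptide strings."""
    labels = set()
    for peptide in peptides:
        labels.update(BRACKET_LABEL.findall(peptide))
    return labels


def unmapped_labels(peptides, modification_mapping):
    """Return labels present in the data that have no exactly matching mapping key."""
    return find_labels(peptides) - set(modification_mapping)


# Made-up peptide strings, not taken from the attached file.
peptides = ["[+42]-PEPTIDEK", "AC[+57.0216]DM[+15.9949]K"]

# Keys as they appear in the log ("Found modifications: ...").
as_reported = {"+15.9949": "U:35", "+57.0216": "U:4", "+42": "U:1"}
# Keys rewritten to the longer Unimod masses, as in the second attempt.
exact_unimod = {"+15.994915": "U:35", "+57.021464": "U:4", "+42.010565": "U:1"}

print(sorted(unmapped_labels(peptides, as_reported)))   # []
print(sorted(unmapped_labels(peptides, exact_unimod)))  # ['+15.9949', '+42', '+57.0216']
```

With the longer masses as keys, none of the labels in the data match, so under this sketch nothing would be renamed, which is consistent with the hunch above.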

As I thought, the error is linked to recent changes in Pyteomics. I opened an issue (levitsky/pyteomics/issues/132). In the meantime, you could install an older version of Pyteomics (e.g., 4.6.2).

MatteoLacki commented 7 months ago

I could, but perhaps you could instead put a limit on its version in your setup.py equivalent?

MatteoLacki commented 7 months ago

The point is that then you won't have any more people bothering you about this problem. Do you have a minimal requirements.txt with a working example?

RalfG commented 3 months ago

We do try to define dependencies that should work in the pyproject.toml. However, in this case, the problem was fixed in a patch release of Pyteomics (4.6.3), so the pip dependency resolver should always prefer that version over 4.6.2.
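For anyone who wants to check their own environment, the following is a rough sketch using only the standard library. It compares the installed Pyteomics version against 4.6.3, the release named above as containing the fix. The function names are hypothetical and the version parsing is deliberately simplistic. The thread does not pin down which release introduced the problem, so an older version is not necessarily affected.

```python
import re
from importlib import metadata

# Release named in this thread as containing the fix.
FIXED_IN = (4, 6, 3)


def version_tuple(version):
    """Turn '4.6.3' into (4, 6, 3); parsing stops at the first non-numeric piece."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def at_least_fixed_release(version):
    """True if the version string is 4.6.3 or newer."""
    return version_tuple(version) >= FIXED_IN


def installed_version(package="pyteomics"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("pyteomics is not installed in this environment")
else:
    print(f"pyteomics {version}: at least 4.6.3 = {at_least_fixed_release(version)}")
```

If the check reports a version below 4.6.3, upgrading Pyteomics in the same virtual environment that runs MS²Rescore is the first thing to try.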

RalfG commented 3 months ago

Fixed in levitsky/pyteomics#133