compomics / ms2rescore

Modular and user-friendly platform for AI-assisted rescoring of peptide identifications
https://ms2rescore.readthedocs.io
Apache License 2.0

Processing PEAKS XPro output using ms2rescore #44

Closed DavidGZ1 closed 2 years ago

DavidGZ1 commented 2 years ago

Hi MS2ReScore team, Thanks for creating MS2ReScore and for the recent implementation of PEAKS results compatibility.

We are trying to use MS2Rescore for immunopeptidomics, to rescore timsTOF Pro data processed with PEAKS XPro, but it does not seem to recognize the MGF files: it fails with the error "Not all PSMs could be found in the provided MGF files".

Full description: I am working with IP-enriched MHC ligandome samples analyzed on a nanoElute coupled to a timsTOF Pro in DDA-PASEF mode. Several files were acquired, and each one was processed individually in PEAKS XPro to identify possible immunopeptides (unspecific cleavage). The .mgf and .mzid files were exported by selecting Export / For Third Party / for PRIDE / Scaffold; in addition, the de novo only spectra.mgf was exported from Export / Text Formats. All the files were placed in the same folder.

Then MS2Rescore was configured (details below). Already when selecting "Spectrum file directory", the .mgf files were not shown. When the process was started anyway, the following error was shown: "Not all PSMs could be found in the provided MGF files".

MS2Rescore was run using the GUI, with the version downloaded on 23/02/2022, on Windows 10 Pro, using the following parameters: model = Immuno-HCD, MS2 error = 0.03, pipeline = peaks, logging level = info, identification file = the .mzid file, spectrum file directory = the directory containing the .mgf files, temporary file directory = same, configuration = (empty), and output = (empty).

Could you please tell me how to properly process PEAKS XPro data using ms2rescore?

Best, David Gomez-Zepeda

ArthurDeclercq commented 2 years ago

Hi David Gomez-Zepeda,

Thank you for your interest in MS²Rescore. The error occurs because MS²Rescore cannot find a spectrum for every PSM in your mzid file. There can be several reasons for this. Most often it is a mismatch between the PSM IDs and the MGF title pattern, although with a PEAKS export this should normally not be an issue. So far we have only used the third-party export MGF files, without the de novo only spectra MGF, so I don't know how the ID mapping works with that file (maybe you could try again without that MGF file in the directory). Also, it is normal that you cannot see the MGF files in the GUI, because the GUI asks for a directory as input rather than individual files. Other than that, I cannot really tell where the issue is without taking a look at your files. Could you maybe provide me with these files or a snippet of the IDs?

Kind regards, Arthur
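
For anyone hitting the same error, the ID check described above can be approximated outside of MS²Rescore. The following is only a minimal sketch, not MS²Rescore's own matching logic: it assumes that each PSM's `spectrumID` attribute in the mzid file should equal an MGF `TITLE=` value exactly, which may not hold for every PEAKS export. The function names (`mgf_titles`, `mzid_spectrum_ids`, `unmatched_ids`) are hypothetical helpers introduced here for illustration.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def mgf_titles(mgf_dir):
    """Collect every TITLE= value from all .mgf files in a directory."""
    titles = set()
    for mgf_path in Path(mgf_dir).glob("*.mgf"):
        with open(mgf_path, encoding="utf-8", errors="replace") as handle:
            for line in handle:
                if line.startswith("TITLE="):
                    titles.add(line[len("TITLE="):].strip())
    return titles


def mzid_spectrum_ids(mzid_path):
    """Collect the spectrumID of every SpectrumIdentificationResult element."""
    ids = set()
    for _, elem in ET.iterparse(str(mzid_path), events=("end",)):
        # Compare only the local tag name so the XML namespace version is irrelevant.
        if elem.tag.rsplit("}", 1)[-1] == "SpectrumIdentificationResult":
            spectrum_id = elem.get("spectrumID")
            if spectrum_id is not None:
                ids.add(spectrum_id)
            elem.clear()  # release the children of elements already handled
    return ids


def unmatched_ids(mzid_path, mgf_dir):
    """Return spectrum IDs from the mzid file that have no identical MGF title.

    Assumption: an exact string match is expected; the real tool may apply
    a title pattern instead.
    """
    return sorted(mzid_spectrum_ids(mzid_path) - mgf_titles(mgf_dir))
```

Calling `unmatched_ids("results.mzid", "mgf_folder")` (both paths are placeholders) and printing the first few entries next to a few MGF titles would show whether the two naming schemes differ, which is the kind of ID snippet requested above.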

DavidGZ1 commented 2 years ago

Hi Arthur, Thanks for your answer. I performed some more tests using v2.1.1 and I still have the same problem. I tried an individual timsTOF Pro DDA-PASEF (.d) file processed in PEAKS XPro with the following options for data refinement, and all of them gave the error "Not all PSMs could be found in the provided MGF files".

I will send you the files by email. Kind regards, David