compomics / ms2rescore

Modular and user-friendly platform for AI-assisted rescoring of peptide identifications
https://ms2rescore.readthedocs.io
Apache License 2.0

Error with PEAKS mzid file #53

Closed nh2tran closed 11 months ago

nh2tran commented 2 years ago

Hello,

I'm trying to run MS2Rescore on the mzid and mgf files exported from PEAKS (using the export option for third party PRIDE / scaffold). However, I got the error below. I'd really appreciate your help! I need to run MS2Rescore on PEAKS results, but the Windows GUI version gave me an installation error (posted in another issue), and the Linux command-line version gave the error below.

ms2rescore -m mgf/ peptides_1_1_0.mzid
2022-03-11 07:53:45 // INFO // ms2rescore // Using MSGFPipeline.
2022-03-11 07:53:45 // INFO // ms2rescore.percolator // Running Percolator PIN converter

Pin-converter version 3.05.0, Build Date Aug 31 2020 19:06:15
Copyright (c) 2013 Lukas Käll. All rights reserved.
Written by Lukas Käll (lukas.kall@scilifelab.se) in the
School of Biotechnology at KTH - Royal Institute of Technology, Stockholm.
Issued command:
msgf2pin -P XXX -o /tmp/tmpsb1le34g/peptides_1_1_0_original.pin /data/nh2tran/DeepNovo/DeepDB/PEAKS_Online/ms2rescore/peptides_1_1_0.mzid

Error : the input file is not MzIdentML - MSGF+ format /data/nh2tran/DeepNovo/DeepDB/PEAKS_Online/ms2rescore/peptides_1_1_0.mzid

2022-03-11 07:53:45 // ERROR // ms2rescore.__main__ // Critical error occured in MS2ReScore
Traceback (most recent call last):
  File "/data/nh2tran/python3_tf2/lib/python3.8/site-packages/ms2rescore/__main__.py", line 15, in main
    rescore.run()
  File "/data/nh2tran/python3_tf2/lib/python3.8/site-packages/ms2rescore/__init__.py", line 233, in run
    peprec = self.pipeline.get_peprec()
  File "/data/nh2tran/python3_tf2/lib/python3.8/site-packages/ms2rescore/id_file_parser.py", line 245, in get_peprec
    return self.peprec_from_pin()
  File "/data/nh2tran/python3_tf2/lib/python3.8/site-packages/ms2rescore/id_file_parser.py", line 179, in peprec_from_pin
    peprec = self.original_pin.to_peptide_record(
  File "/data/nh2tran/python3_tf2/lib/python3.8/site-packages/ms2rescore/id_file_parser.py", line 169, in original_pin
    self._run_percolator_converter()
  File "/data/nh2tran/python3_tf2/lib/python3.8/site-packages/ms2rescore/id_file_parser.py", line 144, in _run_percolator_converter
    run_percolator_converter(
  File "/data/nh2tran/python3_tf2/lib/python3.8/site-packages/ms2rescore/percolator.py", line 532, in run_percolator_converter
    subprocess.run(command, capture_output=log_level == "debug", check=True)
  File "/usr/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['msgf2pin', '-P', 'XXX', '-o', '/tmp/tmpsb1le34g/peptides_1_1_0_original.pin', '/data/nh2tran/DeepNovo/DeepDB/PEAKS_Online/ms2rescore/peptides_1_1_0.mzid']' returned non-zero exit status 1.

ArthurDeclercq commented 2 years ago

Hi @nh2tran,

Sorry for the late reply! It seems that MS²Rescore is using the wrong pipeline. Both the MSGF and PEAKS pipelines take mzid input files, and when the pipeline is inferred from an mzid file, MS²Rescore defaults to the MSGF pipeline. You therefore have to explicitly select the PEAKS pipeline, either in the config file (set the general pipeline option to 'peaks') or on the command line with --pipeline 'peaks'.

Best, Arthur
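
For readers following along, here is a minimal sketch of the config-file route described above. It assumes the config file is JSON with a top-level "general" section holding the "pipeline" key, and that the file is passed with the -c option listed in the usage message further down this thread. The function name and the file name config.json are purely illustrative.

```python
import json
from pathlib import Path


def write_peaks_config(path):
    """Write a minimal config that selects the PEAKS pipeline explicitly."""
    # Assumption: the config is JSON with a top-level "general" section,
    # matching the "general pipeline" setting mentioned in the reply above.
    config = {"general": {"pipeline": "peaks"}}
    Path(path).write_text(json.dumps(config, indent=4))
    return config


if __name__ == "__main__":
    write_peaks_config("config.json")  # illustrative file name
```

The resulting file would then be passed along with the identification file, for example ms2rescore -c config.json -m mgf/ peptides_1_1_0.mzid. Either route only works with an MS²Rescore version that ships the PEAKS pipeline.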

ab604 commented 1 year ago

I also have what I think is the same or a related issue when I try to run MS2Rescore on Ubuntu 18.04 LTS. I had to install Percolator 3.4 for this version of Ubuntu, as otherwise the glibc libraries aren't recent enough. The data was generated by PEAKS X. The README says the pipeline must be one of ['infer', 'pin', 'tandem', 'maxquant', 'msgfplus', 'peptideshaker'], so I don't quite follow how I supply peaks as the pipeline following your answer above?

Many thanks in advance, and here's the output below in case I've misunderstood the error:

2022-11-17 08:55:49 // INFO // ms2rescore // Using MSGFPipeline.
2022-11-17 08:55:49 // INFO // ms2rescore.percolator // Running Percolator PIN converter
2022-11-17 08:55:49 // ERROR // __main__ // Critical error occured in MS2ReScore
Traceback (most recent call last):
  File "/home/ab604/.local/lib/python3.6/site-packages/ms2rescore/__main__.py", line 15, in main
    rescore.run()
  File "/home/ab604/.local/lib/python3.6/site-packages/ms2rescore/__init__.py", line 203, in run
    peprec = self.pipeline.get_peprec()
  File "/home/ab604/.local/lib/python3.6/site-packages/ms2rescore/id_file_parser.py", line 234, in get_peprec
    return self.peprec_from_pin()
  File "/home/ab604/.local/lib/python3.6/site-packages/ms2rescore/id_file_parser.py", line 171, in peprec_from_pin
    peprec = self.original_pin.to_peptide_record(
  File "/home/ab604/.local/lib/python3.6/site-packages/ms2rescore/id_file_parser.py", line 160, in original_pin
    self._run_percolator_converter()
  File "/home/ab604/.local/lib/python3.6/site-packages/ms2rescore/id_file_parser.py", line 140, in _run_percolator_converter
    log_level=self.log_level
  File "/home/ab604/.local/lib/python3.6/site-packages/ms2rescore/percolator.py", line 502, in run_percolator_converter
    subprocess.run(command, capture_output=log_level == "debug", check=True)
  File "/usr/lib/python3.6/subprocess.py", line 423, in run
    with Popen(*popenargs, **kwargs) as process:
TypeError: __init__() got an unexpected keyword argument 'capture_output'
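
A note for anyone hitting the same TypeError: the capture_output keyword of subprocess.run was added in Python 3.7, so the call fails on the Python 3.6 interpreter shown in the paths above, independently of the pipeline choice. The sketch below only illustrates the equivalent spelling that also works on 3.6; the helper name run_command is hypothetical and this is not how MS²Rescore itself handles it. In practice, running under a newer Python (3.8 is used later in this thread) avoids the problem.

```python
import subprocess
import sys


def run_command(command, capture=False):
    """Run a command, optionally capturing its output, on Python 3.6 and later."""
    # capture_output=True (Python 3.7+) is shorthand for passing
    # stdout=PIPE and stderr=PIPE; the explicit form also works on 3.6.
    pipe = subprocess.PIPE if capture else None
    return subprocess.run(command, stdout=pipe, stderr=pipe, check=True)


if __name__ == "__main__":
    result = run_command([sys.executable, "-c", "print('ok')"], capture=True)
    print(result.stdout.decode().strip())
```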

ArthurDeclercq commented 1 year ago

Hi @ab604!

Thank you for reporting this. It seems we forgot to list peaks as an option in the README, but it is available as a pipeline alongside the others mentioned there. I will update the README, but you can go ahead and use it already with the latest version of MS²Rescore!

Cheers, Arthur

ab604 commented 1 year ago

Thanks Arthur. I should have put this in my original issue, but I get an error saying that peaks is not an option. Can you clarify how I pass it as an argument? I installed MS2Rescore yesterday so it should be the latest version.

Many thanks,

Alistair

ab604 commented 1 year ago

This is what happens if I put peaks as the pipeline argument in my config file:

Critical error occured in MS2ReScore
Traceback (most recent call last):
  File "/home/ab604/.local/lib/python3.6/site-packages/ms2rescore/__main__.py", line 14, in main
    rescore = MS2ReScore(parse_cli_args=True, configuration=None, set_logger=True)
  File "/home/ab604/.local/lib/python3.6/site-packages/ms2rescore/__init__.py", line 44, in __init__
    parse_cli_args=parse_cli_args, config_class=configuration
  File "/home/ab604/.local/lib/python3.6/site-packages/ms2rescore/config_parser.py", line 154, in parse_config
    config = cascade_conf.parse()
  File "/home/ab604/.local/lib/python3.6/site-packages/cascade_config.py", line 106, in parse
    jsonschema.validate(config, self.validation_schema.load())
  File "/home/ab604/.local/lib/python3.6/site-packages/jsonschema/validators.py", line 934, in validate
    raise error
jsonschema.exceptions.ValidationError: 'peaks' is not one of ['infer', 'pin', 'tandem', 'maxquant', 'msgfplus', 'peptideshaker']

Failed validating 'enum' in schema['properties']['general']['properties']['pipeline']:
    {'default': 'infer',
     'description': 'Pipeline to use, depending on input format',
     'enum': ['infer',
              'pin',
              'tandem',
              'maxquant',
              'msgfplus',
              'peptideshaker'],
     'type': 'string'}

On instance['general']['pipeline']:
    'peaks'

ab604 commented 1 year ago

And if I add --pipeline 'peaks' to my command, as per your message on Mar 28, I just get:

usage: __main__.py [-h] [-v] [-m FILE] [-c FILE] [-t PATH] [-o FILE] [-l LEVEL] identification_file
__main__.py: error: unrecognized arguments: --pipeline peaks

ArthurDeclercq commented 1 year ago

That is strange; peaks should be an option in the latest version. Just to be sure, could you show me the output of pip show ms2rescore?

ab604 commented 1 year ago

Yes, I think we're getting somewhere. It looks like I have an older version: Name: ms2rescore, Version: 2.0.0b4. Foolishly, I didn't install with a script, so I can't easily see how that happened when I used pip. I'll grab the latest version and confirm if everything works.
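
As an aside, the same version check can be done from inside the interpreter that actually runs MS²Rescore, which helps when several Python or pip installations coexist on one machine. This is a minimal sketch using only the standard library (importlib.metadata, available from Python 3.8); the helper name installed_version is illustrative.

```python
from importlib import metadata  # standard library from Python 3.8


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("ms2rescore:", installed_version("ms2rescore"))
```

Invoking pip through a specific interpreter, for example python3.8 -m pip show ms2rescore, similarly ensures that the reported version belongs to that interpreter.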

ab604 commented 1 year ago

OK, so I've got version 2.1.3 installed (my default version of pip was an old one), and I now get an error referring to a missing offset JSON file:

2022-11-17 14:08:24 // INFO // ms2rescore // Using PeaksPipeline.
2022-11-17 14:08:24 // INFO // ms2rescore.id_file_parser // Processing mzid file
2022-11-17 14:08:24 // ERROR // __main__ // Critical error occured in MS2ReScore
Traceback (most recent call last):
  File "/home/ab604/.local/lib/python3.8/site-packages/pyteomics/auxiliary/file_helpers.py", line 540, in _build_index
    self._read_byte_offsets()
  File "/home/ab604/.local/lib/python3.8/site-packages/pyteomics/xml.py", line 1196, in _read_byte_offsets
    with open(self._byte_offset_filename, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: './test-data/peptides-EN-181-T-MHC1_1_1_0-mzid-byte-offsets.json'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ab604/.local/lib/python3.8/site-packages/ms2rescore/__main__.py", line 15, in main
    rescore.run()
  File "/home/ab604/.local/lib/python3.8/site-packages/ms2rescore/__init__.py", line 233, in run
    peprec = self.pipeline.get_peprec()
  File "/home/ab604/.local/lib/python3.8/site-packages/ms2rescore/id_file_parser.py", line 595, in get_peprec
    self.df = self.read_df_from_mzid()
  File "/home/ab604/.local/lib/python3.8/site-packages/ms2rescore/id_file_parser.py", line 517, in read_df_from_mzid
    with mzid.read(self.path_to_id_file) as reader:
  File "/home/ab604/.local/lib/python3.8/site-packages/pyteomics/mzid.py", line 232, in read
    return MzIdentML(source, **kwargs)
  File "/home/ab604/.local/lib/python3.8/site-packages/pyteomics/mzid.py", line 143, in __init__
    super(MzIdentML, self).__init__(*args, **kwargs)
  File "/home/ab604/.local/lib/python3.8/site-packages/pyteomics/xml.py", line 1066, in __init__
    self._build_index()
  File "/home/ab604/.local/lib/python3.8/site-packages/pyteomics/auxiliary/file_helpers.py", line 84, in wrapped
    return func(self, *args, **kwargs)
  File "/home/ab604/.local/lib/python3.8/site-packages/pyteomics/auxiliary/file_helpers.py", line 542, in _build_index
    super(IndexSavingMixin, self)._build_index()
  File "/home/ab604/.local/lib/python3.8/site-packages/pyteomics/auxiliary/file_helpers.py", line 84, in wrapped
    return func(self, *args, **kwargs)
  File "/home/ab604/.local/lib/python3.8/site-packages/pyteomics/xml.py", line 1100, in _build_index
    self._offset_index = TagSpecificXMLByteIndex.build(
  File "/home/ab604/.local/lib/python3.8/site-packages/pyteomics/xml.py", line 982, in build
    indexer = cls(source, indexed_tags, keys)
  File "/home/ab604/.local/lib/python3.8/site-packages/pyteomics/xml.py", line 938, in __init__
    self.build_index()
  File "/home/ab604/.local/lib/python3.8/site-packages/pyteomics/xml.py", line 965, in build_index
    self.offsets = scanner.build_byte_index(self.indexed_tag_keys)
  File "/home/ab604/.local/lib/python3.8/site-packages/pyteomics/auxiliary/file_helpers.py", line 84, in wrapped
    return func(self, *args, **kwargs)
  File "/home/ab604/.local/lib/python3.8/site-packages/pyteomics/xml.py", line 891, in build_byte_index
    k = attrs[lookup_id_key_mapping[offset_type]].decode('utf-8')
KeyError: b'id'

ArthurDeclercq commented 1 year ago

Could you show me the command you are running? The error occurs when trying to read the mzid file. How did you obtain the mzid file?

ab604 commented 1 year ago

So this data is part of some recent data we published, and it is available as part of PRIDE PXD031108. I can send the files I'm trying to test with directly via our SafeSend secure file transfer service if you want to try and run them yourself?

The mzid was outputted by PEAKS Studio version=10.0 (2019-01-29) and my CLI is:

python3.8 -m ms2rescore ./test-data/peptides-EN-181-T-MHC1_1_1_0.mzid -m ./test-data/ \
-o /belle/workspace/analysis/msrescore/ --pipeline 'peaks'

RalfG commented 11 months ago

Hi @ab604,

Could you retry with the latest beta version of MS²Rescore 3.0? PEAKS mzid files should now be fully supported. You can install it with:

pip install ms2rescore --pre

or download the desktop app installer from https://github.com/compomics/ms2rescore/releases/latest.

Let us know if you do run into any other issues!

ab604 commented 11 months ago

Hi Ralf,

I'm afraid I've left science for the time being so I'm not using Peaks anymore. I guess you can mark this as closed.

All the best,

Alistair
