compomics / peptide-shaker

Interpretation of proteomics identification results
http://compomics.github.io/projects/peptide-shaker.html

Morpheus/MetaMorpheus mzIdent input trouble #276

Closed trishorts closed 4 years ago

trishorts commented 7 years ago

I'm trying to load a .mzid file into PS and running into trouble. Could be that the file doesn't contain all the values you folks are looking for or that it is in the wrong format. I don't see any documentation that details any of that. I wonder if you can help. I could provide an example file. I get the following error message:

Thu Jul 27 11:01:44 CDT 2017 Import process for 1 (Sample: ecoli, Replicate: 0)
Thu Jul 27 11:01:44 CDT 2017 Importing sequences from eColi_canonical_uniprot_170626_concatenated_target_decoy.fasta.
Thu Jul 27 11:01:47 CDT 2017 FASTA file import completed.
Thu Jul 27 11:01:47 CDT 2017 Importing gene mappings.
Thu Jul 27 11:01:47 CDT 2017 Escherichia coli (strain K12) not available in Ensembl. Gene and GO annotation for this species will not be available.
Thu Jul 27 11:01:47 CDT 2017 Establishing local database connection.
Thu Jul 27 11:01:58 CDT 2017 Reading identification files.
Thu Jul 27 11:01:58 CDT 2017 Parsing 08-08-16_Ecoli_LabelFree_f7_rep1-Calibrated_5ppmAroundZero.mzid.
Thu Jul 27 11:01:58 CDT 2017 No PSM found in 08-08-16_Ecoli_LabelFree_f7_rep1-Calibrated_5ppmAroundZero.mzid.
Thu Jul 27 11:01:58 CDT 2017 No identifications retained.
Thu Jul 27 11:01:58 CDT 2017 Importing Data Canceled!

mvaudel commented 7 years ago

Hi,

Sorry to read that you are encountering issues importing your file. Our requirements to load mzid files are rather minimal: provide peptide sequences, modifications, scores, and spectrum identifiers. The latter two can cause issues if we cannot parse the score or cannot map the identification back to its spectrum. Would it be possible for you to share your mzid file with us?

Thanks and apologies for the inconvenience,

Marc
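The requirements listed above (peptide sequences, scores, spectrum identifiers) can be checked on a file before attempting an import. The following is a minimal Python sketch, not PeptideShaker's actual parser: it counts the `SpectrumIdentificationItem` elements (the PSMs) in an mzIdentML document and flags results that lack a `spectrumID`. The helper name `summarize_mzid` and the inline sample document are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def summarize_mzid(source):
    """Count the pieces an importer needs in an mzIdentML document.

    `source` is a path or a binary file-like object.
    Hypothetical helper, not part of PeptideShaker.
    """
    summary = {
        "peptides": 0,
        "results": 0,
        "psms": 0,
        "psms_with_cv_params": 0,
        "results_missing_spectrum_id": 0,
    }
    for _, elem in ET.iterparse(source, events=("end",)):
        name = local_name(elem.tag)
        if name == "PeptideSequence" and (elem.text or "").strip():
            summary["peptides"] += 1
        elif name == "SpectrumIdentificationResult":
            summary["results"] += 1
            if not elem.get("spectrumID"):
                summary["results_missing_spectrum_id"] += 1
        elif name == "SpectrumIdentificationItem":
            summary["psms"] += 1
            # Scores are stored as cvParam children of the item.
            if any(local_name(child.tag) == "cvParam" for child in elem):
                summary["psms_with_cv_params"] += 1
    return summary


# Tiny invented document, just enough structure to exercise the helper.
SAMPLE = """<MzIdentML xmlns="http://psidev.info/psi/pi/mzIdentML/1.1">
  <SequenceCollection>
    <Peptide id="pep1"><PeptideSequence>PEPTIDEK</PeptideSequence></Peptide>
  </SequenceCollection>
  <DataCollection><AnalysisData><SpectrumIdentificationList id="sil">
    <SpectrumIdentificationResult id="r1" spectrumID="index=0" spectraData_ref="sd1">
      <SpectrumIdentificationItem id="i1" peptide_ref="pep1" chargeState="2" rank="1">
        <cvParam cvRef="PSI-MS" accession="MS:1002662" name="Morpheus:Morpheus score" value="12.3"/>
      </SpectrumIdentificationItem>
    </SpectrumIdentificationResult>
  </SpectrumIdentificationList></AnalysisData></DataCollection>
</MzIdentML>"""

print(summarize_mzid(io.BytesIO(SAMPLE.encode("utf-8"))))
```

A file for which `psms` comes back as zero would be consistent with the "No PSM found" message in the log, although the importer may reject PSMs for other reasons as well.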

trishorts commented 7 years ago

I'm sure the problem is with the file we created and not with your software. But since I don't know how your software reads it, it's hard for me to troubleshoot. I made a Box account holding the .mzid file along with the .mzML and .fasta I used to create it. You can also see the converted FASTA files generated by PeptideShaker. If it's not too much trouble, I'd like to know where our .mzid causes PeptideShaker to break. https://uwmadison.box.com/s/cdf2ikjxs93r1v298l9snoh21bubg5la

mvaudel commented 7 years ago

Thanks! I will look into the file, hopefully next week, and come back to you.

hbarsnes commented 7 years ago

As the mzid file comes from Morpheus, which depends on mzML (and not mgf) as the spectrum file format, I'm afraid we won't be able to parse this file as input to PeptideShaker. We were at one point looking into supporting mzML, but as far as I remember this never got beyond the testing stage.

Are there any other search engines you can use instead of Morpheus?

trishorts commented 7 years ago

.....As a Morpheus developer trying to produce output that can be read in other software packages I'm gonna say no.... But, I can make mgf if that is what you need. Anything else I should be aware of?

hbarsnes commented 7 years ago

> .....As a Morpheus developer trying to produce output that can be read in other software packages I'm gonna say no.... But, I can make mgf if that is what you need. Anything else I should be aware of?

Aha, I was not aware of that part. :)

I see that we already parse the two CV terms for the scores used in the mzid files, i.e. MS:1002662 (Morpheus:Morpheus score) and MS:1002354 (PSM-level q-value), so that part should be ok. Morpheus is also already added as a search engine we should be able to recognize. The only thing missing there is a PubMed ID that we can refer to. Which one do you prefer that we use?
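To illustrate how scores are recognized by CV accession rather than by name, here is a small Python sketch. It is not the PeptideShaker code; the two accessions are the ones quoted above, while the function `psm_scores`, the sample fragment, and the placeholder accession `XX:0000001` are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Accessions quoted in the comment above.
KNOWN_SCORES = {
    "MS:1002662": "Morpheus:Morpheus score",
    "MS:1002354": "PSM-level q-value",
}


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def psm_scores(item):
    """Return {accession: float} for the recognized cvParams of one
    SpectrumIdentificationItem element. Unknown accessions and values
    that cannot be parsed as numbers are skipped."""
    scores = {}
    for child in item:
        if local_name(child.tag) != "cvParam":
            continue
        accession = child.get("accession")
        if accession not in KNOWN_SCORES:
            continue
        try:
            scores[accession] = float(child.get("value", ""))
        except ValueError:
            pass  # an unparsable score is treated as missing
    return scores


# Invented fragment; "XX:0000001" is a placeholder, not a real CV term.
ITEM = ET.fromstring(
    '<SpectrumIdentificationItem id="i1">'
    '<cvParam accession="MS:1002662" name="Morpheus:Morpheus score" value="17.25"/>'
    '<cvParam accession="MS:1002354" name="PSM-level q-value" value="0.004"/>'
    '<cvParam accession="XX:0000001" name="placeholder" value="1"/>'
    '</SpectrumIdentificationItem>'
)
print(psm_scores(ITEM))
```

Supporting a new search engine in a scheme like this would mostly mean adding its accessions to the lookup table.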

That only leaves the issue with using mzML as input for the spectra. I haven't checked in a while, but last time I checked Morpheus did not support mgf as the spectrum format? Or has this changed?

On a side note, if you now support mgf (or could consider adding it), we could potentially also add Morpheus to SearchGUI. I think the missing mgf support was the only reason we did not look further into this.

trishorts commented 7 years ago

No worries. Actually, we moved on from Morpheus about a year ago and have created MetaMorpheus (https://github.com/smith-chem-wisc/MetaMorpheus), which counts high-res fragments in a similar way. MetaMorpheus has much greater functionality than Morpheus, including calibration, PTM discovery, etc. I don't think we've created CV terms for it yet; we're still relying on the Morpheus ones.

Pubmed ID? Paper is going out for review now. Watch your inbox ;)

Morpheus does not support mgf but I'll make sure that MetaMorpheus will. The format seems simple compared with mzML. That was a real slog but we got it running most everywhere now. I will let you know as soon as the mgf is finished. I don't think that should take long.

hbarsnes commented 7 years ago

Sounds great! And yes, mgf is straightforward to work with compared to mzML. So if you don't need any of the additional annotation that the mzML format provides then supporting mgf should be easy.

Looking forward to the version supporting mgf then. :)
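For context on why mgf is considered the simpler format: it is plain text, with each spectrum wrapped in `BEGIN IONS` / `END IONS`, a few `KEY=value` header lines, and one peak per line. A minimal reader can be sketched in a few lines of Python. This is an illustration under those assumptions only, not the parser used by MetaMorpheus or PeptideShaker; the name `read_mgf` and the sample spectrum are invented.

```python
def read_mgf(lines):
    """Yield one dict per BEGIN IONS / END IONS block.

    Each dict has 'params' (header KEY=value pairs) and 'peaks'
    (a list of (m/z, intensity) tuples).
    """
    spectrum = None
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line == "BEGIN IONS":
            spectrum = {"params": {}, "peaks": []}
        elif line == "END IONS":
            if spectrum is not None:
                yield spectrum
            spectrum = None
        elif spectrum is not None:
            if "=" in line and not line[0].isdigit():
                # Header line such as TITLE=..., PEPMASS=..., CHARGE=...
                key, value = line.split("=", 1)
                spectrum["params"][key.upper()] = value
            else:
                # Peak line: m/z, then intensity if present.
                fields = line.split()
                intensity = float(fields[1]) if len(fields) > 1 else 0.0
                spectrum["peaks"].append((float(fields[0]), intensity))


# Invented example spectrum.
SAMPLE = """BEGIN IONS
TITLE=scan=1
PEPMASS=464.7 1000.0
CHARGE=2+
100.1 10
200.2 20.5
END IONS
"""

spectra = list(read_mgf(SAMPLE.splitlines()))
print(len(spectra), spectra[0]["params"]["TITLE"], spectra[0]["peaks"])
```

The trade-off mentioned above applies: a reader like this is short precisely because mgf carries none of the instrument and acquisition annotation that mzML provides.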

trishorts commented 7 years ago

We now have CV terms for MetaMorpheus in the latest release of the psi-ms.obo (see below). Also, if we convert mgf to mzML, then MetaMorpheus will do the search fine. Still working on native mgf search. Will you let me know when PeptideShaker can read/understand the new CV terms?

Dear proteomics community,

attached there's the new version 4.0.15 of the psi-ms.obo file.

It contains new terms for the MetaMorpheus search engine, new terms for the metabolomics software package XCMS, a new term for 'alternating polarity mode'.

New CV terms in version 4.0.15 of psi-ms.obo:

**** new terms for the MetaMorpheus search engine

[Term]
id: MS:1002826
name: MetaMorpheus
def: "MetaMorpheus search engine." [https://github.com/smith-chem-wisc/MetaMorpheus]
is_a: MS:1001456 ! analysis software

[Term]
id: MS:1002827
name: MetaMorpheus:score
def: "MetaMorpheus score for PSMs." [PSI:PI]
xref: value-type:xsd\:double "The allowed value-type for this CV term."
is_a: MS:1001143 ! PSM-level search engine specific statistic
relationship: has_order MS:1002108 ! higher score better

[Term]
id: MS:1002828
name: MetaMorpheus:protein score
def: "MetaMorpheus score for protein groups." [PSI:PI]
xref: value-type:xsd\:double "The allowed value-type for this CV term."
is_a: MS:1002368 ! search engine specific score for protein groups
relationship: has_order MS:1002108 ! higher score better
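For tools that want to pick up terms like these programmatically, the OBO stanzas can be read with a few lines of Python. The sketch below is a simplified illustration, not a full OBO parser: it only handles `[Term]` stanzas, and it strips trailing `!` comments by splitting on " ! ", which would misfire if a definition text contained that sequence. The function name `parse_obo_terms` is made up; the stanza text is the three terms quoted above.

```python
def parse_obo_terms(text):
    """Parse [Term] stanzas into a list of {tag: [values]} dicts.

    Simplified: other stanza types are ignored and trailing
    ' ! comment' parts are dropped naively.
    """
    terms, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas are collected.
            current = {} if line == "[Term]" else None
            if current is not None:
                terms.append(current)
        elif current is not None and ": " in line:
            tag, value = line.split(": ", 1)
            value = value.split(" ! ", 1)[0].strip()  # drop trailing comment
            current.setdefault(tag, []).append(value)
    return terms


# Raw string so the backslash in 'xsd\:double' is kept as written.
OBO = r"""
[Term]
id: MS:1002826
name: MetaMorpheus
def: "MetaMorpheus search engine." [https://github.com/smith-chem-wisc/MetaMorpheus]
is_a: MS:1001456 ! analysis software

[Term]
id: MS:1002827
name: MetaMorpheus:score
def: "MetaMorpheus score for PSMs." [PSI:PI]
xref: value-type:xsd\:double "The allowed value-type for this CV term."
is_a: MS:1001143 ! PSM-level search engine specific statistic
relationship: has_order MS:1002108 ! higher score better

[Term]
id: MS:1002828
name: MetaMorpheus:protein score
def: "MetaMorpheus score for protein groups." [PSI:PI]
xref: value-type:xsd\:double "The allowed value-type for this CV term."
is_a: MS:1002368 ! search engine specific score for protein groups
relationship: has_order MS:1002108 ! higher score better
"""

for term in parse_obo_terms(OBO):
    print(term["id"][0], term["name"][0], term["is_a"])
```

Values are kept as lists because OBO tags such as `is_a` and `xref` may occur more than once in a stanza.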

hbarsnes commented 7 years ago

> Will you let me know when PeptideShaker can read/understand the new CV terms?

It was already on my todo list when I saw the terms added to the psi-ms ontology. :)

However, we are in the middle of refactoring the PeptideShaker backend for better memory handling etc., so I'm not sure when we'll be able to release a new version.

But if you're interested I can try to send you a beta version, which basically only adds the new CV terms, so that you can test the parsing of the MetaMorpheus data?

bgruening commented 6 years ago

@mvaudel @hbarsnes I was asked today whether it is possible to get Morpheus results into PS. Is there any news about this?

hbarsnes commented 6 years ago

@bgruening The parsing of Morpheus mzIdentML files into PeptideShaker is working, but we are still waiting for the refactoring of PeptideShaker to support the new backend. It is still unclear when this work will be completed. @mvaudel Perhaps you can provide an updated estimate?

mvaudel commented 6 years ago

I would love to, but it would most likely be overoptimistic ;) Based on my progress over the past weeks, I would say less than a month for a beta version, with the above disclaimer in mind!

Dmorgen commented 6 years ago

Hi,

Just would like to chime in - I would really appreciate it if you could tailor PeptideShaker to accept MetaMorpheus results, and especially the modification search it allows - such as glycans and amino acid substitutions.

Thanks! David.

hbarsnes commented 4 years ago

This issue should be solved in the new beta releases of SearchGUI and PeptideShaker. Please see https://groups.google.com/forum/#!topic/peptide-shaker/1ecY0IyMOBM for more details.

If the issue still exists in the beta releases, please open a new issue at https://github.com/compomics/peptide-shaker-2.0-issue-tracker/issues.

If the issue has been solved, it would also be great if you could let us know by replying to this original issue. :)

hbarsnes commented 4 years ago

MetaMorpheus has been included in the latest release of SearchGUI and the resulting mzIdentML files can now be loaded in PeptideShaker. If it does not work as wanted or if you come across other issues with how we use MetaMorpheus, please let us know by opening a new issue.