compomics / peptide-shaker

Interpretation of proteomics identification results
http://compomics.github.io/projects/peptide-shaker.html

mzid export error #343

Closed dominik-kopczynski closed 5 years ago

dominik-kopczynski commented 5 years ago

Chi Nguyen from ISAS (chi.nguyen@isas.de) has a PRIDE submission issue with a .mzid file generated by PeptideShaker 1.16.36.

The file "secretome_290316_pride_submission.mzid" starts with <?xml version="1.0" encoding="UTF-8"?> ...

Later, a cvParam is defined using some special characters (ü).

On XML schema validation at PRIDE we get the error:

========================================== Xml schema validation =============
FILE: secretome 290316_pride_submission.mzid
VALIDATION MESSAGE: (1) FATAL XML Parsing error detected on line 31, (2) Fatal Error message: The entity "uuml" was referenced, but not declared.
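For anyone who wants to reproduce this class of error locally, here is a minimal sketch using Python's standard library parser. The cvParam fragments and the value "M&uuml;ller" are made up for illustration and are not taken from the actual file. It only shows that `&uuml;` is an HTML entity that XML does not predefine, while the numeric character reference `&#252;` is accepted.

```python
import xml.etree.ElementTree as ET

# Hypothetical cvParam fragments; names and values are invented for this sketch.
WITH_HTML_ENTITY = '<cvParam name="contact name" value="M&uuml;ller"/>'
WITH_NUMERIC_REF = '<cvParam name="contact name" value="M&#252;ller"/>'


def parses(xml_text: str) -> bool:
    """Return True if xml_text is well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# &uuml; is not one of XML's five predefined entities (amp, lt, gt, apos, quot),
# so without a DTD declaring it the parser rejects the document.
print(parses(WITH_HTML_ENTITY))

# &#252; is a numeric character reference for U+00FC and needs no declaration.
print(parses(WITH_NUMERIC_REF))
```

The exact error wording differs between parsers, so the message from PRIDE's validator will not match Python's literally.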

So the problem is that the two offending characters, ü (0xFC) and – (0x96), are given as single-byte values, which should not be present in that form in a file declared as Unicode UTF-8.
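A small sketch of why those two byte values are a problem in a UTF-8 file. It assumes the bytes came from a Windows-1252 source, which is a guess based on 0x96 mapping to an en dash there. The issue itself only gives the byte values. The helper names are hypothetical.

```python
# The two byte values quoted above, taken as raw single bytes.
U_UMLAUT_BYTE = b"\xfc"
EN_DASH_BYTE = b"\x96"


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def reencode_cp1252_as_utf8(data: bytes) -> bytes:
    """Interpret data as Windows-1252 (an assumption) and re-encode it as UTF-8."""
    return data.decode("cp1252").encode("utf-8")


for raw in (U_UMLAUT_BYTE, EN_DASH_BYTE):
    # Neither single byte is a valid UTF-8 sequence on its own.
    print(raw, is_valid_utf8(raw), reencode_cp1252_as_utf8(raw))
```

In UTF-8 both characters take more than one byte (two for ü, three for the en dash), so a writer that emits them as single bytes produces a file that does not match its own `encoding="UTF-8"` declaration.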

It would be good if PeptideShaker's mzIdentML exporter could handle that, e.g. by choosing a proper Unicode encoding.
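One way such handling could look, sketched in Python rather than in PeptideShaker's own Java code. This is not the exporter's actual implementation. The function name `build_cv_param` and the sample values are hypothetical. The idea is to turn any HTML-style entities in user-supplied text into real characters first, then let the XML writer escape markup characters and encode the result as UTF-8.

```python
import html
import xml.etree.ElementTree as ET


def build_cv_param(name: str, raw_value: str) -> bytes:
    """Serialize a cvParam-like element as UTF-8 bytes (illustrative only)."""
    # Convert HTML entities such as "&uuml;" into the characters they stand for.
    value = html.unescape(raw_value)
    elem = ET.Element("cvParam", {"name": name, "value": value})
    # ElementTree escapes &, < and " in attribute values and encodes
    # non-ASCII characters as UTF-8 byte sequences.
    return ET.tostring(elem, encoding="utf-8")


serialized = build_cv_param("contact name", "M&uuml;ller")
print(serialized)

# Round trip: the output parses as XML and yields the original character.
print(ET.fromstring(serialized).get("value"))
```

Unescaping before serialization keeps the output free of entities that a schema validator would not know, while the literal characters stay readable in the file.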

hbarsnes commented 5 years ago

The mzIdentML files produced by PeptideShaker should already be using Unicode UTF-8 encoding, but it seems we missed a CV term. Could you tell me which CV term the error occurs in? It seems to be in the details about the submitter or the organization. It is generally recommended to avoid non-standard characters there, but we can easily support them if needed.

hbarsnes commented 5 years ago

Just released PeptideShaker v1.16.37 which should solve the problem of non-standard characters in the submitter details section of the mzIdentML file. If this does not solve the problem, please let me know and I'll reopen the issue.