HUPO-PSI / mzTab

mzTab Reporting MS-based Proteomics and Metabolomics Results
https://hupo-psi.github.io/mzTab

How to encode non-numeric id_confidence_value values? #140

nilshoffmann closed this issue 6 years ago

nilshoffmann commented 6 years ago

For certain applications, it may be possible to define a confidence measure for reporting in the SML section even though a numeric confidence value is not defined or definable. Should we widen this type to support strings, so that users can define a qualitative confidence measure instead of a quantitative one? Or should such measures be reported as numeric levels?

germa commented 6 years ago

We already have this term:

[Term]
id: MS:1002896
name: compound identification confidence level
def: "Confidence level for annotation of identified compounds as defined by the Metabolomics Standards Initiative (MSI). The value slot can have the values 'Level 0' until 'Level 4'." [PMID:29748461]
xref: value-type:xsd\:string "The allowed value-type for this CV term."
is_a: MS:1002895 ! small molecule identification attribute

nilshoffmann commented 6 years ago

A proposed new term that captures more characteristics of high-resolution MS workflows would be:

[Term]
id: MS100XXXX
name: hr-ms compound identification confidence level
def: "Refined HR-MS confidence level for annotation of identified compounds as proposed by Schymanski et al. The value slot can have the values 'Level 1', 'Level 2', 'Level 2a', 'Level 2b', 'Level 3', 'Level 4', and 'Level 5'." [PMID:24476540]
xref: value-type:xsd:string "The allowed value-type for this CV term."
is_a: MS:1002895 ! small molecule identification attribute

This would indeed favor a "String" data type for the value, while possibly sacrificing sortability?
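
One way to keep string-typed levels sortable is to derive a sort key from the value on the consumer side. The following is only a minimal sketch, not part of mzTab or of any existing library: the names `ALLOWED_LEVELS` and `level_sort_key` are hypothetical, the allowed values are copied from the proposed term definition above, and ranking 'Level 2' before 'Level 2a' and 'Level 2b' is an assumption made for illustration.

```python
import re
from typing import Tuple

# Allowed values, copied from the proposed term definition above.
ALLOWED_LEVELS = (
    "Level 1", "Level 2", "Level 2a", "Level 2b",
    "Level 3", "Level 4", "Level 5",
)

# "Level", a space, an integer, and an optional lowercase sub-level letter.
_LEVEL_PATTERN = re.compile(r"^Level (\d+)([a-z]?)$")


def level_sort_key(value: str) -> Tuple[int, str]:
    """Hypothetical helper: map e.g. 'Level 2a' to (2, 'a').

    Raises ValueError for anything outside ALLOWED_LEVELS, so the same
    function doubles as a simple validity check.
    """
    if value not in ALLOWED_LEVELS:
        raise ValueError(f"not an allowed confidence level: {value!r}")
    match = _LEVEL_PATTERN.match(value)
    # Assumption: a level without a letter ranks before its lettered
    # sub-levels, because the empty string sorts before 'a'.
    return int(match.group(1)), match.group(2)


reported = ["Level 3", "Level 2b", "Level 1", "Level 2", "Level 2a"]
ordered = sorted(reported, key=level_sort_key)
```

With a key like this the column can stay a plain string in the file while tools still order it numerically. The explicit key also keeps working if a scheme ever introduces multi-digit levels, where plain lexicographic string sorting would misorder the values.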

germa commented 6 years ago

If these Schymanski-levels are only a more fine-grained version of MS:1002896, then I propose to change the definition of MS:1002896 accordingly instead of adding a new term, or are these two classifications used side by side?

nilshoffmann commented 6 years ago

@rsalek Do you happen to have any updates on the reporting levels from the MSI side? I recall that they were being reworked or updated.

andrewrobertjones commented 6 years ago

@nilshoffmann to update section 6.2.57 (small_molecule-identification_reliability) with an example of a new CV term.