HUPO-PSI / mzIdentML

Repository for mzIdentML and the corresponding examples

Validation issues with mzidLib_peaklist2a_plus_ecoli_versus_unimod_full_xtandem_fdr_threshold_groups.mzid #38

Closed edeutsch closed 8 years ago

edeutsch commented 8 years ago

ERROR: cvParam local FDR should have units, but it does not!
ERROR: cvParam q-value for peptides should have units, but it does not!
(The previous two may be errors in the CV itself?)

WARNING: MS:1001062 should be 'Mascot MGF format' instead of 'Mascot MGF file'
WARNING: MS:1001189 should be 'modification specificity peptide N-term' instead of 'modification specificity N-term'
WARNING: MS:1001330 should be 'X\!Tandem:expect' instead of 'X!Tandem:expect'
WARNING: MS:1001331 should be 'X\!Tandem:hyperscore' instead of 'X!Tandem:hyperscore'
WARNING: MS:1001401 should be 'X\!Tandem xml format' instead of 'X!Tandem xml file'
WARNING: MS:1001476 should be 'X\!Tandem' instead of 'X!Tandem'
WARNING: MS:1001868 should be 'distinct peptide-level q-value' instead of 'q-value for peptides'
WARNING: MS:1002244 should be 'mzidLib:FalseDiscoveryRate' instead of 'mzidLib:FalseDiscoveryRat'
WARNING: MS:1002404 should be 'count of identified proteins' instead of 'count of identified protein'

I'm not quite sure what to make of the X\!Tandem business. That is the way it is written in the OBO file, but I assume the backslash is an escape character that is not to be included in the XML? Should it even be in the OBO file? Is that a limitation of the OBO format? I assume so, because the OBO format uses ! as a separator character in some places. This should not be repeated in the XML?

hbarsnes commented 8 years ago

@edeutsch We have 'X!Tandem:expect' and 'X!Tandem xml format' in our PeptideShaker mzid 1.2 example file, but I don't think you get the same error for our file? (At least Gerhard's validator does not.) So could be some special formatting issue? We don't do anything special on our side though regarding the '!'.

edeutsch commented 8 years ago

I agree. I believe that ! with no \ is the correct way to encode it in mzIdentML. I suspect the presence of the ! in the OBO file is an OBO syntax thing that should not be replicated to mzIdentML.

germa commented 8 years ago

Yes, the .obo file uses X\!Tandem, but in our data files one must use X!Tandem. The .obo file needs this backslash because otherwise the BioPortal validator will complain.
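
The escaping difference described above can be sketched roughly as follows. This is only an illustration, not the code of any validator mentioned in this thread: the helper names `unescape_obo_name` and `name_matches` are hypothetical, and the sketch assumes term names only contain backslash-escaped literal characters such as `\!` (not escapes like `\n` or `\t`).

```python
import re


def unescape_obo_name(obo_name: str) -> str:
    """Drop the backslash from OBO-style escapes such as a backslash before '!'.

    Hypothetical helper; assumes only literal characters are escaped.
    """
    # Replace a backslash followed by any character with that character alone.
    return re.sub(r"\\(.)", r"\1", obo_name)


def name_matches(obo_name: str, cvparam_name: str) -> bool:
    """Compare an OBO term name with the name attribute of an mzIdentML cvParam."""
    return unescape_obo_name(obo_name) == cvparam_name


print(name_matches(r"X\!Tandem:expect", "X!Tandem:expect"))  # True
```

With a comparison along these lines, a data file that writes X!Tandem would not be flagged, while a genuinely different name such as 'X!Tandem xml file' would still fail to match 'X\!Tandem xml format'.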

andrewrobertjones commented 8 years ago

@germa

The units should also be removed from this one:

id: MS:1001250
name: local FDR
def: "Result of quality estimation: the local FDR at the current position of a sorted list." [PSI:PI]
xref: value-type:xsd:double "The allowed value-type for this CV term."
is_a: MS:1001092 ! peptide identification confidence metric
is_a: MS:1001198 ! protein identification confidence metric
relationship: has_units UO:0000166 ! parts per notation unit
relationship: has_units UO:0000187 ! percent
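
For illustration, a validator could decide whether a cvParam is expected to carry units by looking for has_units relationships in the term's stanza, which is presumably why the "should have units" errors appear while those lines are present. A minimal sketch, assuming plain tag-value lines as in the stanza above and no escaped `!` on relationship lines; `units_for_term` is a hypothetical helper, not part of any tool named in this thread.

```python
def units_for_term(stanza: str) -> list[str]:
    """Return the unit accessions listed in 'relationship: has_units' lines."""
    units = []
    for line in stanza.splitlines():
        # Drop the trailing ' ! comment' part of the line, if any.
        value = line.split(" ! ", 1)[0].strip()
        if value.startswith("relationship: has_units "):
            # Tokens are: 'relationship:', 'has_units', '<unit accession>'
            units.append(value.split()[2])
    return units


stanza = """id: MS:1001250
name: local FDR
is_a: MS:1001092 ! peptide identification confidence metric
relationship: has_units UO:0000166 ! parts per notation unit
relationship: has_units UO:0000187 ! percent"""

print(units_for_term(stanza))  # ['UO:0000166', 'UO:0000187']
```

Once the has_units lines are removed from the term, the same check returns an empty list and no units would be demanded.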

fawazghali commented 8 years ago

I have updated the example file. @edeutsch, can you please re-run the validator? Thanks. Fawaz

germa commented 8 years ago

The units have been removed from all FDR and q-value terms as of version 3.90.0 of psi-ms.obo.

edeutsch commented 8 years ago

I'm still seeing the following issues with this file: mzidLib_peaklist2a_plus_ecoli_versus_unimod_full_xtandem_fdr_threshold_groups.mzid
WARNING: MS:1001330 should be 'X\!Tandem:expect' instead of 'X!Tandem:expect'
WARNING: MS:1001331 should be 'X\!Tandem:hyperscore' instead of 'X!Tandem:hyperscore'
WARNING: MS:1001401 should be 'X\!Tandem xml format' instead of 'X!Tandem xml file'
WARNING: MS:1001476 should be 'X\!Tandem' instead of 'X!Tandem'

The other two mzid files are now fine.

Maybe the xtandem one didn't get properly committed or pushed? Its date is older than the other two on my system.

andrewrobertjones commented 8 years ago

@fghali It looks like one of our files still needs to be pushed. Can you take a look? Thanks.

fawazghali commented 8 years ago

I have updated the example file. @edeutsch, can you please re-run the validator? Thanks. Fawaz

edeutsch commented 8 years ago

Looks fine now, except see issue #64 about schemaLocations.

fawazghali commented 8 years ago

I've updated the schemaLocations.