Closed maximskorik closed 1 year ago
Is the openms tool maybe able to do it? openms_xmlvalidator ?
@bgruening, that tool seems to be what we need and I somehow overlooked it when looking for a solution within existing Galaxy tools. However, I can't make it work. Not with mzML files and schemas, nor with some simple xml-xsd pairs (e.g., this one: https://learn.microsoft.com/en-us/previous-versions/windows/desktop/ms764613(v=vs.85)).
As a general question, would you be interested to upgrade this tool to be a validator for any XML schema? You could ship a few default ones but also make the schema an input and convert this tool into a very generic one - that can be used by many communities. Maybe even contribute it to IUC.
Sure, I wouldn't mind making the tool more generic if it can be useful for a greater community. @hechth, what do you think?
Do you have an error report? Have you tried this on EU? We have contacts to the devs if this is relevant.
Yes, on EU. I managed to make it work partially; there were problems with my test sample. The tool works with generic xml-xsd pairs and mzml-xsd v1.1.0 pairs. It still fails to validate mzMLs v1.1.1. I suspect it's due to v1.1.1 containing a reference to v1.1.0 that is absent at runtime. I don't know why that's not an issue with our validator.
Here's the history with failed v1.1.1 validations: https://usegalaxy.eu/u/ac0ea6f59b164798b8fba7d76d2a6fad/h/mzml-validation-1
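One way to check the suspicion above is to list the external schema files an XSD pulls in and see whether they sit next to it at runtime. Below is a minimal sketch using only the Python standard library. The function names `referenced_schemas` and `missing_references` are made up for illustration and are not part of OpenMS or Galaxy, and the file names in the demo are hypothetical. It only inspects top-level xs:include, xs:import and xs:redefine elements; it does not perform validation.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Namespace prefix of XML Schema elements in ElementTree's {uri}tag notation.
XS = "{http://www.w3.org/2001/XMLSchema}"


def referenced_schemas(xsd_path):
    """Return the schemaLocation values of top-level include/import/redefine elements."""
    root = ET.parse(xsd_path).getroot()
    locations = []
    for tag in ("include", "import", "redefine"):
        for element in root.findall(XS + tag):
            location = element.get("schemaLocation")
            if location:
                locations.append(location)
    return locations


def missing_references(xsd_path):
    """Return referenced local schema files that do not exist next to xsd_path."""
    base = os.path.dirname(os.path.abspath(xsd_path))
    return [
        location
        for location in referenced_schemas(xsd_path)
        # Remote locations (http://, https://, ...) are skipped, not fetched.
        if "://" not in location
        and not os.path.exists(os.path.join(base, location))
    ]


if __name__ == "__main__":
    # Hypothetical pair of schemas: child.xsd includes parent.xsd.
    with tempfile.TemporaryDirectory() as workdir:
        child = os.path.join(workdir, "child.xsd")
        with open(child, "w") as handle:
            handle.write(
                '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">'
                '<xs:include schemaLocation="parent.xsd"/>'
                "</xs:schema>"
            )
        print("referenced:", referenced_schemas(child))
        print("missing:", missing_references(child))
```

If the v1.1.1 schema turns out to list a v1.1.0 file that is not staged in the job directory, that would be consistent with the failure described here, though it would not by itself explain why the other validator succeeds.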
@maximskorik sure, I think making a general purpose xml-xsd validator tool would be nice.
The openms_fileinfo tool also seems to struggle with new orbi mzml files: https://umsa.cerit-sc.cz/u/hechth/h/20230119-openms-fileinfo-test
@bernt-matthias are the openms galaxy tools somehow auto generated or manually curated?
@bgruening and @maximskorik I think this can be merged and maybe we can make a general purpose xml validator in the next iteration?
I also think that this tool is somewhat complementary to the openms_fileinfo tool @bernt-matthias and @sneumann
The tool on its own is great, just added two more comments. An extension to be more general would be great, maybe create an issue for that?
@bernt-matthias are the openms galaxy tools somehow auto generated or manually curated?
Yes, they are auto-generated.
The conversion of the tools happens here: https://github.com/galaxyproteomics/tools-galaxyp/blob/423304b26e63d23cd8e5fb4c2fb729c5beea1254/tools/openms/generate.sh#L62, based on the CTD files that are written by the OpenMS tools (https://github.com/galaxyproteomics/tools-galaxyp/blob/423304b26e63d23cd8e5fb4c2fb729c5beea1254/tools/openms/test-data.sh#L143).
The tool on its own is great, just added two more comments. An extension to be more general would be great, maybe create an issue for that?
I guess we can create an issue on the iuc github repo and start contributing such general purpose tools there?
Yes :)
@bernt-matthias Yeah that makes sense - there are plenty of them and it would be quite hard to update all of them manually I assume
The failing OpenMS XMLValidator might be caused by a tool bug. On the command line the schema files are named .bioml. This is because the automatic mapping between OpenMS and Galaxy datatypes (i.e. extensions) fails here (https://github.com/galaxyproteomics/tools-galaxyp/blob/423304b26e63d23cd8e5fb4c2fb729c5beea1254/tools/openms/XMLValidator.xml#L21).
I could try to fix this, maybe here: https://github.com/galaxyproteomics/tools-galaxyp/pull/697. Is this desired?
Maybe someone could test the tool on the command line first. If you have some file pairs I could do it as well.
Manually curated tests could be added here https://github.com/galaxyproteomics/tools-galaxyp/blob/423304b26e63d23cd8e5fb4c2fb729c5beea1254/tools/openms/aux/macros_test.xml#L568 but ideally we would add them upstream since Galaxy tests are also autogenerated from the OpenMS test command lines in this file: https://github.com/OpenMS/OpenMS/blob/develop/src/tests/topp/CMakeLists.txt
Note that the tool claims to check against the latest schema of the corresponding type
XMLValidator should/could be the general purpose tool ..?
Description
This PR adds a Galaxy tool to validate mzML files against HUPO XML Schema Definition (XSD) versions 1.1.1 and 1.1.0 (fetched from https://www.psidev.info/mzML). The tool: