modelica / ssp-ls-traceability

Prototyping of an SSP Traceability Layered Standard

Error in validating SRMD schema #51

Closed ClemensLinnhoff closed 1 year ago

ClemensLinnhoff commented 1 year ago

I am trying to validate an SRMD file against SRMD.xsd using xmllint, but I am getting a bunch of errors because it cannot locate xlink.xsd:

failed to load external entity "https://www.w3.org/XML/2008/06/xlink.xsd"

which leads to:

WXS schema /home/clemens/Repos/Other/SSPTraceability/SRMD.xsd failed to compile

You can find an example here

Does anyone have an idea why that is?

pmai commented 1 year ago

Just as an aside: xmllint, which is just a thin veneer around libxml2, is really not a good validation tool. Among many other limitations, it is often built without TLS/SSL support and hence will fail to load resources from https links, as in the example above; see https://stackoverflow.com/questions/38602855/xmllint-doesnt-work-with-https-warning-failed-to-load-external-entity

So this is really unrelated to SRMD. If you want to use xmllint, you will likely have to place the xlink schema in a local location and change the imports accordingly.
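To see which imports would have to be pointed at local copies, a small read-only helper can list every xs:import whose schemaLocation is an http(s) URL. This is only a sketch: the function name find_remote_imports and the reduced EXAMPLE_XSD are made up for illustration and are not taken from SRMD.xsd.

```python
import xml.etree.ElementTree as ET

XS_NS = "http://www.w3.org/2001/XMLSchema"


def find_remote_imports(xsd_text):
    """Return (namespace, schemaLocation) pairs for all xs:import
    elements whose schemaLocation is an http(s) URL."""
    root = ET.fromstring(xsd_text)
    remote = []
    for imp in root.iter(f"{{{XS_NS}}}import"):
        location = imp.get("schemaLocation", "")
        if location.startswith(("http://", "https://")):
            remote.append((imp.get("namespace"), location))
    return remote


# Hypothetical, heavily reduced schema used only for illustration.
EXAMPLE_XSD = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:import namespace="http://www.w3.org/1999/xlink"
             schemaLocation="https://www.w3.org/XML/2008/06/xlink.xsd"/>
  <xs:import namespace="urn:example:local" schemaLocation="local.xsd"/>
</xs:schema>
"""

if __name__ == "__main__":
    for namespace, location in find_remote_imports(EXAMPLE_XSD):
        print(namespace, location)
```

The helper deliberately only reports the imports and does not rewrite the schema: re-serializing an XSD with ElementTree can change namespace prefixes that are referenced inside attribute values such as type="xs:string", so editing the schemaLocation by hand is the safer route.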

ClemensLinnhoff commented 1 year ago

I am also getting errors with other tools. Using the Java-based amouat/xsd-validator results in:

error reading XML Schema: SSPTraceability/SRMD.xsd
src-resolve.4.2: Error resolving component 'xs:dateTimeStamp'.
It was detected that 'xs:dateTimeStamp' is in namespace 'http://www.w3.org/2001/XMLSchema', but components from this namespace are not referenceable from schema document 'file:SSPTraceability/STC.xsd'.
If this is the incorrect namespace, perhaps the prefix of 'xs:dateTimeStamp' needs to be changed.
If this is the correct namespace, then an appropriate 'import' tag should be added to 'file:SSPTraceability/STC.xsd'.

Using the Python-based xmlschema, following their README example, results in:

xmlschema.validators.exceptions.XMLSchemaValidationError: failed validating <Element '{http://apps.pmsf.net/STMD/SimulationResourceMetaData}SimulationResourceMetaData' at 0x7f1c29c6aac0> with XMLSchema10(name='SRMD.xsd', namespace='http://apps.pmsf.net/STMD/SimulationResourceMetaData'):

Reason: <Element '{http://apps.pmsf.net/STMD/SimulationResourceMetaData}SimulationResourceMetaData' at 0x7f1c29c6aac0> is not an element of the schema

So three different XML schema validators produce three different errors with the SRMD schema. Do you have a recommendation for a validation tool that does work with this schema, @pmai?

pmai commented 1 year ago

These errors are related to missing support for XML Schema 1.1 (which was only released like 11 years ago, sigh) in the given tools. The SRMD schema is clearly marked as 1.1, and needs it for the xs:dateTimeStamp data type.

Besides the usual commercial tools, like Oxygen XML, XMLSpy, and Saxon-EE, which all support XML Schema 1.1, open source tools like xerces-j for Java or https://pypi.org/project/xmlschema/ for Python support XML Schema 1.1 validation. There are also online validators with API support around...

If you only require simple SRMD validation, you could also change xs:dateTimeStamp to xs:dateTime in the schema, which will not check that the time zone fragment is present in the time stamp, but would otherwise behave similarly.
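As a rough illustration of what gets lost in that swap: xs:dateTimeStamp requires an explicit time zone designator, while xs:dateTime also accepts values without one. The sketch below only checks for that trailing designator. The helper name has_explicit_timezone is made up, and this is not a full XSD lexical validator.

```python
import re

# Trailing time zone designator: "Z" or an offset between -14:00 and +14:00.
_TZ_SUFFIX = re.compile(r"(Z|[+-](?:(?:0\d|1[0-3]):[0-5]\d|14:00))$")


def has_explicit_timezone(lexical):
    """Return True if a dateTime lexical value ends with a time zone
    designator, which xs:dateTimeStamp requires and xs:dateTime does not."""
    return _TZ_SUFFIX.search(lexical.strip()) is not None


if __name__ == "__main__":
    for value in ("2021-06-17T10:00:00Z",
                  "2021-06-17T10:00:00+02:00",
                  "2021-06-17T10:00:00"):
        print(value, has_explicit_timezone(value))
```

A value like the third one would pass a schema relaxed to xs:dateTime, although the original xs:dateTimeStamp type would reject it.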

ClemensLinnhoff commented 1 year ago

I have used https://pypi.org/project/xmlschema/; that was my last example. They state in the README that it supports XML Schema 1.1, but it throws a different error, as shown above. Any idea what the problem is there? It says that SimulationResourceMetaData is not an element of the schema. What am I doing wrong? I am just doing:

import xmlschema
xs = xmlschema.XMLSchema('SRMD.xsd')
xs.validate('sl-1-0-sensor-model-repository-template.srmd')

ClemensLinnhoff commented 1 year ago

With the above python code and this SRMD file I get the error:

xmlschema.validators.exceptions.XMLSchemaValidationError: failed validating <Element '{http://apps.pmsf.net/STMD/SimulationResourceMetaData}SimulationResourceMetaData' at 0x7fc654208180> with XMLSchema10(name='SRMD.xsd', namespace='http://apps.pmsf.net/STMD/SimulationResourceMetaData'):

Reason: <Element '{http://apps.pmsf.net/STMD/SimulationResourceMetaData}SimulationResourceMetaData' at 0x7fc654208180> is not an element of the schema

I also tried to validate the SimulationTask.stmd provided in the examples against the STMD.xsd, which produced a similar error.

To check if the validation works in general this way, I tried this generic example and it worked perfectly.

ClemensLinnhoff commented 1 year ago

The above-mentioned error seems to have something to do with the versioning. If I remove the following lines from SRMD.xsd, the error does not occur.

xmlns:vc="http://www.w3.org/2007/XMLSchema-versioning" elementFormDefault="qualified"
vc:minVersion="1.1"

However, then I get the next error:

missing group 'stc:GElementCommon'

I am uncertain whether this is actually missing from my SRMD file but required to be in there, in which case the validator correctly throws this error, or whether it is another problem with the schema. An example or template for SRMD that works with the schema would really help here.

pmai commented 1 year ago

This is all still unrelated to SSPTraceability; these are just usage errors with xmlschema:

The xmlschema documentation clearly states:

Usage

Import the library and then create a schema instance using the path of the file containing the schema as argument:

>>> import xmlschema
>>> my_schema = xmlschema.XMLSchema('tests/test_cases/examples/vehicles/vehicles.xsd')

Note

For XSD 1.1 schemas use the class XMLSchema11, because the default class XMLSchema is an alias of the XSD 1.0 validator class XMLSchema10.

The original error message also clearly indicates that it is trying to use the SRMD.xsd as a 1.0 schema:

xmlschema.validators.exceptions.XMLSchemaValidationError: failed validating <Element '{http://apps.pmsf.net/STMD/SimulationResourceMetaData}SimulationResourceMetaData' at 0x7fc654208180> with XMLSchema10(name='SRMD.xsd', namespace='http://apps.pmsf.net/STMD/SimulationResourceMetaData'):

[emphasis by me: note the XMLSchema10 validator class].

If I use xmlschema correctly, i.e.

import xmlschema
xs = xmlschema.XMLSchema11('SRMD.xsd')
xs.validate('examples/ARS84x.srmd')

then it will validate correctly, like all the other 1.1-capable tools.
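To avoid picking the wrong validator class by accident, one could read the vc:minVersion attribute from the schema root before loading it. The following is a minimal sketch using only the standard library. The helper name needs_xsd11 and the reduced example schemas are hypothetical. A schema can also rely on 1.1 features without declaring vc:minVersion, so a False result is not proof that a 1.0 validator is sufficient.

```python
import xml.etree.ElementTree as ET

# Qualified name of the vc:minVersion attribute in ElementTree notation.
VC_MIN_VERSION = "{http://www.w3.org/2007/XMLSchema-versioning}minVersion"


def needs_xsd11(xsd_text):
    """Return True if the schema root declares vc:minVersion="1.1"."""
    root = ET.fromstring(xsd_text)
    return root.get(VC_MIN_VERSION) == "1.1"


# Hypothetical, heavily reduced schemas used only for illustration.
XSD_11 = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:vc="http://www.w3.org/2007/XMLSchema-versioning"
           elementFormDefault="qualified"
           vc:minVersion="1.1"/>
"""
XSD_10 = '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"/>'

if __name__ == "__main__":
    print(needs_xsd11(XSD_11))
    print(needs_xsd11(XSD_10))
```

With xmlschema installed, the result could then be used to choose between xmlschema.XMLSchema11 and xmlschema.XMLSchema10 when constructing the schema object.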

ClemensLinnhoff commented 1 year ago

Thank you for the detailed explanation! By using XMLSchema11 the previously mentioned errors are gone. However, now I get the next error:

xmlschema.validators.exceptions.XMLSchemaChildrenValidationError: failed validating <Element '{http://apps.pmsf.net/STMD/SimulationResourceMetaData}SimulationResourceMetaData' at 0x7f3ac76c93a0> with Xsd11Group(ref='stc:GElementCommon', model='sequence', occurs=[1, 1]):

Reason: Unexpected child with tag 'srmd:Classification' at position 1.

What am I doing wrong now?

pmai commented 1 year ago

It is hard to say without the source document, but it seems like you are using the wrong namespace for the Classification element. Note that, for reusability, the SRMD format uses elements from the STC namespace, which is shared with the STMD format and other upcoming formats. So a valid example of the SRMD file could look like this:

<?xml version="1.0" encoding="UTF-8"?>
<srmd:SimulationResourceMetaData xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://apps.pmsf.net/STMD/SimulationResourceMetaData SRMD.xsd"
    xmlns:srmd="http://apps.pmsf.net/STMD/SimulationResourceMetaData"
    xmlns:stc="http://apps.pmsf.net/SSPTraceability/SSPTraceabilityCommon"
    version="0.5" name="ARS84x Model Meta-Data">
    <stc:Classification type="de.setlevel.srmd.model-meta-data">
        <stc:ClassificationEntry keyword="model.type">sensor</stc:ClassificationEntry>
        <stc:ClassificationEntry keyword="sensor.manufacturer">CompanyX</stc:ClassificationEntry>
    </stc:Classification>
</srmd:SimulationResourceMetaData>

or, using default namespaces - note the xmlns attribute on Classification - like this:

<?xml version="1.0" encoding="UTF-8"?>
<srmd:SimulationResourceMetaData xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://apps.pmsf.net/STMD/SimulationResourceMetaData SRMD.xsd"
    xmlns:srmd="http://apps.pmsf.net/STMD/SimulationResourceMetaData"
    version="0.5" name="ARS84x Model Meta-Data">
    <Classification xmlns="http://apps.pmsf.net/SSPTraceability/SSPTraceabilityCommon" type="de.setlevel.srmd.model-meta-data">
        <ClassificationEntry keyword="model.type">sensor</ClassificationEntry>
        <ClassificationEntry keyword="sensor.manufacturer">CompanyX</ClassificationEntry>
    </Classification>
</srmd:SimulationResourceMetaData>

Or, of course, any other valid combination of namespace declarations that results in the SimulationResourceMetaData and the Classification / ClassificationEntry elements landing in the proper namespaces: http://apps.pmsf.net/STMD/SimulationResourceMetaData and http://apps.pmsf.net/SSPTraceability/SSPTraceabilityCommon respectively. Note that those namespaces will change to ssp-standard.org-based namespaces in the not-so-distant future when the spec fully moves to the MA.
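When it is unclear which namespace an element actually ends up in, printing the resolved namespace of each child of the root element can help, independent of the prefixes used in the file. This is a minimal sketch with the standard library. The helper name child_qnames and the reduced WRONG / RIGHT documents are made up for illustration.

```python
import xml.etree.ElementTree as ET

SRMD_NS = "http://apps.pmsf.net/STMD/SimulationResourceMetaData"
STC_NS = "http://apps.pmsf.net/SSPTraceability/SSPTraceabilityCommon"


def child_qnames(xml_text):
    """Return (namespace, local name) for each direct child of the root."""
    root = ET.fromstring(xml_text)
    result = []
    for child in root:
        tag = child.tag
        if not isinstance(tag, str):
            continue  # skip comments / processing instructions
        if tag.startswith("{"):
            namespace, local = tag[1:].split("}", 1)
        else:
            namespace, local = "", tag
        result.append((namespace, local))
    return result


# Hypothetical reduced documents: Classification in the SRMD namespace
# (rejected in this thread) versus the STC namespace (accepted).
WRONG = f'<srmd:SimulationResourceMetaData xmlns:srmd="{SRMD_NS}">' \
        '<srmd:Classification/></srmd:SimulationResourceMetaData>'
RIGHT = f'<srmd:SimulationResourceMetaData xmlns:srmd="{SRMD_NS}" ' \
        f'xmlns:stc="{STC_NS}"><stc:Classification/>' \
        '</srmd:SimulationResourceMetaData>'

if __name__ == "__main__":
    print(child_qnames(WRONG))
    print(child_qnames(RIGHT))
```

In the first document Classification resolves to the SRMD namespace, which matches the "Unexpected child with tag 'srmd:Classification'" error; in the second it resolves to the STC namespace.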

ClemensLinnhoff commented 1 year ago

Ah, that fixes it! So using schema validation has already paid off. Most of the SRMD files from the SETLevel project are wrong in this regard, at least in all sensor models and also some of the other models. Thanks for the help!

pmai commented 1 year ago

Hmmm, at least the SRMD files in SETLevel that I looked at back then, like the reflection-based radar object model, as well as all the samples and templates, validated correctly. However, it seems that at some later time someone corrupted one of them and other people copied from that (instead of from the template)...

ClemensLinnhoff commented 1 year ago

I think it was the other way round. If I remember correctly, the first models in SETLevel with SRMD files were the lidar models (reflection- and object-based). The radar model did not exist at that point. The template was probably fixed at a point when the first SRMD files were already implemented. Nevertheless, I will fix them now in OpenMSL.

pmai commented 1 year ago

At least my quick glance at the history seems to indicate otherwise: The original sensor model template was the new and final version 0.2 from 2021-06-17; the overall model template, from which the current sensor model SRMDs seem to be derived, existed from 2021-10-25, again only in the correct form. The older 0.1 format did not even mention the STC namespace, so the corrupted template cannot have been based on that version either.

The object-based lidar object model added the SRMD file on 2022-04-05, the reflection-based radar object model on 2022-03-30, so close to each other, with the earlier addition being the correct one, the later one the corrupted version.

So I'd say in all likelihood some corrupted version of the template made the rounds, probably not on gitlab, some time afterwards and infected the other sensor models... Not all of them BTW, since e.g. the triangle-based radar reflection model has the correct SRMD, as does e.g. the Powertrain ICE model, whereas the image-based object detection model has the wrong one. Which again leads me to believe that some corrupted version made the rounds unofficially, and not everyone got a notice ;).