admin-shell-io / aas-specs

Repository of the Asset Administration Shell Specification IDTA-01001 - Metamodel
https://admin-shell-io.github.io/aas-specs-antora/index/home/index.html
Creative Commons Attribution 4.0 International

Update the schemata to v3.1 of the specification #357

Open s-heppner opened 7 months ago

s-heppner commented 7 months ago

These schemata are not tested yet and are meant for reference only. They are not the official final version.

mristin commented 7 months ago

I'll try to have a look at this tonight.

mristin commented 7 months ago

@s-heppner I think I reproduced the issue in a small schema:

<?xml version="1.0" encoding="UTF-8" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="something">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:pattern value="[ -\ud7ff\uf900-\ufdcf\ufdf0-\uffef\U00010000-\U0001fffd])"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>

This schema does not validate with https://www.liquid-technologies.com/online-xsd-validator.

I'm looking into how to represent code points above the Basic Multilingual Plane (BMP, see https://en.wikipedia.org/wiki/Plane_(Unicode)).

mristin commented 7 months ago

After some searching, it seems that this depends heavily on the XSD validator used. Our continuous integration currently uses a C#-based validator, so the patterns are forwarded directly to the C# regex engine.

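As C# strings and the C# regex engine operate on UTF-16 code units, a code point above the BMP cannot be written as a single character or range endpoint; it has to be expressed as a surrogate pair. The solution would be to expand the pattern so that only UTF-16 code units and ranges over them are used. This is closely related to #362. Whatever the solution in #362 turns out to be, the same patch needs to be applied here as well.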

We haven't noticed this problem so far because no code points above the BMP had appeared in the XSD patterns.

Just for future reference: a possible solution is to patch aas-core-codegen to fix patterns in XSD so that they only operate on UTF-16 characters and ranges.
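
To illustrate what "only UTF-16 characters and ranges" could mean in practice, here is a minimal Python sketch that rewrites a range of code points above the BMP as an alternation over surrogate pairs. It is not the aas-core-codegen implementation; the function names are hypothetical, and the output assumes a regex engine that accepts `\uXXXX` escapes for individual UTF-16 code units, as the C# engine does.

```python
from typing import List, Tuple


def to_surrogate_pair(code_point: int) -> Tuple[int, int]:
    """Split a code point above the BMP into its UTF-16 (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError(f"Not a supplementary code point: {code_point:#x}")
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


def _escape(unit: int) -> str:
    """Render one UTF-16 code unit as a \\uXXXX escape."""
    return f"\\u{unit:04x}"


def _char_class(first: int, last: int) -> str:
    """Render a range of code units, collapsing single-element ranges."""
    if first == last:
        return _escape(first)
    return f"[{_escape(first)}-{_escape(last)}]"


def expand_supplementary_range(start: int, end: int) -> str:
    """Hypothetical helper: express start..end (both above the BMP) as an
    alternation of surrogate-pair patterns."""
    if start > end:
        raise ValueError("start must not be greater than end")
    high_start, low_start = to_surrogate_pair(start)
    high_end, low_end = to_surrogate_pair(end)

    # Both ends share the same high surrogate: one alternative is enough.
    if high_start == high_end:
        return _escape(high_start) + _char_class(low_start, low_end)

    parts: List[str] = []
    # Partial block at the start: the first high surrogate is not fully covered.
    if low_start != 0xDC00:
        parts.append(_escape(high_start) + _char_class(low_start, 0xDFFF))
        high_start += 1
    # Partial block at the end: the last high surrogate is not fully covered.
    tail = None
    if low_end != 0xDFFF:
        tail = _escape(high_end) + _char_class(0xDC00, low_end)
        high_end -= 1
    # Fully covered high surrogates in between accept any low surrogate.
    if high_start <= high_end:
        parts.append(_char_class(high_start, high_end) + _char_class(0xDC00, 0xDFFF))
    if tail is not None:
        parts.append(tail)
    return "|".join(parts)


print(expand_supplementary_range(0x10000, 0x1FFFD))
# prints: [\ud800-\ud83e][\udc00-\udfff]|\ud83f[\udc00-\udffd]
```

Because a surrogate pair is two code units, the result cannot stay inside the original character class. The BMP ranges would remain in a class, and the expanded alternatives would be joined to it with `|` inside a group.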