unitsml / schemas

UnitsML schemas

Support on-the-fly generation of XSD sub-parts for dynamic data #3

Closed. Intelligent2013 closed this issue 3 years ago

Intelligent2013 commented 3 years ago

unitsml-v1.0-csd04.xsd contains several enumerations, and the values for some of them are also stored in the unitsdb repo (https://github.com/unitsml/unitsdb/):

  1. Prefixes
    <xsd:attribute name="prefix">
    <xsd:annotation>
        <xsd:documentation>Prefix identifier; e.g., m, k, M, G.  [Enumeration order is by prefix magnitude (Y to y) followed by binary prefixes.]</xsd:documentation>
    </xsd:annotation>
    <xsd:simpleType>
        <xsd:restriction base="xsd:token">
            <xsd:enumeration value="Y"/>
            <xsd:enumeration value="Z"/>
            <xsd:enumeration value="E"/>
            <xsd:enumeration value="P"/>
            ...

UnitsDB: https://github.com/unitsml/unitsdb/blob/master/prefixes.yaml

NISTp10_24:
  name: yotta
  symbol:
    ascii: Y
    html: Y
    latex: Y
    unicode: Y
  base: 10
  power: 24

NISTp10_21:
  name: zetta
  symbol:
    ascii: Z
    html: Z
    latex: Z
    unicode: Z
  base: 10
  power: 21
...

  2. Units
<xsd:attribute name="unit" use="required">
    <xsd:annotation>
        <xsd:documentation>Unit identifier; the enumerated list is basically English unit names in lowercase, with a few upper case exceptions, e.g., 32F, mmHg, pH.</xsd:documentation>
    </xsd:annotation>
    <xsd:simpleType>
        <xsd:restriction base="xsd:token">
            <xsd:enumeration value="meter"/>
            <xsd:enumeration value="gram"/>
            <xsd:enumeration value="second"/>
            <xsd:enumeration value="ampere"/>
            <xsd:enumeration value="kelvin"/>
            <xsd:enumeration value="mole"/>
            ...

UnitsDB: https://github.com/unitsml/unitsdb/blob/master/units.yaml

"NISTu1":
  dimension_url: "#NISTd1"
  short: meter
  root: true
  unit_system:
    type: "SI_base"
    name: "SI"
  unit_name:
    - "meter"
  unit_symbols:
    - ascii: "m"
      html: "m"
      latex: \ensuremath{\mathrm{m}}
      unicode: "m"
  root_units:
  ...

  3. Type of the quantity:
    <xsd:attribute name="quantityType" use="optional">
    <xsd:annotation>
        <xsd:documentation>Type of the quantity.  For example base or derived.</xsd:documentation>
    </xsd:annotation>
    <xsd:simpleType>
        <xsd:restriction base="xsd:token">
            <xsd:enumeration value="base"/>
            <xsd:enumeration value="derived"/>
        </xsd:restriction>
    </xsd:simpleType>
    <!-- REVISE -->
    </xsd:attribute>

There is no such data in UnitsDB, so these values stay fixed in the XSD.

  4. The base of the prefix system:
<xsd:attribute name="prefixBase" default="10">
    <xsd:annotation>
        <xsd:documentation>The base of the prefix system, i.e., 10 (SI) or 2 (binary).</xsd:documentation>
    </xsd:annotation>
    <xsd:simpleType>
        <xsd:restriction base="xsd:byte">
            <xsd:enumeration value="10"/>
            <xsd:enumeration value="2"/>
        </xsd:restriction>
    </xsd:simpleType>
</xsd:attribute>

There is no such data in UnitsDB, so these values stay fixed in the XSD.

  5. Type of symbol representation:
    <xsd:attribute name="type" use="required">
    <xsd:annotation>
        <xsd:documentation>Type of symbol representation.  Examples include ASCII, unicode, HTML, and MathML.</xsd:documentation>
    </xsd:annotation>
    <xsd:simpleType>
        <xsd:union memberTypes="xsd:token">
            <xsd:simpleType>
                <xsd:restriction base="xsd:token">
                    <xsd:enumeration value="ASCII"/>
                    <xsd:enumeration value="Unicode"/>
                    <xsd:enumeration value="LaTeX"/>
                    <xsd:enumeration value="HTML"/>
                    <xsd:enumeration value="MathML"/>
                    <xsd:enumeration value="SVG"/>
                </xsd:restriction>
            </xsd:simpleType>
        </xsd:union>
    </xsd:simpleType>
    </xsd:attribute>

There is no such data in UnitsDB, so these values stay fixed in the XSD.
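
To make the split concrete, here is a minimal Python sketch of how the values for the first two enumerations could be read out of UnitsDB-shaped records. It assumes the YAML files have already been parsed into plain dictionaries, and the function names `prefix_values` and `unit_values` are hypothetical, not taken from the repository. That the XSD `unit` values correspond to the `short` field is also an assumption based on the excerpt above.

```python
# Records shaped like the prefixes.yaml / units.yaml excerpts above,
# already parsed into dictionaries (the YAML parsing itself is out of scope).
PREFIXES = {
    "NISTp10_24": {"name": "yotta", "symbol": {"ascii": "Y"}, "base": 10, "power": 24},
    "NISTp10_21": {"name": "zetta", "symbol": {"ascii": "Z"}, "base": 10, "power": 21},
}
UNITS = {
    "NISTu1": {"short": "meter", "unit_name": ["meter"]},
}


def prefix_values(prefixes):
    """Return ASCII prefix symbols: decimal prefixes by descending
    magnitude first, then binary prefixes (the order the XSD documents)."""
    ordered = sorted(
        prefixes.values(),
        key=lambda p: (p["base"] != 10, -(p["base"] ** p["power"])),
    )
    return [p["symbol"]["ascii"] for p in ordered]


def unit_values(units):
    """Return the `short` name of each unit in file order (assumed mapping)."""
    return [u["short"] for u in units.values()]


print(prefix_values(PREFIXES))  # ['Y', 'Z']
print(unit_values(UNITS))       # ['meter']
```

The remaining three enumerations (`quantityType`, `prefixBase`, `type`) have no UnitsDB counterpart, so a generator along these lines would leave them untouched in the schema.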

Proposals

  1. Rename the XSD file to .template and replace the enumeration entries with <boilerplate href="<entities_name>.xml"/>, where <entities_name>.xml is an XML representation of the data from UnitsDB (see below). Example:

    <xsd:attribute name="prefix">
    <xsd:annotation>
        <xsd:documentation>Prefix identifier; e.g., m, k, M, G.  [Enumeration order is by prefix magnitude (Y to y) followed by binary prefixes.]</xsd:documentation>
    </xsd:annotation>
    <xsd:simpleType>
        <xsd:restriction base="xsd:token">
            <boilerplate href="prefixes.xml"/>
        </xsd:restriction>
    </xsd:simpleType>
    </xsd:attribute>
  2. Convert the value lists from the UnitsDB YAML files (prefixes.yaml and units.yaml) into XML on the fly, for example prefixes.xml:

    <items>
    <item>Y</item>
    <item>Z</item>
    <item>E</item>
    <item>P</item>
    <item>T</item>
    ...
    </items>

  3. In the .template file, replace `<boilerplate href="prefixes.xml"/>` on the fly, via XSLT, with an enumeration list built from the data in prefixes.xml:

```xml
<xsd:enumeration value="Y"/>
<xsd:enumeration value="Z"/>
<xsd:enumeration value="E"/>
<xsd:enumeration value="P"/>
<xsd:enumeration value="T"/>
```

  4. The resulting schema (.xsd) will contain the data from UnitsDB.

In the https://github.com/unitsml/schemas/tree/xsd_gen branch I've prepared the .template file for step 1, Ruby scripts for step 2, and an XSLT transformation for step 3. Running `make` generates the .xsd file in the `template` folder as described.
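
For readers who want to see the substitution idea in isolation, steps 2 and 3 can be sketched as below. The branch itself uses Ruby scripts and an XSLT transformation; this Python version is only an illustrative stand-in, not the code from the branch. The function names `build_items` and `expand_boilerplate`, the in-memory template string, and the `sources` lookup table are all hypothetical.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xsd", XSD_NS)  # keep the xsd: prefix when serializing

# Cut-down stand-in for the .template file from step 1.
TEMPLATE = f"""
<xsd:schema xmlns:xsd="{XSD_NS}">
  <xsd:attribute name="prefix">
    <xsd:simpleType>
      <xsd:restriction base="xsd:token">
        <boilerplate href="prefixes.xml"/>
      </xsd:restriction>
    </xsd:simpleType>
  </xsd:attribute>
</xsd:schema>
"""


def build_items(values):
    """Step 2: wrap a value list in <items><item>...</item></items>."""
    items = ET.Element("items")
    for value in values:
        ET.SubElement(items, "item").text = value
    return items


def expand_boilerplate(root, load_items):
    """Step 3: replace every <boilerplate href="..."/> with one
    xsd:enumeration element per <item> of the referenced document."""
    for parent in list(root.iter()):
        expanded = []
        for child in list(parent):
            if child.tag == "boilerplate":
                for item in load_items(child.get("href")).iter("item"):
                    expanded.append(
                        ET.Element(f"{{{XSD_NS}}}enumeration", {"value": item.text})
                    )
            else:
                expanded.append(child)
        parent[:] = expanded  # swap in the expanded child list
    return root


# Hypothetical lookup standing in for reading prefixes.xml from disk.
sources = {"prefixes.xml": build_items(["Y", "Z", "E", "P", "T"])}
schema = expand_boilerplate(ET.fromstring(TEMPLATE.strip()), sources.__getitem__)
print(ET.tostring(schema, encoding="unicode"))
```

Passing the loader in as a function keeps the expansion independent of where the item lists come from, so the same routine could serve both prefixes.xml and a units.xml generated the same way.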

@ronaldtse what do you think?

ronaldtse commented 3 years ago

@Intelligent2013 let me know if there is anything I should do here. Thanks.

Intelligent2013 commented 3 years ago

@ronaldtse I've updated the initial post, so please take a look.

ronaldtse commented 3 years ago

@Intelligent2013 the proposed approach is good, is it already implemented in #2? Thanks!

Intelligent2013 commented 3 years ago

> @Intelligent2013 the proposed approach is good, is it already implemented in #2? Thanks!

@ronaldtse yes.