wmo-im / iwxxm

XML schema and Schematron for aviation weather data exchange
https://old.wmo.int/wiswiki/tiki-index.php%3Fpage=TT-AvXML

Refactored version of IWXXM 2021-2RC2 #273

Closed: blchoy closed this 3 years ago

blchoy commented 3 years ago

This is the refactored version of IWXXM 2021-2RC2 with the new schema directory structure. I finally managed to make the validation scripts run correctly. Basically, the files under the sub-directory IWXXM are those to be uploaded to schemas.wmo.int or downloaded for local implementation. Changing the validation script took so long because we use both upper- and lower-case letters in schema file names and sub-directories, and case-sensitive operating systems such as Linux do not ignore the difference.
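To illustrate the case-sensitivity problem, here is a minimal Python sketch that lists references which match a file on disk only when letter case is ignored. The function name `find_case_mismatches` and the sample paths are hypothetical and are not taken from the actual validation scripts.

```python
def find_case_mismatches(referenced, on_disk):
    """Return (reference, actual) pairs that match only when case is ignored."""
    exact = set(on_disk)
    by_lower = {path.lower(): path for path in on_disk}
    mismatches = []
    for ref in referenced:
        if ref in exact:
            continue  # exact match: resolves on any file system
        actual = by_lower.get(ref.lower())
        if actual is not None:
            # Resolves on a case-insensitive file system but fails on Linux.
            mismatches.append((ref, actual))
    return mismatches


# Hypothetical paths, for illustration only.
on_disk = [
    "IWXXM/metFeature/1.0.0/metFeature.xsd",
    "IWXXM/common/3.1.0/common.xsd",
]
referenced = [
    "IWXXM/metfeature/1.0.0/metfeature.xsd",
    "IWXXM/common/3.1.0/common.xsd",
]

for ref, actual in find_case_mismatches(referenced, on_disk):
    print(f"{ref} only matches {actual} when case is ignored")
```

A check like this could run before validation to catch paths that resolve on Windows or macOS but not on Linux.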

@amilan17 I think I have finished all my work here and I will leave it to you to consider merging this branch into the IWXXM 2021-2RC2 branch, or leave it alone until after the FT procedures. Either is fine with me.

blchoy commented 3 years ago

Thank you for bringing up the issue of re-using packages across different versions of IWXXM. You mentioned that:

  1. Packages like airmet.xsd released under IWXXM 2021-2 will bear the same namespace as iwxxm.xsd. viz:

    iwxxm.xsd

    <schema targetNamespace="http://icao.int/iwxxm/2021-2" version="2021-2RC2" xmlns:iwxxm="http://icao.int/iwxxm/2021-2" ...>
        <include schemaLocation="../measure/3.0.0/measures.xsd"></include>
        <include schemaLocation="../common/3.1.0/common.xsd"></include>
        <include schemaLocation="../airmet/3.1.0/airmet.xsd"></include>
        ...

    airmet.xsd

    <schema targetNamespace="http://icao.int/iwxxm/2021-2" version="3.1.0RC2" xmlns:iwxxm="http://icao.int/iwxxm/2021-2" ...>
        ...
        <annotation>
            <documentation>AIRMET reporting constructs as defined in ICAO Annex 3 / WMO No. 49-2.
            ....
  2. Now assume we have a new version, IWXXM 2030-1, but airmet.xsd remains at the same version as in IWXXM 2021-2. In this case the following iwxxm.xsd will not work, because the namespace of iwxxm.xsd differs from that of the included XSDs:

    iwxxm.xsd

    <schema targetNamespace="http://icao.int/iwxxm/2030-1" version="2030-1RC3" xmlns:iwxxm="http://icao.int/iwxxm/2021-2" ...>
        <include schemaLocation="../measure/3.0.0/measures.xsd"></include>
        <include schemaLocation="../common/3.1.0/common.xsd"></include>
        <include schemaLocation="../airmet/3.1.0/airmet.xsd"></include>
        ...

    That means we will have to create another airmet.xsd (ditto common.xsd and measures.xsd if they are not changed) with the same namespace for inclusion in iwxxm.xsd:

    airmet.xsd

    <schema targetNamespace="http://icao.int/iwxxm/2030-1" version="3.1.0RC2" xmlns:iwxxm="http://icao.int/iwxxm/2021-2" ...>
        ...
        <annotation>
            <documentation>AIRMET reporting constructs as defined in ICAO Annex 3 / WMO No. 49-2.
            ....
  3. That means the re-factoring won't work with multiple IWXXM versions as a single airmet.xsd cannot have multiple namespaces. The original flat file structure may serve the purpose better:

iwxxm/2021-2/iwxxm.xsd  <-- Version 2021-2, namespace http://icao.int/iwxxm/2021-2
             airmet.xsd  <-- Version 3.1.0, namespace http://icao.int/iwxxm/2021-2
iwxxm/2030-1/iwxxm.xsd  <-- Version 2030-1, namespace http://icao.int/iwxxm/2030-1
             airmet.xsd  <-- Version 3.1.0, namespace http://icao.int/iwxxm/2030-1

The two airmet.xsd files differ only in the namespace definition; the rest of their content is the same.

  4. If you want airmet.xsd to be completely independent from IWXXM so that it can be reused by any IWXXM version, you will have to treat it as a standalone package, as we do with METCE and COLLECT. Each such package will need its own namespace:

    metce.xsd

    <schema targetNamespace="http://def.wmo.int/metce/2013" version="1.2" ...>

    collect.xsd

    <schema targetNamespace="http://def.wmo.int/collect/2014" version="1.2" ...>
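The constraint behind points 2 and 3 is that an XSD `include` is only valid when the included schema has the same targetNamespace as the including schema, or no targetNamespace at all (a so-called chameleon include). A minimal Python sketch of that check follows; the function name `include_is_valid` and the trimmed schema strings are illustrative assumptions, not the real IWXXM files.

```python
import xml.etree.ElementTree as ET


def include_is_valid(including_xsd: str, included_xsd: str) -> bool:
    """Check the namespace rule for <include>: same targetNamespace, or none."""
    parent_ns = ET.fromstring(including_xsd).get("targetNamespace")
    child_ns = ET.fromstring(included_xsd).get("targetNamespace")
    # An included schema without a targetNamespace adopts the parent's one.
    return child_ns is None or child_ns == parent_ns


# Trimmed, hypothetical schema documents for illustration only.
XSD = "http://www.w3.org/2001/XMLSchema"
IWXXM_2030 = f'<schema xmlns="{XSD}" targetNamespace="http://icao.int/iwxxm/2030-1" version="2030-1"/>'
AIRMET_2021 = f'<schema xmlns="{XSD}" targetNamespace="http://icao.int/iwxxm/2021-2" version="3.1.0"/>'
AIRMET_2030 = f'<schema xmlns="{XSD}" targetNamespace="http://icao.int/iwxxm/2030-1" version="3.1.0"/>'

print(include_is_valid(IWXXM_2030, AIRMET_2021))  # namespaces differ
print(include_is_valid(IWXXM_2030, AIRMET_2030))  # namespaces agree
```

Pulling a schema from a different namespace requires `import` rather than `include`, which is how the standalone METCE and COLLECT packages would be referenced.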

Views please?

efucile commented 3 years ago

Are we discovering that IWXXM is just a name and that we have a number of different packages sharing a framework?

efucile commented 3 years ago

Can we try what happens if the targetNamespace does not carry any version, e.g. targetNamespace="http://icao.int/iwxxm"?

blchoy commented 3 years ago

> Are we discovering that IWXXM is just a name and that we have a number of different packages sharing a framework?

The original intent was to make it possible to tell whether the format of, say, AIRMET differs across versions of IWXXM. Take our previous case as an example. The following are IWXXM AIRMET instances under different versions of IWXXM:

AIRMET instance in IWXXM 2021-2:
```
<iwxxm:AIRMET xsi:schemaLocation="http://icao.int/iwxxm/2021-2 http://schemas.wmo.int/iwxxm/2021-2/iwxxm.xsd" ...>
    ... the rest of the content is the same
```

AIRMET instance in IWXXM 2030-1:
```
<iwxxm:AIRMET xsi:schemaLocation="http://icao.int/iwxxm/2030-1 http://schemas.wmo.int/iwxxm/2030-1/iwxxm.xsd" ...>
    ... the rest of the content is the same
```

If you look into the schemas, you will find:

airmet.xsd in IWXXM 2021-2:
```
<schema targetNamespace="http://icao.int/iwxxm/2021-2" version="3.1.0" ...>
    ...
```

airmet.xsd in IWXXM 2030-1:
```
<schema targetNamespace="http://icao.int/iwxxm/2030-1" version="3.1.0"  ...>
    ...
```

So one can immediately tell from the airmet.xsd in each IWXXM package that the two are the same in every respect (except the namespace). Under the versioning used for 3.0.0 and before, the version numbers in the two airmet.xsd files would be 2021-2 and 2030-1 respectively, so there would be no way to tell whether the two schemas are the same.

Therefore the new versioning scheme gives a machine-readable identification of the schemas without the need for an external lookup. It does not, however, provide a framework for reusable packages. To achieve the latter, one needs to go one step further and make the packages independent from IWXXM, i.e. use different namespaces for them. But this is beyond a minor change to the design. I have to say that this is something I missed too when we discussed the establishment of the new schema directory structure.
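As a rough sketch of what "machine-readable identification" could look like in practice, the Python snippet below reads the `version` attribute from two schema documents and compares them. The helper names and the trimmed schema strings are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def schema_identity(xsd_text):
    """Return (targetNamespace, version) read from the root <schema> element."""
    root = ET.fromstring(xsd_text)
    return root.get("targetNamespace"), root.get("version")


def same_package_version(xsd_a, xsd_b):
    """True when both schemas declare the same package version."""
    return schema_identity(xsd_a)[1] == schema_identity(xsd_b)[1]


# Trimmed, hypothetical schema documents for illustration only.
XSD = "http://www.w3.org/2001/XMLSchema"
AIRMET_IN_2021_2 = f'<schema xmlns="{XSD}" targetNamespace="http://icao.int/iwxxm/2021-2" version="3.1.0"/>'
AIRMET_IN_2030_1 = f'<schema xmlns="{XSD}" targetNamespace="http://icao.int/iwxxm/2030-1" version="3.1.0"/>'

print(schema_identity(AIRMET_IN_2021_2))
print(same_package_version(AIRMET_IN_2021_2, AIRMET_IN_2030_1))
```

With the earlier scheme, where the package carried the IWXXM release number as its version, the same comparison would report a difference even for identical content.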

blchoy commented 3 years ago

> Can we try what happens if the targetNamespace does not carry any version, e.g. targetNamespace="http://icao.int/iwxxm"?

I confirmed that this is a workable arrangement. In fact, I have created a new branch with both schemas for 2021-2 and 3.0.0 in place and a test document with both versions of IWXXM in a single COLLECT.

blchoy commented 3 years ago

Committed changes involving a new targetNamespace of http://icao.int/iwxxm. Looks good to me.

blchoy commented 3 years ago

> Can we try what happens if the targetNamespace does not carry any version, e.g. targetNamespace="http://icao.int/iwxxm"?

> I confirmed that this is a workable arrangement...

Maybe I celebrated too early. Let me recap what has been done and its consequences.

We changed the targetNamespace from http://icao.int/iwxxm/2021-2 to http://icao.int/iwxxm in order to make the XSDs of packages like TAF reusable by different versions of IWXXM. Recall that iwxxm.xsd is the top-level schema which links up the XSDs of all necessary packages:

Fragment of iwxxm.xsd for version 2021-2RC2:

<schema targetNamespace="http://icao.int/iwxxm" version="2021-2RC2" xmlns:iwxxm="http://icao.int/iwxxm" ...>
    <include schemaLocation="../measures/3.0.0/measures.xsd"></include>
    <include schemaLocation="../common/3.1.0/common.xsd"></include>
    <include schemaLocation="../metFeature/1.0.0/metFeature.xsd"></include>

Now, in an instance, the version of IWXXM being referenced is no longer indicated by the namespace but by the location of the referenced schema:

<iwxxm:SIGMET xmlns:iwxxm="http://icao.int/iwxxm" xsi:schemaLocation="http://icao.int/iwxxm http://schemas.wmo.int/iwxxm/2021-2/iwxxm.xsd" ...>

or 

<iwxxm:SIGMET xmlns:iwxxm="http://icao.int/iwxxm" xsi:schemaLocation="http://icao.int/iwxxm http://schemas.wmo.int/iwxxm/3.0.0/iwxxm.xsd" ...>

And we rely on the local scope of a namespace declaration to make it possible for COLLECT to carry two IWXXM instances of different versions at the same time:

<collect:MeteorologicalBulletin ...>
    <collect:meteorologicalInformation>
        <iwxxm:SIGMET xmlns:iwxxm="http://icao.int/iwxxm" xsi:schemaLocation="http://icao.int/iwxxm http://schemas.wmo.int/iwxxm/2021-2/iwxxm.xsd" ...>
            ...
        </iwxxm:SIGMET>
    </collect:meteorologicalInformation>
    <collect:meteorologicalInformation>
        <iwxxm:SIGMET xmlns:iwxxm="http://icao.int/iwxxm" xsi:schemaLocation="http://icao.int/iwxxm http://schemas.wmo.int/iwxxm/3.0.0/iwxxm.xsd" ...>
            ...
        </iwxxm:SIGMET>
    </collect:meteorologicalInformation>
</collect:MeteorologicalBulletin>

It works, but I doubt it is a good practice.
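To make the consequence concrete, here is a small Python sketch that reads which iwxxm.xsd each report in a bulletin points to. With a versionless namespace, the `xsi:schemaLocation` hint is the only place where the IWXXM version is visible. The bulletin below is a trimmed, hypothetical version of the example above, and the helper names are assumptions.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"
COLLECT = "http://def.wmo.int/collect/2014"
IWXXM = "http://icao.int/iwxxm"

# Trimmed bulletin for illustration; real instances carry far more content.
BULLETIN = f"""<collect:MeteorologicalBulletin xmlns:collect="{COLLECT}" xmlns:xsi="{XSI}">
  <collect:meteorologicalInformation>
    <iwxxm:SIGMET xmlns:iwxxm="{IWXXM}" xsi:schemaLocation="{IWXXM} http://schemas.wmo.int/iwxxm/2021-2/iwxxm.xsd"/>
  </collect:meteorologicalInformation>
  <collect:meteorologicalInformation>
    <iwxxm:SIGMET xmlns:iwxxm="{IWXXM}" xsi:schemaLocation="{IWXXM} http://schemas.wmo.int/iwxxm/3.0.0/iwxxm.xsd"/>
  </collect:meteorologicalInformation>
</collect:MeteorologicalBulletin>"""


def schema_locations(element):
    """Map namespace URI to schema URL from an xsi:schemaLocation attribute."""
    tokens = element.get(f"{{{XSI}}}schemaLocation", "").split()
    return dict(zip(tokens[0::2], tokens[1::2]))


def report_schemas(bulletin_xml):
    """Return (report name, IWXXM schema URL) for each report in the bulletin."""
    root = ET.fromstring(bulletin_xml)
    result = []
    for info in root.findall(f"{{{COLLECT}}}meteorologicalInformation"):
        for report in info:
            local_name = report.tag.split("}", 1)[-1]
            result.append((local_name, schema_locations(report).get(IWXXM)))
    return result


for name, url in report_schemas(BULLETIN):
    print(name, url)
```

Both reports share the namespace http://icao.int/iwxxm, so only the schema URL tells them apart.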

Now the problem comes with Schematron. In an assertion, it needs to traverse the elements of an IWXXM instance with XPath, and the namespace comes into play:

Fragment of iwxxm.sch:

<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
    <sch:title>Schematron validation</sch:title>
    <sch:ns prefix="iwxxm" uri="http://icao.int/iwxxm"/>
    ...
    <sch:pattern>
        <sch:rule context="//iwxxm:AerodromeRunwayState">
            <sch:assert test="( if( @allRunways = 'true' ) then( empty(iwxxm:runway) ) else( true() ) )">METAR_SPECI.AerodromeRunwayState-1: When all runways are being reported upon, no specific runway should be reported</sch:assert>
        </sch:rule>
    </sch:pattern>

We may be able to apply the same trick so that the Schematron rules for version 2021-2 confine their scope to those parts of the IWXXM instance having a schemaLocation of http://schemas.wmo.int/iwxxm/2021-2/iwxxm.xsd, without using the namespace as the identifier. However, this will (1) make the rules complicated, (2) rely on a changeable physical location of iwxxm.xsd to identify the version of IWXXM a fragment of the instance is based on, and (3) possibly not be sustainable in the long run.

I personally think that this has gone beyond the intended use of XML features. Furthermore, reworking the assertions is more than a day or two's work.
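For illustration only, the Python sketch below mimics what such a scoped rule would have to do: it applies the AerodromeRunwayState-1 check only to elements whose nearest enclosing xsi:schemaLocation points to a given iwxxm.xsd. It is not Schematron. The document structure is heavily simplified, and the `bulletin` wrapper and function names are hypothetical.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"
IWXXM = "http://icao.int/iwxxm"
URL_2021_2 = "http://schemas.wmo.int/iwxxm/2021-2/iwxxm.xsd"
URL_3_0_0 = "http://schemas.wmo.int/iwxxm/3.0.0/iwxxm.xsd"

# Simplified, hypothetical document: both reports break the rule.
DOC = f"""<bulletin xmlns:iwxxm="{IWXXM}" xmlns:xsi="{XSI}">
  <iwxxm:METAR xsi:schemaLocation="{IWXXM} {URL_2021_2}">
    <iwxxm:AerodromeRunwayState allRunways="true"><iwxxm:runway/></iwxxm:AerodromeRunwayState>
  </iwxxm:METAR>
  <iwxxm:METAR xsi:schemaLocation="{IWXXM} {URL_3_0_0}">
    <iwxxm:AerodromeRunwayState allRunways="true"><iwxxm:runway/></iwxxm:AerodromeRunwayState>
  </iwxxm:METAR>
</bulletin>"""


def iwxxm_schema_url(element, inherited):
    """Schema URL declared for the IWXXM namespace here, else the inherited one."""
    tokens = element.get(f"{{{XSI}}}schemaLocation", "").split()
    return dict(zip(tokens[0::2], tokens[1::2])).get(IWXXM, inherited)


def runway_state_violations(root, schema_url):
    """Elements breaking AerodromeRunwayState-1 within the given schema's scope."""
    violations = []

    def walk(element, inherited):
        current = iwxxm_schema_url(element, inherited)
        if element.tag == f"{{{IWXXM}}}AerodromeRunwayState" and current == schema_url:
            has_runway = element.find(f"{{{IWXXM}}}runway") is not None
            if element.get("allRunways") == "true" and has_runway:
                violations.append(element)
        for child in element:
            walk(child, current)

    walk(root, None)
    return violations


root = ET.fromstring(DOC)
print(len(runway_state_violations(root, URL_2021_2)))
```

Even in this toy form, the rule hinges on the literal URL of iwxxm.xsd, which is drawback (2) above.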

Views?

efucile commented 3 years ago

@blchoy @amilan17 I think it is not safe to make these changes now and I would suggest reverting to the original version used for the RC. However, this must be the main discussion topic of the next meetings, so that we can find a proper solution. The monolithic structure of IWXXM needs to be broken up if we want to build a sustainable maintenance process. For the moment I assume that we didn't manage to strictly apply the separate versioning, but we need to find a way to do it in the future.

blchoy commented 3 years ago

> The monolithic structure of IWXXM needs to be broken if we want to build a sustainable maintenance process ... we need to find a way to do it in the future.

Definitely. We may also want to take the updating of the code tables into account, as we may have objects which are defined solely by the contents of some tables.

amilan17 commented 3 years ago

~Before we pause this effort for FT2021-2 -- how simple/hard is it for schematron to rely on the version attribute instead?~

It looks like it will take a couple more days and some thought on refactoring. I agree with closing this PR, continuing the discussion for a future release and moving forward with the FT2021-2RC2 branch for this release.

blchoy commented 3 years ago

> Before we pause this effort for FT2021-2 -- how simple/hard is it for schematron to rely on the version attribute instead?

I think the question is not how simple or hard it is, but whether it is right to proceed in this way, which I think is beyond the normal use of XML features. We may want to step back and see whether there is another, more appropriate route.

I agree with Anna that we should close this PR now and prepare it for the next version.