blchoy / iwxxm-testbedOne

Testbed for new concepts and features of IWXXM

How to properly describe and validate representations in soft-typing #4

Open blchoy opened 5 years ago

blchoy commented 5 years ago

In the current attempt to move to soft-typed representations, we lose the benefit of defining the structure of a representation with schemas. Below are examples of two different features, but beyond giving examples, how can we describe their structure in a formal way?

<SIGWXFeature gml:id="...">
  <FeatureType xlink:href="http://codes.wmo.int/.../JetStream/corePoints"/>
  <Geometry>
    <gml:Curve gml:id="..." srsDimension="2" axisLabels="Lat Long" srsName="http://www.opengis.net/def/crs/EPSG/0/4326">
      <gml:segments>
        <gml:GeodesicString>
          <gml:posList>...</gml:posList>
        </gml:GeodesicString>
      </gml:segments>
    </gml:Curve>
  </Geometry>
</SIGWXFeature>
<SIGWXFeature gml:id="...">
  <FeatureType xlink:href="http://codes.wmo.int/.../JetStream/windSymbols"/>
  <FeatureCollection>
    <SIGWXFeature gml:id="...">
      <!-- FeatureType omitted -->
      <Geometry>
        <gml:Point gml:id="..." srsDimension="2" axisLabels="Lat Long" srsName="http://www.opengis.net/def/crs/EPSG/0/4326">
          <gml:pos>...</gml:pos>
        </gml:Point>
      </Geometry>
      <Attribute xlink:href="http://codes.wmo.int/.../elevation">FL350</Attribute>
      <Attribute xlink:href="http://codes.wmo.int/.../elevation/LowerBound">FL330</Attribute>
      <Attribute xlink:href="http://codes.wmo.int/.../elevation/UpperBound">FL370</Attribute>
      <Attribute xlink:href="http://codes.wmo.int/.../JetStream/coreSpeed">100</Attribute>
    </SIGWXFeature>
    <SIGWXFeature gml:id="...">
      ...
    </SIGWXFeature>
    ...
  </FeatureCollection>
</SIGWXFeature>

And is there a way to automate the creation of schematron rules to validate the structure?
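One possible direction, sketched here under stated assumptions: keep a small declarative description of each feature type and generate the Schematron rules from it. The `SPEC` dictionary format, the function names, and the choice to match code-list URIs with `contains()` on their trailing path are all hypothetical and not part of IWXXM or the Codes Registry. Note that a substring match on `elevation` would also accept `elevation/LowerBound`, so a real generator would need stricter matching.

```python
from xml.sax.saxutils import escape, quoteattr

SCH_NS = "http://purl.oclc.org/dsdl/schematron"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Hypothetical description format: FeatureType URI suffix -> expected structure.
SPEC = {
    "JetStream/corePoints": {"child": "Geometry", "member_attributes": []},
    "JetStream/windSymbols": {
        "child": "FeatureCollection",
        "member_attributes": ["elevation", "JetStream/coreSpeed"],
    },
}


def href_test(suffix):
    """Return an XPath 1.0 test matching an xlink:href that contains the suffix."""
    return f"contains(@xlink:href, '{suffix}')"


def build_rule(feature_type, shape):
    """Build one sch:rule for features of the given soft type."""
    child = shape["child"]
    context = f"SIGWXFeature[FeatureType[{href_test(feature_type)}]]"
    asserts = [(child, f"{feature_type} must contain a {child} element.")]
    for attr in shape["member_attributes"]:
        # Passes only if no member feature lacks the attribute.
        test = f"not({child}/SIGWXFeature[not(Attribute[{href_test(attr)}])])"
        asserts.append((test, f"Every member of {feature_type} needs {attr}."))
    lines = [f"    <sch:rule context={quoteattr(context)}>"]
    for test, message in asserts:
        lines.append(
            f"      <sch:assert test={quoteattr(test)}>{escape(message)}</sch:assert>"
        )
    lines.append("    </sch:rule>")
    return "\n".join(lines)


def build_schematron(spec):
    """Assemble a complete Schematron schema from the description."""
    rules = "\n".join(build_rule(ft, shape) for ft, shape in spec.items())
    return (
        f'<sch:schema xmlns:sch="{SCH_NS}">\n'
        f'  <sch:ns prefix="xlink" uri="{XLINK_NS}"/>\n'
        "  <sch:pattern>\n"
        f"{rules}\n"
        "  </sch:pattern>\n"
        "</sch:schema>"
    )


print(build_schematron(SPEC))
```

The same description could in principle be derived from a UML model or from register metadata instead of being written by hand; that part is left open here.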

blchoy commented 5 years ago

There is indeed a discussion on this topic: https://confluence.csiro.au/display/seegrid/Strong-+vs+weak-+typing+for+features

moryakovdv commented 4 years ago

I'm not sure this is the right place to kick off such a big discussion; perhaps we should move it elsewhere. Still, a couple of thoughts.

  1. If you have a schema, you always know what you expect from a file.
  2. If not, you have to find another way to declare what you expect. It can be a set of formal rules, regular expressions, or sequences of instructions in code (Java, C, etc.).
  3. If we want to use Schematron as the validation engine, we still need to deal with XPath, XQuery, etc.
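To illustrate point 2, here is a minimal sketch of declaring the expectation as instructions in code rather than in a schema. The sample document is cut down from the windSymbols example earlier in the thread (namespaced GML content omitted, the second member deliberately incomplete), and the `REQUIRED` list and `check_wind_symbols` function are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Cut-down sample; the second member intentionally lacks coreSpeed.
SAMPLE = """
<SIGWXFeature xmlns:xlink="http://www.w3.org/1999/xlink">
  <FeatureType xlink:href="http://codes.wmo.int/.../JetStream/windSymbols"/>
  <FeatureCollection>
    <SIGWXFeature>
      <Geometry/>
      <Attribute xlink:href="http://codes.wmo.int/.../elevation">FL350</Attribute>
      <Attribute xlink:href="http://codes.wmo.int/.../JetStream/coreSpeed">100</Attribute>
    </SIGWXFeature>
    <SIGWXFeature>
      <Geometry/>
      <Attribute xlink:href="http://codes.wmo.int/.../elevation">FL330</Attribute>
    </SIGWXFeature>
  </FeatureCollection>
</SIGWXFeature>
"""

# Assumed rule set: every member needs these two attributes.
REQUIRED = ("/elevation", "/JetStream/coreSpeed")


def check_wind_symbols(feature):
    """Return a list of human-readable problems found in one feature."""
    problems = []
    members = feature.findall("FeatureCollection/SIGWXFeature")
    if not members:
        problems.append("no member features in FeatureCollection")
    for index, member in enumerate(members, start=1):
        if member.find("Geometry") is None:
            problems.append(f"member {index}: missing Geometry")
        hrefs = [a.get(XLINK_HREF, "") for a in member.findall("Attribute")]
        for suffix in REQUIRED:
            if not any(href.endswith(suffix) for href in hrefs):
                problems.append(f"member {index}: missing attribute {suffix}")
    return problems


problems = check_wind_symbols(ET.fromstring(SAMPLE))
print(problems)
```

This works, but every new soft type needs its own hand-written checks, which is exactly the maintenance cost a schema or generated rules would avoid.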

But there are ways to reduce the need to write complex rules by hand, using some kind of preprocessing. For example, based on the SIGWXFeature description above, we can deal with its attributes:

a) XML-database mapping. I have found a way to map attributes to a record set in PostgreSQL (though still using XPath expressions):

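drop table if exists xml_temp;
create table xml_temp(val xml);
insert into xml_temp(val) values ('
<Attribute xlink:href="http://codes.wmo.int/.../elevation">FL350</Attribute>
<Attribute xlink:href="http://codes.wmo.int/.../elevation/LowerBound">FL330</Attribute>
<Attribute xlink:href="http://codes.wmo.int/.../elevation/UpperBound">FL370</Attribute>
<Attribute xlink:href="http://codes.wmo.int/.../JetStream/coreSpeed">100</Attribute>
');

select
  (xpath('//Attribute/@*[name()="xlink:href"]', val)) as "link",
  (xpath('//Attribute/@*[name()="someProperty"]', val)) as "someProperty",
  (xpath('//Attribute/text()', val)) as "value"
from xml_temp;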

There are plenty of XML-oriented database engines, both open source and proprietary. Most RDBMSs support XML data in some way.
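The mapping in (a) does not depend on PostgreSQL. As a rough sketch of the same idea with only the Python standard library, the attributes can be parsed with ElementTree and loaded into an in-memory SQLite table. The `SIGWXFeature` wrapper with an `xmlns:xlink` declaration, the table name `attribute`, and the column names are assumptions made for this example.

```python
import sqlite3
import xml.etree.ElementTree as ET

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Attributes taken from the windSymbols example; the wrapper is assumed.
DOC = """<SIGWXFeature xmlns:xlink="http://www.w3.org/1999/xlink">
  <Attribute xlink:href="http://codes.wmo.int/.../elevation">FL350</Attribute>
  <Attribute xlink:href="http://codes.wmo.int/.../elevation/LowerBound">FL330</Attribute>
  <Attribute xlink:href="http://codes.wmo.int/.../elevation/UpperBound">FL370</Attribute>
  <Attribute xlink:href="http://codes.wmo.int/.../JetStream/coreSpeed">100</Attribute>
</SIGWXFeature>"""

# One (link, value) row per Attribute element.
rows = [
    (attr.get(XLINK_HREF), (attr.text or "").strip())
    for attr in ET.fromstring(DOC).iter("Attribute")
]

conn = sqlite3.connect(":memory:")
conn.execute("create table attribute (link text, value text)")
conn.executemany("insert into attribute values (?, ?)", rows)

# Once in a table, properties can be queried by their code-list URI.
speed = conn.execute(
    "select value from attribute where link like ?",
    ("%/JetStream/coreSpeed",),
).fetchone()[0]
print(speed)
```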

b) Using a regular-expression engine, we can obtain a token=value map:

\<Attribute xlink:href="(?:.+)\/(?'token'.+)"\>(?'value'.+)(?=\<\/Attribute\>)

Test here: https://regex101.com/r/uV7s1D/1
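For reference, a sketch of the same pattern in Python. The `(?'name'...)` group syntax is PCRE-style and is not accepted by Python's `re` module, so the groups are written as `(?P<name>...)` here; the input is the set of attributes from the windSymbols example, one element per line. The pattern relies on `.` not matching newlines, so it would misbehave if several elements shared one line.

```python
import re

# Same pattern as above, with Python-style named groups.
PATTERN = re.compile(
    r'<Attribute xlink:href="(?:.+)/(?P<token>.+)">(?P<value>.+)(?=</Attribute>)'
)

TEXT = """\
<Attribute xlink:href="http://codes.wmo.int/.../elevation">FL350</Attribute>
<Attribute xlink:href="http://codes.wmo.int/.../elevation/LowerBound">FL330</Attribute>
<Attribute xlink:href="http://codes.wmo.int/.../elevation/UpperBound">FL370</Attribute>
<Attribute xlink:href="http://codes.wmo.int/.../JetStream/coreSpeed">100</Attribute>
"""

# The token is the last path segment of the href; later duplicates would overwrite.
token_map = {m.group("token"): m.group("value") for m in PATTERN.finditer(TEXT)}
print(token_map)
```

Because only the last path segment is kept, `elevation/LowerBound` becomes just `LowerBound`, which loses the link to its parent property.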

Either way, we either stick with the well-known (for us :)) xsd+xml+schematron stack or create quite new tools.

Happy to discuss it further.

blchoy commented 4 years ago

Thanks @moryakovdv. I think this is the right place to start the debate without causing too much distraction in wmo-im/iwxxm.

To facilitate further discussion, I will start developing some UML models based on the form of schemas discussed so far and see what can be extracted from the associated XMI files. Then we could talk about the UML-to-target-code transformation script (likely XSLT) together with the target code you mentioned above.

blchoy commented 4 years ago

During the development of the WAFC SIGWX model (now available at https://github.com/blchoy/iwxxm-testbedOne/tree/WxObjects), I noticed that hard-typing and virtual-typing (used in the OMXML implementation and also in iwxxm:extension) are still useful for describing the overall structure of the "container" of meteorological information.

Soft-typing is most useful in describing properties of MET objects as they can be adequately represented by just one (e.g. pressure) or two (e.g. wind) or multiple (e.g. wind symbol) simple types.

Previously we developed METCE with a view to making it a source of MET features to be used in other application schemas. Currently we have hard-typed tropical cyclone and volcano features in METCE, but we are already seeing issues with their maintenance and evolution. Maybe it is more appropriate to move them to a soft-typed representation and make a soft-typed METCE with the same technology used by the Codes Registry?