one-data-model / language

(Old repo:) Simple Definition Format (SDF) for One Data Model definitions

Design a scheme for ODM mapping files #86

Open mjkoster opened 4 years ago

mjkoster commented 4 years ago

We resolved to move a number of ecosystem- and vendor-specific features into a sort of purgatory (not derogatory) called "mapping files".

The idea is for mapping files to contain the qualities that don't change the basic functionality of the model, especially things like numeric IDs that exist mainly for payload efficiency.

We wouldn't require a uniform scheme across SDOs/vendors, but may eventually decide on a canonical ODM scheme. For now, we resolved to use the URI as the canonical ODM identifier.

The things included are protocol binding information like:

The format is essentially a map from URI to ID number, where the URI can include the ODM namespace and a JSON pointer to the specific definition or declaration.

e.g.

{
  "id": {
    "wdl:/#odmObject/thermostat": 227,
    "wdl:/#odmProperty/currentTemperature": 291,
    "wdl:/#odmAction/setCoolingSetpoint": 301
  }
}

akeranen commented 4 years ago

Yes, I have actually already written a piece of code that can do this for IPSO models. Currently the output looks like this:

{
  "info": {
    "title": "IPSO ID mapping"
  },
  "map": {
    "#/Generic_Sensor": 3300,
    "#/Generic_Sensor/Sensor_Value": 5700,
    "#/Generic_Sensor/Sensor_Units": 5701,
    "#/Generic_Sensor/Min_Measured_Value": 5601,
...

But namespace support would indeed be useful for supporting multiple mappings per file. We may also want the mapping file to carry more than just ID mappings (turning it into a "protocol binding config file"). Something like:

 "map": {
    "#/Generic_Sensor": {"ID": 3300, "version": "1.1", "oma:multi-instance": true},
    "#/Generic_Sensor/Sensor_Value": {"ID": 5700}
...

So, there could be common keywords like ID and version, and ecosystem-specific keywords like oma:multi-instance.
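A minimal sketch of how tooling might read both forms of the "map" object (bare numeric IDs and objects with keywords). The set of common keywords and the rule that a colon marks an ecosystem-specific keyword are assumptions drawn only from the examples above, not an agreed format; the function names are hypothetical.

```python
import json

# Assumption: "ID" and "version" are the common keywords, as in the example above.
COMMON_KEYWORDS = {"ID", "version"}


def normalize_entry(value):
    """Turn a bare numeric ID into the object form; copy objects unchanged."""
    if isinstance(value, int):
        return {"ID": value}
    return dict(value)


def split_keywords(entry):
    """Separate common keywords from ecosystem-prefixed ones (e.g. "oma:...")."""
    common = {k: v for k, v in entry.items() if k in COMMON_KEYWORDS}
    ecosystem = {k: v for k, v in entry.items() if ":" in k}
    return common, ecosystem


mapping_text = """
{
  "map": {
    "#/Generic_Sensor": {"ID": 3300, "version": "1.1", "oma:multi-instance": true},
    "#/Generic_Sensor/Sensor_Value": 5700
  }
}
"""

mapping = {
    pointer: normalize_entry(value)
    for pointer, value in json.loads(mapping_text)["map"].items()
}
common, ecosystem = split_keywords(mapping["#/Generic_Sensor"])
```

Normalizing first means the rest of the tooling only has to deal with one shape, whichever form the mapping file used.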

gerickson commented 4 years ago

This doesn't seem like a scalable solution unless it can be distributed. In the case of WDL, we would have a mapping file that was / is as big as the entirety of our body of schema itself—completely untenable and unmaintainable.

/cc/ @mrjerryjohns

akeranen commented 4 years ago

We could also allow the mapping to live in the same file as the rest of the model definition, but separate files are useful too: they let us keep clean common model files that can be annotated with an ecosystem mapping file.

gerickson commented 4 years ago

I'd welcome input from others working on pressure testing; however, this seems like it stops our progress in trying to effect the mechanical WDL to SDF and back to WDL round trip since URL/Is don't exist in WDL relative to numbers not existing in other ecosystems.

/cc/ @mrjerryjohns

akeranen commented 4 years ago

@gerickson note that the keys of the example map object are not any special URIs but JSON pointers referring to names of the SDF properties, actions, etc.
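To illustrate the point, here is a rough sketch of resolving such keys as JSON pointers (RFC 6901, in URI fragment form) against a model document. The model layout below is a simplified, hypothetical nesting that just mirrors the example keys; it is not the actual SDF structure, and `resolve_pointer` is an illustrative helper, not part of any ODM tooling.

```python
from urllib.parse import unquote


def resolve_pointer(document, pointer):
    """Resolve a JSON pointer, optionally in "#/..." fragment form."""
    if pointer.startswith("#"):
        pointer = unquote(pointer[1:])  # fragment form is percent-encoded
    if pointer == "":
        return document
    if not pointer.startswith("/"):
        raise ValueError(f"not a JSON pointer: {pointer!r}")
    node = document
    for token in pointer[1:].split("/"):
        # RFC 6901 escapes: "~1" is "/", "~0" is "~" (order matters)
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


# Hypothetical model, shaped only to match the example map keys.
model = {
    "Generic_Sensor": {
        "Sensor_Value": {"type": "number"},
        "Sensor_Units": {"type": "string"},
    }
}

id_map = {
    "#/Generic_Sensor": 3300,
    "#/Generic_Sensor/Sensor_Value": 5700,
    "#/Generic_Sensor/No_Such_Item": 9999,
}

# Collect map keys that do not point at anything in the model.
unresolved = []
for pointer in id_map:
    try:
        resolve_pointer(model, pointer)
    except (KeyError, IndexError, ValueError):
        unresolved.append(pointer)
```

A check like this lets a tool flag mapping entries that have drifted out of sync with the model they annotate.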

mrjerryjohns commented 4 years ago

Agreed with Grant. This will prevent us from doing 'schema-less', data-only translation between the various ecosystems, which I thought was a core objective of this group.

On Mon, Oct 14, 2019 at 12:31 PM Ari Keränen notifications@github.com wrote:

@gerickson https://github.com/gerickson note that the keys of the example map object are not URIs but JSON pointers referring to names of the SDF properties, actions, etc.


akeranen commented 4 years ago

The ipso-odm code now supports generating ID mapping files: https://github.com/EricssonResearch/ipso-odm#ipso-id-mapper

With small tweaks, the same code should also work for other XML-based schemas.

mjkoster commented 4 years ago

I would like to propose making "id" an optional quality that can be defined in an ecosystem-specific way. It's a simple step to annotate/extract between embedded IDs and mapping files, so why not allow the ID to be embedded in the file as an optional quality at the location indicated by the mapping file? The end semantics are exactly the same.

WAvdBeek commented 4 years ago

Michael, which ID should be used in the file? That was the issue we started out from.

mjkoster commented 4 years ago

I am proposing that ID is optional and the format is defined by the ecosystem as a protocol binding hint. It would be a convenience for tooling to keep the ID embedded in the file, but otherwise the definition is the same as with mapping files. ODM tooling would not process ID but would allow it to be present for other tooling to process. The default namespace could provide a definition for ID when used in that namespace.

WAvdBeek commented 4 years ago

Not sure about that, since the ID is then tied to the original copyright of the file. If one uses it for another ecosystem, the ID should be translated to that ecosystem. I am not against having the original ID in the file, but I do not think it is that useful.

mjkoster commented 4 years ago

As an ecosystem vendor, I can use SDF and OneDM tooling to manage my ecosystem-specific definitions, and I can embed ecosystem-specific protocol hints into the definition files. The OneDM tooling ignores these hints, but the ecosystem-specific tooling uses them to create "instances" in the target ecosystem.

If I then want to upload these to OneDM for broad use outside my ecosystem, at that point it doesn't matter whether I include a mapping file or leave the hints embedded in the code. There could be a OneDM tool to strip/insert them to and from mapping files.

If I want to import another contributed definition into my ecosystem, I will need to create a set of protocol hints according to my protocol binding scheme and embed them into my ecosystem-specific versions of the definition files (or make mapping files...).

I would definitely expect the embedded protocol information like ID numbers and corresponding mapping files to be losslessly round-trippable.
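As a rough sketch of what such a strip/insert tool could look like, assuming the embedded quality is a key named "id" on nested JSON objects and that definition names contain no "/" or "~" (so no pointer escaping is needed). The function names, the model layout, and the "type" fields are hypothetical and only serve the illustration.

```python
import copy

ID_KEY = "id"  # assumed name of the embedded, ecosystem-specific quality


def strip_ids(model, pointer="#"):
    """Return (model without IDs, mapping from JSON pointer to ID)."""
    clean, id_map = {}, {}
    for key, value in model.items():
        if key == ID_KEY:
            id_map[pointer] = value
        elif isinstance(value, dict):
            child_clean, child_map = strip_ids(value, f"{pointer}/{key}")
            clean[key] = child_clean
            id_map.update(child_map)
        else:
            clean[key] = value
    return clean, id_map


def insert_ids(clean, id_map):
    """Re-embed IDs from a mapping into a copy of the clean model."""
    model = copy.deepcopy(clean)
    for pointer, id_value in id_map.items():
        node = model
        # "#/a/b" -> ["a", "b"]; "#" alone addresses the root
        for token in pointer.lstrip("#").split("/")[1:]:
            node = node[token]
        node[ID_KEY] = id_value
    return model


model = {
    "Generic_Sensor": {
        "id": 3300,
        "Sensor_Value": {"id": 5700, "type": "number"},
        "Sensor_Units": {"id": 5701, "type": "string"},
    }
}

clean, id_map = strip_ids(model)
restored = insert_ids(clean, id_map)
```

Under these assumptions, stripping and re-inserting yields a model equal to the original, which is the lossless round trip described above.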