eclipse-esmf / esmf-sdk

Load Aspect Models and their artifacts as Java code; share components to realize SAMM as code
https://eclipse-esmf.github.io/esmf-developer-guide/index.html
Mozilla Public License 2.0

[Task] Add descriptive auto generated JSON payloads #491

Open matbmoser opened 8 months ago

matbmoser commented 8 months ago

Introduction to Problem

The modeler is really useful for creating semantic data models and other kinds of information organized as a tree. The problem arises when we handle the generated JSON payloads and want information such as "description", "name" and other metadata to travel with the JSON payload.

This can be useful when no internet connection is available and the only source of information is the "generated json payload", which contains only keys, some of which have no specific business-domain meaning.

For an application, parsing a recursive JSON structure can make the information easier to access, store, transfer and handle. (This would also preserve the semantics and the structure defined in the standardization.)

In this way we would not need to include a second aspect just to contain the unit. The unit would come as a property of the data attribute node.

Therefore there needs to be a solution for the problem.

Solution Proposal

"Generate a json payload with a node semantic structure"

Create a node structure that can hold more descriptive information, which travels together with the data and is much easier to handle in applications than being forced to parse the Turtle file or to use Open Spec APIs that require HTTP requests (internet access).

In my understanding, it would be good to have a JSON structure built from a generic node that includes "name", "description" and other attributes. It would also be important to be able to search for these attributes, which is why each node carries a "ref" with the path to that node in the tree.

This would be my proposal:

image

I even created a Python script that can transform any JSON into a generic tree structure. It can be found here: https://github.com/matbmoser/dpp-generator

Extra Features

Additional context

In the Digital Product Pass application from Catena-X we were facing some issues when working with just the "generated json payload". We don't want to include the Open Spec API in our application, so a solution like this would be very valuable: it would allow payloads generated from standards to keep their original names while still containing the necessary data and simple parsing keys.

Examples:

Actual Payload [Bom As Built]

This model, generated using the tool, gives this JSON payload:

https://github.com/eclipse-tractusx/sldt-semantic-models/blob/main/io.catenax.single_level_bom_as_built/2.0.0/gen/SingleLevelBomAsBuilt.json


{
    "catenaXId": "urn:uuid:055c1128-0375-47c8-98de-7cf802c3241d",
    "childItems": [
        {
            "catenaXId": "urn:uuid:055c1128-0375-47c8-98de-7cf802c3241d",
            "quantity": {
                "quantityNumber": 2.5,
                "measurementUnit": "unit:litre"
            },
            "hasAlternatives": true,
            "createdOn": "2022-02-03T14:48:54.709Z",
            "businessPartner": "BPNL50096894aNXY",
            "lastModifiedOn": "2022-02-03T14:48:54.709Z"
        }
    ]
}

Descriptive Auto Generated Payload [BomAsBuilt]

Using the script proposed:

{
  "id": "urn:samm:io.catenax.single_level_bom_as_built:2.0.0#BomAsBuilt",
  "label": "Bom As Built Aspect",
  "description": "The Bill of Materials Aspect in lifecycle phase as Built",
  "type": {
    "unit": null,
    "datatype": "object"
  },
  "ref": "/",
  "audit": {
    "created": 1702471917.772977,
    "createdBy": "user1",
    "updated": 1702471917.772977,
    "updatedBy": "user1"
  },
  "data": {
    "catenaXId": {
      "id": "catenaXId",
      "label": "Catena-X Id",
      "description": "The Catena-X represents an unique Id in all the AAS and identifies the asset globally. Also may be called globalAssetId",
      "type": {
        "unit": null,
        "datatype": "string"
      },
      "ref": "/catenaXId",
      "audit": {
        "created": 1702471917.772977,
        "createdBy": "user1",
        "updated": 1702471917.772977,
        "updatedBy": "user1"
      },
      "data": "urn:uuid:055c1128-0375-47c8-98de-7cf802c3241d"
    },
    "childItems": {
      "id": "childItems",
      "label": "Child Items",
      "description": "The list child items related to the AAS digital twin. Allowing the drill down of components",
      "type": {
        "unit": null,
        "datatype": "array"
      },
      "ref": "/childItems",
      "audit": {
        "created": 1702471917.772977,
        "createdBy": "user1",
        "updated": 1702471917.772977,
        "updatedBy": "user1"
      },
      "data": {
        "0": {
          "id": "0",
          "label": "First child aspect",
          "description": "",
          "type": {
            "unit": null,
            "datatype": "object"
          },
          "ref": "/childItems/0",
          "audit": {
            "created": 1702471917.772977,
            "createdBy": "user1",
            "updated": 1702471917.772977,
            "updatedBy": "user1"
          },
          "data": {
            "catenaXId": {
              "id": "catenaXId",
              "label": "Catena-X Id",
              "description": "The Catena-X represents an unique Id in all the AAS and identifies the asset globally. Also may be called globalAssetId",
              "type": {
                "unit": null,
                "datatype": "string"
              },
              "ref": "/childItems/0/catenaXId",
              "audit": {
                "created": 1702471917.772977,
                "createdBy": "user1",
                "updated": 1702471917.772977,
                "updatedBy": "user1"
              },
              "data": "urn:uuid:055c1128-0375-47c8-98de-7cf802c3241d"
            },
            "quantity": {
              "id": "quantity",
              "label": "Quantity",
              "description": "Quantity of elements produced in the lifecycle",
              "type": {
                "unit": null,
                "datatype": "object"
              },
              "ref": "/childItems/0/quantity",
              "audit": {
                "created": 1702471917.772977,
                "createdBy": "user1",
                "updated": 1702471917.772977,
                "updatedBy": "user1"
              },
              "data": {
                "quantityNumber": {
                  "id": "quantityNumber",
                  "label": "Quantity Number",
                  "description": "The actual value of the quantity produced in the lifecycle",
                  "type": {
                    "unit": "unit:litre",
                    "datatype": "float"
                  },
                  "ref": "/childItems/0/quantity/quantityNumber",
                  "audit": {
                    "created": 1702471917.772977,
                    "createdBy": "user1",
                    "updated": 1702471917.772977,
                    "updatedBy": "user1"
                  },
                  "data": 2.5
                },
                "measurementUnit": {
                  "id": "measurementUnit",
                  "label": "Measurement Unit",
                  "description": "The unit used in the measurament",
                  "type": {
                    "unit": null,
                    "datatype": "string"
                  },
                  "ref": "/childItems/0/quantity/measurementUnit",
                  "audit": {
                    "created": 1702471917.772977,
                    "createdBy": "user1",
                    "updated": 1702471917.772977,
                    "updatedBy": "user1"
                  },
                  "data": "unit:litre"
                }
              }
            },
            "hasAlternatives": {
              "id": "hasAlternatives",
              "label": "Has Alternatives",
              "description": "Indicates if the digital twin produced has alternatives",
              "type": {
                "unit": null,
                "datatype": "boolean"
              },
              "ref": "/childItems/0/hasAlternatives",
              "audit": {
                "created": 1702471917.772977,
                "createdBy": "user1",
                "updated": 1702471917.772977,
                "updatedBy": "user1"
              },
              "data": true
            },
            "createdOn": {
              "id": "createdOn",
              "label": "Created On",
              "description": "Data Time of AAS Creation",
              "type": {
                "unit": null,
                "datatype": "string"
              },
              "ref": "/childItems/0/createdOn",
              "audit": {
                "created": 1702471917.772977,
                "createdBy": "user1",
                "updated": 1702471917.772977,
                "updatedBy": "user1"
              },
              "data": "2022-02-03T14:48:54.709Z"
            },
            "businessPartner": {
              "id": "businessPartner",
              "label": "Business Partner Number",
              "description": "Catena-X Unique Business Partner Number identifier",
              "type": {
                "unit": null,
                "datatype": "string"
              },
              "ref": "/childItems/0/businessPartner",
              "audit": {
                "created": 1702471917.772977,
                "createdBy": "user1",
                "updated": 1702471917.772977,
                "updatedBy": "user1"
              },
              "data": "BPNL50096894aNXY"
            },
            "lastModifiedOn": {
              "id": "lastModifiedOn",
              "label": "Last Modified On",
              "description": "Date of last modification",
              "type": {
                "unit": null,
                "datatype": "string"
              },
              "ref": "/childItems/0/lastModifiedOn",
              "audit": {
                "created": 1702471917.772977,
                "createdBy": "user1",
                "updated": 1702471917.772977,
                "updatedBy": "user1"
              },
              "data": "2022-02-03T14:48:54.709Z"
            }
          }
        }
      }
    }
  }
}
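
The transformation from the plain payload to the node structure above can be sketched in a few lines. This is a minimal sketch, not the linked dpp-generator script: the function names `datatype_of`, `to_node` and `resolve` are hypothetical, the "integer" datatype name is an assumption (the example only shows "float"), and "label", "description", "unit" and "audit" are left empty or omitted because they would have to come from the Aspect Model rather than from the payload.

```python
from typing import Any


def datatype_of(value: Any) -> str:
    """Name the JSON datatype of a value, using the names from the example above."""
    if isinstance(value, bool):  # test bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        return "array"
    if isinstance(value, int):
        return "integer"  # assumption: the example payload only shows "float"
    if isinstance(value, float):
        return "float"
    return "string"  # also the fallback for null values in this sketch


def to_node(key: str, value: Any, ref: str = "/") -> dict:
    """Wrap a plain JSON value in a descriptive node, recursing into children."""
    if isinstance(value, dict):
        children = list(value.items())
    elif isinstance(value, list):
        # Array items are keyed by their index, as in the example ("0", "1", ...).
        children = [(str(index), item) for index, item in enumerate(value)]
    else:
        children = None

    if children is None:
        data = value  # leaf: keep the raw value
    else:
        base = ref.rstrip("/")  # the root ref "/" becomes "" so paths start with one "/"
        data = {k: to_node(k, v, f"{base}/{k}") for k, v in children}

    return {
        "id": key,
        "label": "",        # would be filled from the Aspect Model
        "description": "",  # would be filled from the Aspect Model
        "type": {"unit": None, "datatype": datatype_of(value)},
        "ref": ref,
        "data": data,
    }


def resolve(root: dict, ref: str) -> dict:
    """Follow a "ref" path such as "/childItems/0/quantity" from the root node."""
    node = root
    for part in ref.strip("/").split("/"):
        if part:  # the root ref "/" yields one empty part, which is skipped
            node = node["data"][part]
    return node
```

Calling `to_node("BomAsBuilt", payload)` on the plain payload and then `resolve(root, "/childItems/0/quantity/quantityNumber")` returns the node whose "data" is the quantity value, which is the kind of path-based lookup the "ref" attribute is meant to support.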

BirgitBoss commented 7 months ago

When it comes to adding meta information to the JSON payload itself, we should use the Asset Administration Shell (AAS) specification for the "Normal" payload.

So far, only the "ValueOnly" payload of the AAS is supported for SAMM. The SDK should also support the others (look for "SerializationModifier", chapter 11.2 in Part 2).

image

Chapter 11.4.2 shows an example of the difference between the ValueOnly format and Normal.

Note: the aspect model would be referenced as semanticId in the Submodel. Normally the description of the concept description (= aspect model) is not copied to the description of the element itself (but this would be possible if needed).

image

atextor commented 7 months ago

Hi @matbmoser, thank you for the proposal and the extensive description. The JSON payload structure that corresponds to an Aspect Model is described in the Payloads section. Unfortunately, the reasoning behind this structure is currently not well described; there is an issue to properly document this: https://github.com/eclipse-esmf/esmf-semantic-aspect-meta-model/issues/277.

Let me summarize the reasoning. SAMM builds on the abstraction of the OMG MOF (Object Management Group Meta Object Facility, i.e., ISO/IEC 19502:2005 / ISO/IEC 19508:2014) that clearly distinguishes the four meta model levels M0, M1, M2 and M3. M0 means instance data, M1 is the model level, M2 meta model level and M3 is the meta meta model. The following image should clarify this in the context of SAMM and Aspect Models:

image

Formats such as OWL and in particular, JSON-LD, intentionally or accidentally, lead to documents that mix up M0 and M1 levels. This can lead to various problems, especially when M0 and M1 are provided by different parties. SAMM therefore clearly separates between those meta model levels (note: there are exceptions, more on that later¹). This is the approach that is for example also taken by JSON Schema (i.e., JSON documents don't duplicate information that is part of the corresponding JSON Schema). SAMM itself corresponds to the M2 level - it's the meta model that describes the M1 level - the Aspect Models. Aspect Models semantically and structurally describe the data on the M0 level.

The approach for making use of an Aspect's M0/JSON data is explicitly to incorporate the corresponding Aspect Model. In this respect, it's not much different to what JSON-LD does, or what is proposed in this issue: You want to have the data associated with the model to make use of both at the same time - at some point. However, the approach of how the client, who's using the data, gets hold of the information in the model is different. We assume that the client has access to the Aspect Model it wants to work with anyways - this could be the CatenaX Semantic Hub, but it could also just be bundled with the client application. Therefore, making model information part of the JSON payload is redundant.

Explicitly separating M0 and M1 and having the Aspect return only the M0 data has several benefits:

  1. It allows semantically describing existing data and APIs without making it necessary to change their data model/structure, implementation or API. In many cases this is not possible, for example in transitional periods (APIs are in the process of being semantically described), APIs provided by third parties, or with data at rest (data in an existing data storage).
  2. Neither the client author nor the aspect (server) author are forced to use Aspect Models. It allows flexible combination of semantic clients and semantic APIs (i.e., making use of Aspect Models to describe data) with plain clients and plain APIs (speaking plain JSON without any semantic description). All of the following scenarios are possible:
    1. Semantic client consumes data from semantic API - both agree on the same Aspect Model
    2. Semantic client consumes data from plain API - only the client knows about the Aspect Model
    3. Plain client consumes data from semantic API - only the Aspect knows about the Aspect Model
    4. Combinations thereof (e.g. both plain and semantic clients consume data from the same Aspect)
  3. It allows mapping multiple versions of an Aspect Model to the same M0 data, as long as the data structures are identical. For example, samm:descriptions in additional languages, or more thorough descriptions of value ranges using samm-c:Enumeration could be added to the model and trigger a new version release of the model, but this does not change the data. Details are described in Model Evolution.
  4. It retains the flexibility to add all the necessary meta data that you need on the consumer side as regular model elements, without the meta model hardcoding or dictating exactly which attributes you need.²
  5. It removes redundancy and reduces data sizes of payloads. This is arguably a weak argument, but it's nevertheless a positive side effect.
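
To illustrate the separation, here is a minimal sketch of a semantic client that keeps model information (M1) apart from the payload (M0) and joins the two only when it needs to display the data. Everything in it is an assumption for illustration: `MODEL_INFO` is a hypothetical lookup table bundled with the client (in practice it would be derived from the Aspect Model), the labels are taken from the proposal earlier in this issue, and `walk` and `describe` are not part of any SDK.

```python
from typing import Any, Iterator, List, Tuple

# Hypothetical M1 information bundled with the client, keyed by property path.
# In a real client this would be derived from the Aspect Model, not hand-written.
MODEL_INFO = {
    "catenaXId": {"label": "Catena-X Id"},
    "childItems.quantity.quantityNumber": {"label": "Quantity Number"},
}


def walk(value: Any, path: Tuple[str, ...] = ()) -> Iterator[Tuple[str, Any]]:
    """Yield (model_path, value) for every leaf of a plain M0 payload."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from walk(child, path + (key,))
    elif isinstance(value, list):
        # List indices are instance-level detail, so they are not part of the model path.
        for item in value:
            yield from walk(item, path)
    else:
        yield ".".join(path), value


def describe(payload: dict) -> List[Tuple[str, Any]]:
    """Pair each leaf value with its label from the model, falling back to the path."""
    return [
        (MODEL_INFO.get(path, {}).get("label", path), value)
        for path, value in walk(payload)
    ]
```

The payload stays exactly as the Aspect delivers it. A plain client can ignore `MODEL_INFO` entirely, and a new model version with additional descriptions only changes the lookup table, not the data.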

Then, how is a semantic client (more specifically: the developer of a client) supposed to make use of the Aspect Model if the model information is not part of the runtime data? There are multiple possible approaches here:

Additional notes:

¹: The Unit Reference Characteristic kind of permeates the M0-M1 boundary, by referring to a model level element (a unit) from runtime data. However, this is not critical, since for all practical purposes, the unit catalog can be considered one big enumeration. All additional information (such as the unit's symbol, description or ISO code) is still not included in the runtime data.

²: Since you seem to be interested in adding audit/provenance meta data into your data, my suggestion is to express this in terms of regular SAMM Properties, Entities and Characteristics. In my opinion, this is a regular use case of domain data that you want to express, in other words: Just because your use case requires this, not every use case does, so it does not make sense to change the model-to-data mapping for this use-case-specific (or domain-specific if you will) requirement. When you want to include this "necessary data", make it part of your Aspect Model. Note that it would make sense to express such kinds of model elements in a shared namespace and re-use them in multiple Aspect Models (if you have the requirement for multiple Aspect Models with that meta data, that is).

Due to these reasons, I'd opt against changing the current JSON mapping. Please let me know what you think.

matbmoser commented 6 months ago

Thank you very much for the answer! I now understand well how it works.

I am glad that you took my proposal seriously. I hope it shed some light on the problem and helped you understand what I meant.

Have a nice day!

Best Regards,

Mathias Moser