admin-shell-io / aas-specs-api

Repository of the Asset Administration Shell Specification DTA-01002 API

https://admin-shell-io.github.io/aas-specs-antora/index/home/index.html

Creative Commons Attribution 4.0 International

REST API Fragments to access information from embedded files following standards (AML, XML, ZIP) #286

Open BFrKUKA opened 4 months ago

BFrKUKA commented 4 months ago

What is missing?

It shall be possible for a REST client to access data in files embedded in AAS Submodels, to avoid duplicate modelling, duplicate data storage, and inconsistency.

How should it be fixed?

See attached presentation

2024-05-13-AASX_REST_API_Fragments_v3.pptx

Supported by Matthias Freund, Arndt Lueder and Andreas Orzelski.

sebbader-sap commented 4 months ago

I am trying to create a minimal example:

The Submodel "example" has one SubmodelElement of type File with idShort="example-file".
This SubmodelElement links(!) to an XML document that contains an XML object "ex:object".
We want to link to this object directly, e.g. with GET /submodels/ZXhhbXBsZQ/submodel-elements/example-file<some-fragment-separator>ex:object

Note: "ZXhhbXBsZQ" is the base64url encoded Submodel/id value ("example")
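
The encoding in the note can be reproduced with a short Python sketch. The helper names encode_identifier and submodel_element_path are hypothetical and only illustrate how the path in the example is assembled; stripping the "=" padding is an assumption made here because the value quoted above carries none.

```python
import base64


def encode_identifier(identifier: str) -> str:
    """base64url-encode an identifier and drop the trailing '=' padding."""
    raw = base64.urlsafe_b64encode(identifier.encode("utf-8"))
    return raw.decode("ascii").rstrip("=")


def submodel_element_path(submodel_id: str, id_short_path: str) -> str:
    """Build the request path for a SubmodelElement (hypothetical helper)."""
    return f"/submodels/{encode_identifier(submodel_id)}/submodel-elements/{id_short_path}"


print(encode_identifier("example"))
# ZXhhbXBsZQ
print(submodel_element_path("example", "example-file"))
# /submodels/ZXhhbXBsZQ/submodel-elements/example-file
```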

sebbader-sap commented 4 months ago

Question 1: Why not use the URI fragment approach, i.e. something like GET /submodels/ZXhhbXBsZQ/submodel-elements/example-file#ex:object?

To be clarified: '#' is not an allowed character in idShorts/idShortPaths. Is it nevertheless possible to construct a Submodel for which this proposal would be ambiguous? Related to this: a fragment extension must be unambiguous at all times, meaning that a server must always know for certain whether the client is requesting a fragment or a SubmodelElement.

See also: https://example.com/my-file#my-section, as used e.g. in GitHub Docs
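
A point that may bear on this question: under the generic URI syntax (RFC 3986), the fragment is interpreted by the client and is not part of the HTTP request target, so a server would normally never see what follows '#'. The sketch below only shows how Python's standard library splits such a URL; the host example.com is a placeholder.

```python
from urllib.parse import urlsplit

url = "https://example.com/submodels/ZXhhbXBsZQ/submodel-elements/example-file#ex:object"
parts = urlsplit(url)

# What an HTTP client would put on the request line: path (plus query), no fragment.
request_target = parts.path + (f"?{parts.query}" if parts.query else "")

print(request_target)
# /submodels/ZXhhbXBsZQ/submodel-elements/example-file
print(parts.fragment)
# ex:object
```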

sebbader-sap commented 4 months ago

Question 2: Type-safety: How can the client know in advance which data type is returned?

sebbader-sap commented 4 months ago

Question 3: For a SubmodelElement File, the server may only know the link but have no access to the resource document itself. How should this kind of scenario be handled?

sebbader-sap commented 4 months ago

Question 4: Given that a repository may contain many (millions of) submodels, accessing the content of files seems to be a very expensive operation, both at runtime and for the developers of the repository service. Is the added value worth this?

sebbader-sap commented 4 months ago

Question 5: The PowerPoint suggests even more sophisticated patterns, e.g., XPath. It is very hard to estimate the computational complexity of such a statement in advance. How can a server protect itself if such patterns are allowed?

sebbader-sap commented 4 months ago

Question 6: Serialisation Modifier: Please have a look at V3 of the API specification. Modifiers are no longer defined as query parameters but are appended to the end of the URI path (?content=path --> .../$path). I am not sure whether this can cause issues.
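
To illustrate the path-based style, here is a minimal Python sketch of how a server might separate a trailing modifier from the rest of the path. The function split_modifier is hypothetical, and only the $path modifier is taken from this thread; treating any final segment that starts with '$' as a modifier is an assumption of the sketch, not something the specification is quoted on here.

```python
from typing import Optional, Tuple


def split_modifier(path: str) -> Tuple[str, Optional[str]]:
    """Split a trailing '$modifier' segment off a request path (hypothetical helper)."""
    head, _, last = path.rstrip("/").rpartition("/")
    # Assumption: a final segment that starts with '$' names a modifier.
    if last.startswith("$") and len(last) > 1:
        return head, last[1:]
    return path, None


print(split_modifier("/submodels/ZXhhbXBsZQ/submodel-elements/example-file/$path"))
# ('/submodels/ZXhhbXBsZQ/submodel-elements/example-file', 'path')
```

A fragment separator would have to coexist with this trailing segment, which is presumably where issues could arise.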

de-ich commented 4 months ago

Question 1: Why not use the URI fragment approach, i.e. something like GET /submodels/ZXhhbXBsZQ/submodel-elements/example-file#ex:object?

To be clarified: '#' is not an allowed character in idShorts/idShortPaths. Is it nevertheless possible to construct a Submodel for which this proposal would be ambiguous? Related to this: a fragment extension must be unambiguous at all times, meaning that a server must always know for certain whether the client is requesting a fragment or a SubmodelElement.

See also: https://example.com/my-file#my-section, as used e.g. in GitHub Docs

One problem I see here is that it is not clear which syntax the URI fragment uses. The idea behind the fragments is that they can be used to retrieve information from different underlying technologies (AML, XML, ...), where each technology may specify its own syntax for pointing to elements within the file (e.g. CAEX paths for AML, XPath for XML, ...). For the parser to be able to distinguish between the technologies, it makes sense to state the fragment type explicitly. This is why we explicitly included the fragment type in the API.
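
The idea of an explicit fragment type can be sketched as a small dispatch table in Python. Everything here is hypothetical: the names RESOLVERS, resolve_xml and resolve_fragment, the key "xml", and the namespace URI are illustrative assumptions and are not taken from the presentation or the prototype. Only an XML resolver is shown; an AML resolver based on CAEX paths would be registered in the same way.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict

Resolver = Callable[[bytes, str], ET.Element]

# Assumed prefix-to-namespace mapping for the example document.
NAMESPACES = {"ex": "http://example.com/ns"}


def resolve_xml(document: bytes, fragment: str) -> ET.Element:
    """Resolve a fragment with ElementTree's limited XPath subset."""
    root = ET.fromstring(document)
    node = root.find(fragment, namespaces=NAMESPACES)
    if node is None:
        raise LookupError(f"no element matches {fragment!r}")
    return node


# One resolver per fragment type; the type tells the server which syntax to expect.
RESOLVERS: Dict[str, Resolver] = {"xml": resolve_xml}


def resolve_fragment(fragment_type: str, document: bytes, fragment: str) -> ET.Element:
    try:
        resolver = RESOLVERS[fragment_type.lower()]
    except KeyError:
        raise ValueError(f"unsupported fragment type: {fragment_type!r}") from None
    return resolver(document, fragment)


DOC = b'<root xmlns:ex="http://example.com/ns"><ex:object>42</ex:object></root>'
print(resolve_fragment("xml", DOC, "ex:object").text)
# 42
```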

de-ich commented 4 months ago

Question 2: Type-safety: How can the client know in advance which data type is returned?

Good point. This is determined by the fragment type specified in the call. If the fragment type is set to AML, I know that I will retrieve an "AML object". What "AML object" means, and which kinds of "AML objects" exist, needs to be specified in the standard for this specific fragment type.

In my opinion, this is comparable with the existing REST API: If I use the request ".../submodel-elements/id-short-path", I know that I will retrieve a SubmodelElement. The response then tells me what kind of SME it returned (Property, SMC, ...).

de-ich commented 4 months ago

Question 3: For a SubmodelElement File, the server may only know the link but have no access to the resource document itself. How should this kind of scenario be handled?

That's true. I would simply handle such cases with specific error codes, e.g. by passing on the error code that the remote server returned when the document was requested.
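
A minimal Python sketch of that pass-through behaviour follows. UpstreamError, get_fragment_source and the injected fetch callable are hypothetical names introduced for illustration, and the fetch function is passed in so that the sketch runs without network access.

```python
from typing import Callable, Tuple


class UpstreamError(Exception):
    """Raised by the fetcher when the remote file cannot be retrieved."""

    def __init__(self, status: int, reason: str) -> None:
        super().__init__(f"{status} {reason}")
        self.status = status
        self.reason = reason


def get_fragment_source(file_url: str, fetch: Callable[[str], bytes]) -> Tuple[int, bytes]:
    """Return (status, body); on failure, pass the remote status code through."""
    try:
        return 200, fetch(file_url)
    except UpstreamError as err:
        message = f"could not retrieve {file_url}: {err.reason}"
        return err.status, message.encode("utf-8")


def failing_fetch(url: str) -> bytes:
    # Stand-in for a remote server that denies access to the linked file.
    raise UpstreamError(403, "Forbidden")


print(get_fragment_source("https://example.com/my-file", failing_fetch)[0])
# 403
```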

de-ich commented 4 months ago

Question 4: Given that a repository may contain many (millions of) submodels, accessing the content of files seems to be a very expensive operation, both at runtime and for the developers of the repository service. Is the added value worth this?

This is certainly something to be discussed. For me, however, being able to access information in this way is a key part of interoperability between different technologies and is therefore certainly worth the implementation effort. In addition, the fragment API can be implemented as its own interface, so that servers can decide whether they need or want to support it, just as with the other existing interfaces (discovery, etc.).

de-ich commented 4 months ago

Question 5: The PowerPoint suggests even more sophisticated patterns, e.g., XPath. It is very hard to estimate the computational complexity of such a statement in advance. How can a server protect itself if such patterns are allowed?

Also a good point, and something that I am not yet 100% sure about. This potentially depends on (1) the size of the file to be loaded (into memory) and (2) the complexity of the query itself. I do not know whether it is possible to estimate this complexity beforehand. However, relying on established standards like AML and XPath should certainly help, as there are already standard implementations that can be used (as I do in my prototypical implementation).
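
One conceivable way to bound the two factors named above is to reject requests before any parsing happens. The Python sketch below is only a heuristic under assumed limits: the constants, the QueryRejected exception and check_fragment_request are hypothetical, and such checks do not give a real complexity estimate for arbitrary XPath expressions.

```python
MAX_DOCUMENT_BYTES = 5 * 1024 * 1024  # assumed limit for files loaded into memory
MAX_QUERY_LENGTH = 256                # assumed limit on the query string length
MAX_QUERY_STEPS = 16                  # assumed limit on the number of path steps


class QueryRejected(ValueError):
    """Raised when a fragment request exceeds the configured limits."""


def check_fragment_request(document_size: int, query: str) -> None:
    """Reject requests that are likely to be expensive; return None otherwise."""
    if document_size > MAX_DOCUMENT_BYTES:
        raise QueryRejected("document too large to load into memory")
    if len(query) > MAX_QUERY_LENGTH:
        raise QueryRejected("query too long")
    if "//" in query:
        # '//' searches all descendants and may touch the whole document.
        raise QueryRejected("descendant searches ('//') are not allowed")
    steps = [step for step in query.split("/") if step]
    if len(steps) > MAX_QUERY_STEPS:
        raise QueryRejected("query has too many path steps")


check_fragment_request(1024, "ex:object")  # accepted, no exception raised
```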

de-ich commented 4 months ago

Question 6: Serialisation Modifier: Please have a look at V3 of the API specification. Modifiers are no longer defined as query parameters but are appended to the end of the URI path (?content=path --> .../$path). I am not sure whether this can cause issues.

Correct. I already use the new pattern in my implementation, and I updated most occurrences in the presentation when preparing the V3 version. However, as you pointed out, I overlooked the "path" modifier on slide 8. This is fixed in the new version of the presentation:

2024-05-13-AASX_REST_API_Fragments_v3.pptx

sebbader-sap commented 2 months ago

See also from the metamodel section on References: https://admin-shell-io.github.io/aas-specs-antora/IDTA-01001/v3.1/spec-metamodel/referencing.html :