International-Data-Spaces-Association / InformationModel

The Information Model of the International Data Spaces implements the IDS reference architecture as an extensible, machine readable and technology independent data model.
Apache License 2.0

ids:Artifact and resourceEndpoints #481

Open tomkxy opened 3 years ago

tomkxy commented 3 years ago

While looking into a mapping from DCAT to IDS, I stumbled across an aspect of IDS which feels cumbersome and which, to be honest, I do not understand: in DCAT we have Datasets and Distributions, which relate to IDS DataResources and DataRepresentations / Artifacts. So far so good. In DCAT, the Distribution contains the information about where to get the distribution from (downloadURL etc.).

Now I am trying to figure out, for an IDS DataResource, where I can retrieve the resource from. Each data resource may have many data representations and many artifacts. Thus, I would have expected a property on the artifact pointing to the endpoint where I can get the artifact from.

Instead, there is another structure on the data resource level, the ids:resourceEndpoint, which holds the artifact again (same as above) and an Endpoint (e.g. ConnectorEndpoint). I don't understand why this is modeled like that, since the endpoints seem to be clearly related to the artifact. By the way, this makes life cycle management of a resource in a GUI rather complicated.

clange commented 3 years ago

My initial understanding is the following. @HaydarAk could you please add your (probably deeper) understanding to this discussion? Generally, one resource may be served through one or more endpoints, i.e., ids:Resource – ids:resourceEndpoint → ids:ConnectorEndpoint. In certain special cases, an endpoint may be dedicated to serving one artifact (i.e., one instance) of one representation of the resource, but this is optional to model explicitly: ids:ConnectorEndpoint – ids:endpointArtifact → ids:Artifact.

Now @tomkxy, for your use case: are you assuming that what you found, e.g., in a Connector's Catalog, is the resource, that the resource has representations (ids:DigitalContent – ids:representation → ids:Representation), which in turn have artifacts (ids:Representation – ids:instance → ids:RepresentationInstance), and that for one specific artifact A you would like to know where you can download it? And you wouldn't want to query the whole metadata graph for the ids:ConnectorEndpoint that serves A, but would prefer following a direct link from A to that endpoint? This should be feasible to implement. @HaydarAk what do you think?

HaydarAk commented 3 years ago

Everything you wrote is correct, @clange. One addition: As far as I know, the ids:resourceEndpoint property is especially relevant for, e.g., the broker, because it is the one and only property that links a resource to a corresponding connector (endpoint) when queried at a broker. It should definitely be feasible to implement that. But we have to consider "what" to describe "where" at "which" level of detail to satisfy all requirements.

In DCAT, the distribution contains the information where to get the data from downloadURL etc.

[...] and then for one specific artifact A you would like to know where you can download that? And you wouldn't want to query the whole metadata graph for the ids:ConnectorEndpoint that serves A, but would prefer following a direct link from A to that endpoint?

At a high level, this sounds good. But we might have to look into it in detail. We could add "access-related" information to Representations and thereby point to ConnectorEndpoints, similar to the ids:resourceEndpoint property of ids:Resource.

This is to some extent similar to the DCAT approach. DCAT distributions refer to:

Looking into the dcat:DataService class, I noticed that the class does not refer to a distribution but to a dataset (see here). In IDS terms this would translate into:

ids:Representation or ids:Artifact --> served by --> ids:Connector, ideally ids:ConnectorEndpoint
ids:ConnectorEndpoint --> serves resource --> ids:Resource

I am not 100% sure, if a full adoption of the DCAT approach is something we want because it is a very data-centric model.

Feel free to share your thoughts :)

tomkxy commented 2 years ago

Thanks for the explanation, and sorry for my late reply. Looking at the topic again while our devs are trying to implement this in the GUI, we realized the following. Let's assume this use case: you have a data resource with a couple of representations, each representation having one artifact. Now you want to display this in a GUI using the generated Java classes.

You display the attributes of the representation and the artifact by following resource -> representation (via ids:representation) -> artifact (via ids:instance). So far so good. But what do you do now to display, for each representation, for instance the accessURLs, which are stored in the connectorEndpoint? How do you get that information for a specific representation?

If I am not completely mistaken, you need to find the connectorEndpoint instance which points, through the artifact, to the given representation. With a SPARQL query that is no big deal because you can navigate the graph, but please try that with the Java classes which are generated from the information model. There the relations are effectively directed, and there is no easy way to collect all properties related to a specific representation / artifact, including the properties of the resourceEndpoint which serves this artifact.

I hope I could explain the issue clearly.
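To make the lookup concrete, here is a rough sketch in Python, with plain dataclasses standing in for the generated Java classes. The class and field names only mirror the properties mentioned in this thread (ids:representation, ids:instance, ids:resourceEndpoint, ids:endpointArtifact, accessURL); they are simplified assumptions, not the actual generated API, and the example URLs are made up. The point is that you have to start from the resource again and scan its endpoints, because the artifact itself carries no link to an endpoint.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Artifact:
    id: str


@dataclass
class Representation:
    id: str
    # ids:instance (hypothetical field name)
    instance: List[Artifact] = field(default_factory=list)


@dataclass
class ConnectorEndpoint:
    access_url: str
    # ids:endpointArtifact is optional in the model (hypothetical field name)
    endpoint_artifact: Optional[Artifact] = None


@dataclass
class Resource:
    id: str
    # ids:representation
    representation: List[Representation] = field(default_factory=list)
    # ids:resourceEndpoint
    resource_endpoint: List[ConnectorEndpoint] = field(default_factory=list)


def access_urls_for(resource: Resource, representation: Representation) -> List[str]:
    """Scan the resource's endpoints for those serving an artifact of the representation."""
    artifact_ids = {artifact.id for artifact in representation.instance}
    return [
        endpoint.access_url
        for endpoint in resource.resource_endpoint
        if endpoint.endpoint_artifact is not None
        and endpoint.endpoint_artifact.id in artifact_ids
    ]


# Made-up example data: one resource, two representations, one artifact each.
csv_artifact = Artifact("artifact-csv")
json_artifact = Artifact("artifact-json")
csv_rep = Representation("rep-csv", [csv_artifact])
json_rep = Representation("rep-json", [json_artifact])
resource = Resource(
    "resource-1",
    representation=[csv_rep, json_rep],
    resource_endpoint=[
        ConnectorEndpoint("https://connector.example/data.csv", csv_artifact),
        ConnectorEndpoint("https://connector.example/data.json", json_artifact),
    ],
)

print(access_urls_for(resource, csv_rep))
```

Note that the function needs the resource as an extra argument: given only a representation or an artifact, there is nothing to navigate from. An endpoint without an explicit ids:endpointArtifact is skipped here, so in that case the access URL cannot be attributed to a representation at all.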

sebbader commented 2 years ago

Hello everyone, after a short discussion with @tomkxy I finally feel qualified to add my two cents as well. I think the problem is that as soon as one has traversed to a Representation or Artifact, the only way to get the accessUrl is by following an edge backwards. This is possible via SPARQL but not in, for instance, the Java implementation, where only one-way lookups are possible. We already had the property on the Artifact: https://github.com/International-Data-Spaces-Association/InformationModel/blob/018abfe8a1e96ac842cccc99d4ae1dddfdc838cc/model/content/Artifact.ttl#L100

I am not saying to reintroduce it here, I just want to point out that we saw the necessity already earlier :-).
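Just to illustrate what "following an edge backwards" means for client code that only has one-way lookups: one possible workaround, without touching the model, is to build a reverse index once after loading the metadata. The following Python sketch is only an assumption about how a client might do this; the edge list, identifiers and URLs are made up, and nothing like `build_reverse_index` exists in the generated classes.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def build_reverse_index(edges: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Invert (endpoint access URL -> artifact id) edges into artifact id -> access URLs."""
    index: Dict[str, List[str]] = defaultdict(list)
    for access_url, artifact_id in edges:
        index[artifact_id].append(access_url)
    return dict(index)


# Forward edges as they exist in the model today: the endpoint points to the artifact.
# All values are hypothetical example data.
endpoint_artifact_edges = [
    ("https://connector.example/a/csv", "artifact-a"),
    ("https://connector.example/a/mirror", "artifact-a"),
    ("https://connector.example/b/json", "artifact-b"),
]

urls_by_artifact = build_reverse_index(endpoint_artifact_edges)

# Backward lookup: from an artifact to the endpoints serving it.
print(urls_by_artifact.get("artifact-a", []))
```

The index is derived data on the client side, so it does not add a second, potentially conflicting attribute to the model, but it has to be rebuilt whenever the metadata changes.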

sebbader commented 2 years ago

I don't have a good idea yet. Adding redundant (potentially conflicting) attributes doesn't sound good to me.

JohannesLipp commented 2 years ago

@tomkxy sorry for the late reply. Could you please comment if the issue still remains and still needs to be tackled? Thank you in advance!