eclipse-tractusx / sig-release

https://eclipse-tractusx.github.io/sig-release
Apache License 2.0

EDC - Soft equivalence between ODRL Policies #735

Open DanielaWuensch opened 2 months ago

DanielaWuensch commented 2 months ago

Problem: A contract negotiation currently only succeeds if the consumer replays the policy exactly as it was discovered in the catalog. However, there may be several semantically equivalent ways to express the same policy. For instance, the order of AtomicConstraints in a LogicalConstraint "odrl:and" should not matter, whereas the order within "odrl:andSequence" should.

Proposed Solution: Similarly, a single AtomicConstraint has the same truth table as the same AtomicConstraint wrapped in "odrl:or", which currently would not be recognized as equivalent. According to ODRL, policies and constraints may also have identifiers that (if unequal) would cause a contract negotiation to fail.
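The three cases described above can be reproduced with a minimal sketch. It assumes simplified, hypothetical constraint dictionaries (the `example:` operands, the `constraint-1`/`constraint-2` identifiers, and the `strict_equal` helper are invented for illustration) and a purely structural comparison; it is not EDC's actual comparison code.

```python
# Hypothetical, simplified constraints; not real EDC or DSP payloads.
membership = {
    "odrl:leftOperand": "example:Membership",
    "odrl:operator": "odrl:eq",
    "odrl:rightOperand": "active",
}
purpose = {
    "odrl:leftOperand": "example:Purpose",
    "odrl:operator": "odrl:eq",
    "odrl:rightOperand": "research",
}


def strict_equal(a, b) -> bool:
    """Structural comparison: list order and every key (including @id) must match."""
    return a == b


# Case 1: same constraints inside odrl:and, different order.
offered = {"odrl:and": [membership, purpose]}
replayed_reordered = {"odrl:and": [purpose, membership]}

# Case 2: a single constraint versus the same constraint wrapped in odrl:or.
replayed_wrapped = {"odrl:or": [membership]}

# Case 3: identical constraints that differ only in their identifier.
with_id = {"@id": "constraint-1", **membership}
other_id = {"@id": "constraint-2", **membership}

print(strict_equal(offered, replayed_reordered))   # False
print(strict_equal(membership, replayed_wrapped))  # False
print(strict_equal(with_id, other_id))             # False
```

In all three cases the business meaning is arguably unchanged, yet a structural comparison rejects the replayed form.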

Impacted components

jimmarino commented 2 months ago

@arnoweiss and I discussed this a bit. This should likely be handled holistically in the DSP spec rather than focusing on workarounds involving collection semantics that will not be interoperable. One way to handle this holistically would be to define an algorithm in the DSP specification for handling policy comparisons based on constructing a canonical serialized form. This same feature could then be used for contract signing.
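As a rough illustration of why a canonical serialized form serves both comparison and signing, the sketch below assumes a deliberately naive canonicalization (sorted keys, no insignificant whitespace) and a SHA-256 digest. The function names and the `example:` operand are hypothetical, and a DSP-level algorithm would presumably build on an existing scheme such as the JSON Canonicalization Scheme (RFC 8785) or RDF Dataset Canonicalization, neither of which is implemented here.

```python
import hashlib
import json


def canonical_bytes(policy: dict) -> bytes:
    """Assumed rules: sorted object keys, compact separators, UTF-8 output."""
    text = json.dumps(policy, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def policy_digest(policy: dict) -> str:
    """Digest of the canonical form; a signature could be computed over this value."""
    return hashlib.sha256(canonical_bytes(policy)).hexdigest()


# The same constraint as two different serializations (key order and whitespace differ).
catalog_json = (
    '{"odrl:leftOperand": "example:Membership", '
    '"odrl:operator": "odrl:eq", "odrl:rightOperand": "active"}'
)
replayed_json = (
    '{ "odrl:rightOperand":"active", "odrl:operator":"odrl:eq", '
    '"odrl:leftOperand":"example:Membership" }'
)

catalog_policy = json.loads(catalog_json)
replayed_policy = json.loads(replayed_json)

print(catalog_json != replayed_json)                                        # True
print(canonical_bytes(catalog_policy) == canonical_bytes(replayed_policy))  # True
print(policy_digest(catalog_policy) == policy_digest(replayed_policy))      # True
```

Because both parties would derive the same bytes from either serialization, the same digest could back an equality check during negotiation and a signature over the agreed contract.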

I think it would be helpful to outline the requirements here. These can be presented to the DSP specification group, which can then decide on the appropriate technical approach (assuming the requirement is accepted).

stephanbcbauer commented 1 month ago

Presented in the DRAFT Feature Freeze -> Committer is available

lgblaumeiser commented 1 month ago

@jimmarino : Please raise the issue in the Eclipse Dataspace Working Group

arnoweiss commented 1 month ago

I agree that's what's needed long-term. A canonicalization algorithm could specify rules like "sort sets alphabetically and leave ordered lists as-is".
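A rule like "sort sets alphabetically and leave ordered lists as-is" might look roughly like the sketch below. It assumes that "odrl:and", "odrl:or" and "odrl:xone" are treated as unordered sets and "odrl:andSequence" as an ordered list, and it sorts by each member's serialized form; the `canonicalize` and `sort_key` names and the sample constraints are hypothetical, not part of any specification.

```python
import json

# Assumption: operands of these logical constraints are unordered sets.
UNORDERED_OPERANDS = {"odrl:and", "odrl:or", "odrl:xone"}


def sort_key(node) -> str:
    """Alphabetical order over a stable serialization of each member."""
    return json.dumps(node, sort_keys=True, separators=(",", ":"))


def canonicalize(node):
    """Recursively sort unordered operand lists; leave every other list untouched."""
    if isinstance(node, list):
        return [canonicalize(item) for item in node]
    if isinstance(node, dict):
        result = {}
        for key, value in node.items():
            value = canonicalize(value)
            if key in UNORDERED_OPERANDS and isinstance(value, list):
                value = sorted(value, key=sort_key)
            result[key] = value
        return result
    return node


a = {"odrl:leftOperand": "example:A", "odrl:operator": "odrl:eq", "odrl:rightOperand": "1"}
b = {"odrl:leftOperand": "example:B", "odrl:operator": "odrl:eq", "odrl:rightOperand": "2"}

# Sets: order is normalized away.
print(canonicalize({"odrl:and": [a, b]}) == canonicalize({"odrl:and": [b, a]}))  # True
# Ordered lists: order is preserved, so the two forms stay distinct.
print(
    canonicalize({"odrl:andSequence": [a, b]})
    == canonicalize({"odrl:andSequence": [b, a]})
)  # False
```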

Currently, however, EDC doesn't implement the semantics that odrl:and defines: the implementation is sensitive to ordering. That would be appropriate behavior for odrl:andSequence, which isn't interpreted.

jimmarino commented 1 month ago

I think we should be specific here: I believe EDC evaluates odrl:and correctly. The issue is how Policy equivalence is handled during a contract negotiation, correct?
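The distinction between evaluating a policy and comparing two policies can be made concrete with a toy evaluator. This is only a sketch under assumptions: the `evaluate` function, the `example:` operands and the context values are invented, only "odrl:eq" is supported, and none of it reflects EDC's policy engine.

```python
from itertools import product


def evaluate(constraint: dict, context: dict) -> bool:
    """Toy evaluator for odrl:and, odrl:or and atomic odrl:eq constraints."""
    if "odrl:and" in constraint:
        return all(evaluate(c, context) for c in constraint["odrl:and"])
    if "odrl:or" in constraint:
        return any(evaluate(c, context) for c in constraint["odrl:or"])
    # Atomic constraint: compare the context value with the right operand.
    return context.get(constraint["odrl:leftOperand"]) == constraint["odrl:rightOperand"]


membership = {
    "odrl:leftOperand": "example:Membership",
    "odrl:operator": "odrl:eq",
    "odrl:rightOperand": "active",
}
purpose = {
    "odrl:leftOperand": "example:Purpose",
    "odrl:operator": "odrl:eq",
    "odrl:rightOperand": "research",
}

forward = {"odrl:and": [membership, purpose]}
backward = {"odrl:and": [purpose, membership]}
wrapped = {"odrl:or": [membership]}

# Enumerate every combination of context values (a small truth table).
contexts = [
    {"example:Membership": m, "example:Purpose": p}
    for m, p in product(("active", "inactive"), ("research", "other"))
]

same_result_reordered = all(evaluate(forward, c) == evaluate(backward, c) for c in contexts)
same_result_wrapped = all(evaluate(membership, c) == evaluate(wrapped, c) for c in contexts)

print(same_result_reordered)  # True: evaluation ignores operand order
print(same_result_wrapped)    # True: a singleton odrl:or has the same truth table
print(forward == backward)    # False: structural comparison does not
```

Evaluation agrees on every context, so the open question is only which forms the negotiation step treats as the same policy.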

Also, if there is an error thrown, someone familiar with it should open a discussion in the EDC repo (not issue) with a stacktrace so the committers can discuss.

lgblaumeiser commented 1 month ago

@DanielaWuensch @arnoweiss Can you help us here? What is the exact requirement that you see? Some acceptance criteria would help us understand what you want to have changed.

The team's current idea is to have a canonicalization of policies defined in the DSP spec.

DanielaWuensch commented 1 month ago

@hemantxpatel will provide a detailed description, including payload descriptions and exceptions, and link it here. This was discussed in the Tractus-X EDC sync on Aug 6, 2024.

jimmarino commented 1 month ago

We are looking for a business case for why this is needed, not what triggers it. Understanding the non-technical motivation will help us assess the most appropriate technical approach. Also, this is an upstream EDC issue, not a TX-EDC one.

As I mentioned, "soft-equivalence" is a very slippery slope and not one I think we should go down. If what is required is a reliable way to determine the equivalence of an "infoset" that may have multiple serialized forms, the proper way to solve this is in a general (and interoperable) way at the specification level (DSP) using a canonicalization algorithm, as this will serve multiple purposes. Introducing piecemeal equivalence rules will lead to errors and edge cases that are difficult to reason about and will not be interoperable with other connector implementations.

However, we should first understand what the non-technical motivation is before discussing possible technical solutions.

DanielaWuensch commented 2 weeks ago

During the negotiation process between two connectors, the policies defined in the data provider connector must be compared with the acceptable policies defined by the data consumer. It must be possible to do this comparison automatically. Therefore, soft equivalence between policies that are equal from a business point of view must be ensured. So, the idea is that a single AtomicConstraint has the same truth table as the same AtomicConstraint wrapped in an "odrl:and" or in an "odrl:or".

jimmarino commented 2 weeks ago

We are talking in circles here. I understand the technical issues very well; that is not the issue. My point is twofold:

  1. The issue is with upstream EDC, not TX-EDC. The proposed solution will not be accepted by the EDC committers because the general consensus of those who have looked at it is that it is not a good one based on the reasons we have already given.

  2. Our (EDC and DSP specification committer) technical opinion is that the correct solution must be achieved at the specification level and will likely involve the adoption of a canonicalization algorithm. We have already laid out our reasoning for this.

Again, no motivation has been provided to justify another approach. For example, what does "the policies defined in the data provider connector must be compared with the acceptable policies defined by the data consumer" actually mean? As a consumer, I could choose to only select a subset of policies. Should the data provider accept that?

This sounds as if the request arises from some form of tooling issue: policies may be read on a client and serialized back to the provider DSP in a different form. At this point, I don't see any way forward other than to follow the recommendation of the EDC and DSP committers and raise this as a DSP specification issue in the form of a request for the adoption of a canonicalization algorithm.