ODRL Evaluator pseudocode

w3c / odrl

ODRL Community Group Repository

https://www.w3.org/community/odrl/

ODRL Evaluator pseudocode #76

Open AndreaCimminoArriaga opened 1 week ago

AndreaCimminoArriaga commented 1 week ago

Following my comment on issue #67, the formal semantics, or a new document, should provide pseudocode for the evaluation of policies. IMHO a starting point could be the pseudocode from the ODRE article presented in the last general meeting.

joshcornejo commented 1 week ago

We should first agree on what the input parameters and the output parameters are, and on how they are separated from the state of the world.

As discussed in the conversation during your presentation of ODRE, I am implementing an evaluator and consider it unnecessary to embed extra decorations to represent the state of the world or input parameters.

In general, and as a baseline:

The "evaluation request" will look like this: [ actor, action, asset ]
The "evaluation response" will look like this: [ activation, decision ]

(I've added extra fields for a practical interface with developers: [ activation, decision, [timestamps], [ message list ] ], but those could be considered optional for the proposal.)
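The request and response shapes above could be written down as plain data types. This is only a minimal sketch: the class names, field types and example IRIs are assumptions made for illustration and are not part of any ODRL specification.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class EvaluationRequest:
    """The [ actor, action, asset ] triple; each value is assumed to be an IRI."""
    actor: str
    action: str
    asset: str


@dataclass
class EvaluationResponse:
    """The [ activation, decision ] pair plus the optional extras."""
    activation: bool
    decision: bool
    timestamps: List[datetime] = field(default_factory=list)  # optional
    messages: List[str] = field(default_factory=list)         # optional


# Hypothetical example values.
request = EvaluationRequest(
    actor="https://example.com/party/alice",
    action="http://www.w3.org/ns/odrl/2/use",
    asset="https://example.com/asset/dataset-1",
)
response = EvaluationResponse(activation=True, decision=True)
```

Keeping the optional fields defaulted to empty lists means a minimal implementation can return only the two mandatory values.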

Starting from the top, the evaluation logic/pseudocode should look something like:

world.Evaluate ( actor, action, asset )
  foreach agreement in ListOfAgreements
    if agreement IS Valid AND Active
      agreement.Evaluate ( actor, action, asset )
    end
  end
end
// ------------------------------------------------------------------------------------------
agreement.Evaluate ( actor, action, asset )
  if agreement IN ListOfTarget
    agreement.Evaluate ( actor, action, agreement )
  end
  foreach rule in agreement
    if rule contains [ actor, action, asset ]
      temporaryResult = rule.Evaluate ( actor, action, asset )
      if (( rule IS Permission AND
            temporaryResult.Activated IS true AND
            temporaryResult.Decision IS true )                       // you are "permitted"
          OR
          ( rule IS Prohibition AND
            temporaryResult.Activated IS true AND
            temporaryResult.Decision IS false ))                     // you are "prohibited"

        // Here you deal with the relationships for duty, remedy and consequence

      end
      add temporaryResult to a list
    end
  end
  // here "coalesce" all the decisions
  return finalResult
end
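For readers who prefer something executable, the pseudocode above could be sketched in Python roughly as follows. Several things here are assumptions that the thread does not settle: the `Rule`, `Agreement` and `Result` types, the `active` flag standing in for rule-level evaluation, and especially the `coalesce` strategy (any activated prohibition overrides permissions). Duties, remedies and consequences are left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Result:
    activated: bool
    decision: bool


@dataclass
class Rule:
    kind: str            # "permission" or "prohibition"
    actor: str
    action: str
    asset: str
    active: bool = True  # assumed stand-in for constraint/refinement checks

    def matches(self, actor: str, action: str, asset: str) -> bool:
        return (self.actor, self.action, self.asset) == (actor, action, asset)

    def evaluate(self) -> Result:
        # An activated permission decides True, an activated prohibition False.
        return Result(self.active, self.active and self.kind == "permission")


def coalesce(results: List[Result]) -> Result:
    """Assumed strategy: any activated negative decision wins."""
    activated = [r for r in results if r.activated]
    if not activated:
        return Result(activated=False, decision=False)
    return Result(activated=True, decision=all(r.decision for r in activated))


@dataclass
class Agreement:
    rules: List[Rule]
    valid: bool = True
    active: bool = True

    def evaluate(self, actor: str, action: str, asset: str) -> Result:
        results = [r.evaluate() for r in self.rules
                   if r.matches(actor, action, asset)]
        return coalesce(results)


def evaluate_world(agreements: List[Agreement],
                   actor: str, action: str, asset: str) -> Result:
    """Top-level entry point: only valid and active agreements are evaluated."""
    results = [a.evaluate(actor, action, asset)
               for a in agreements if a.valid and a.active]
    return coalesce(results)
```

With one permission and one prohibition for the same triple, `evaluate_world` returns an activated result whose decision is false under the assumed coalescing rule; a different conflict strategy would only require swapping `coalesce`.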

You could consider whether you want the "Result" type to be a monad or just a structure. I am letting errors be handled structurally by the HTTP layer, whilst ODRL semantic errors are wrapped inside the [ message list ], using HTTP status codes paired with text:

{
  "timestamp": "2024-10-22T09:57:17.608293Z",
  "id": "err_403",                                   // HTTP error code
  "message": "did:odrlw:671776FB:0192b3a8-c603-74f8-87ce-8a65afd5ff09 - action {use} not found."
}

I'll leave the next layer (rule evaluation) for now.

AndreaCimminoArriaga commented 1 week ago

The pseudocode doesn't seem that far from the one presented in the ODRE article. However, there are things that should be formally defined; as they are now, a reader does not know what they are (e.g., world.Evaluate: what is world? an object? a dict? Or how do you formalize 'agreement IS Valid AND Active'?). On the other hand, there are personal choices that I do not agree with or fully understand. For instance, 'rule' seems to be a list of actor, action, asset, but the constraints should appear at some point. Also, how is the actor formalized?

I think it is great that we have more pseudocode. We should get together to first set the formalization and then work on the pseudocode to be standardized for the evaluation. Without agreement on the evaluation flow and the formalization of its elements, it will be difficult to agree on the pseudocode.

joshcornejo commented 1 week ago

I think "What is a world" and the rest of the questions in the first paragraph should be defined in https://w3c.github.io/odrl/formal-semantics/ as part of a glossary.

IMHO, aligning with practical authorisation implementations, and to keep consistency with everything:

An actor should be an odrl:Party
An action ... an odrl:Action (or a descendant in case of an extension in a profile)
An asset should be an odrl:Asset

For example, AuthZen uses slightly different terminology:

The Access Evaluation request is a 4-tuple constructed of the four previously defined entities:

subject: REQUIRED. The subject (or principal) of type Subject (equivalent of odrl:Party)
action: REQUIRED. The action (or verb) of type Action. (equivalent of odrl:Action)
resource: REQUIRED. The resource of type Resource. (equivalent of odrl:Asset)
context: OPTIONAL. The context (or environment) of type Context (they have an option for "client-based" state of the world)
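The correspondence between the two vocabularies can be sketched as a small adapter. This is a simplified illustration only: it treats each AuthZen entity as an opaque identifier string, which is an assumption, and the function name `to_odrl_request` is hypothetical.

```python
from typing import Any, Dict, Optional, Tuple

REQUIRED = ("subject", "action", "resource")


def to_odrl_request(
    authzen: Dict[str, Any],
) -> Tuple[str, str, str, Optional[Dict[str, Any]]]:
    """Map subject/action/resource/context to (actor, action, asset, context).

    Entities are assumed to be plain identifier strings for this sketch.
    """
    missing = [key for key in REQUIRED if key not in authzen]
    if missing:
        raise ValueError("missing required field(s): " + ", ".join(missing))
    actor = authzen["subject"]    # equivalent of odrl:Party
    action = authzen["action"]    # equivalent of odrl:Action
    asset = authzen["resource"]   # equivalent of odrl:Asset
    context = authzen.get("context")  # OPTIONAL, may be absent
    return actor, action, asset, context
```

The optional context is passed through untouched, since how it relates to the "state of the world" is exactly what the thread is still discussing.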

The semantics should focus on "What is the purpose of a policy?":

A policy is the description of the possible intent (or execution, depending upon the "modality"?) of an "action" (odrl:Action), by "someone" (odrl:Party as an odrl:assignee) into a "target" (odrl:Asset). There might be other elements (like different functions) but those are only relevant when the odrl:assignee has been granted the ability of "intent or execute" and a rule has been "activated" and "triggered".

AndreaCimminoArriaga commented 1 week ago

I agree with this: "I think "What is a world" and the rest of the questions in the first paragraph should be defined in https://w3c.github.io/odrl/formal-semantics/ as part of a glossary." But for me it is not part of the glossary but a section by itself. I'm not sure it necessarily needs to go into the formal semantics, since this is not trivial and will take space in the text and time; that's why I suggested a new draft document. These definitions should be crystal clear, otherwise it will be very difficult to reach an agreement on the evaluation. Also, these definitions must be derived from the current RDF representation of ODRL and not be custom ones based on other standards, and even less on implementations (you can have one concept but multiple implementations; following implementations IMHO just increases heterogeneity in the formalization). This comment is linked to the next paragraph.

"IMHO, aligning with practical authorisation implementations, and to keep consistency with everything:" I do not fully agree. We should certainly keep an eye on other similar proposals, but we need to keep the other eye on the current practical ODRL use cases. Otherwise we would stick to practical authorisation implementations and leave behind scenarios like monitoring. A clear example is what you mention about AuthZen: the context is defined as the client-based state, whereas ODRL may have broader scopes (like service information, or third-party API information), and that kind of data is encoded as part of the constraints. Following AuthZen we lose the constraints (I did not see them in the pseudocode you posted). How, then, can we evaluate whether the current dateTime is before or after another? And how are constraints formalized and used in the pseudocode you posted? IMHO, they are a core part of ODRL.

I understand that your pseudocode checks whether an actor can execute the action related to a target. However, elements like temporaryResult = rule.Evaluate(actor, action, asset) need to check those constraints, which do not appear, in order to compute a temporary result, right?

Take this as an initial idea: IMHO a policy should be formalized as the tuple (a, t, s, C), where a is the URI that identifies an actor, t is the URI that identifies a target, s is the URI that identifies an assignee, and C is a set of constraints. Each constraint is also a tuple (o, l, r), where o is the URI of the operator, and l and r are each either a set of triples or a term from the ontology, referring to the left and right operand respectively. In the ODRE article we have already proved that there is a direct transformation from a policy expressed in Turtle or JSON-LD 1.1 to these definitions. What do you think? Could this formalization fit your use case as well? I see little difference in terms of information from the pseudocode you posted, maybe just the lack of more formalization and of the set C.
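To make the tuple idea concrete, here is a minimal sketch of (a, t, s, C) with a constraint (o, l, r) answering the dateTime question raised earlier. It simplifies heavily: l is assumed to be an operand IRI looked up in a `state` dictionary, r is assumed to be a ready-made Python value rather than a set of triples, and only a few comparison operators are mapped. The names `Policy`, `Constraint`, `holds` and `satisfied` are hypothetical.

```python
import operator
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Dict, FrozenSet

ODRL = "http://www.w3.org/ns/odrl/2/"

# Assumed mapping from ODRL operator IRIs to Python comparisons.
OPERATORS = {
    ODRL + "eq": operator.eq,
    ODRL + "lt": operator.lt,
    ODRL + "gt": operator.gt,
    ODRL + "lteq": operator.le,
    ODRL + "gteq": operator.ge,
}


@dataclass(frozen=True)
class Constraint:
    o: str  # operator IRI
    l: str  # left operand, simplified to an IRI resolved from the state
    r: Any  # right operand, simplified to a literal value


@dataclass(frozen=True)
class Policy:
    a: str  # URI identifying an actor, as worded in the proposal above
    t: str  # URI identifying a target
    s: str  # URI identifying an assignee
    C: FrozenSet[Constraint]


def holds(c: Constraint, state: Dict[str, Any]) -> bool:
    """Compare the current value of the left operand against the right one."""
    return OPERATORS[c.o](state[c.l], c.r)


def satisfied(p: Policy, state: Dict[str, Any]) -> bool:
    """All constraints in C must hold (an assumed conjunction)."""
    return all(holds(c, state) for c in p.C)


# Hypothetical policy: usable only before 2025-01-01 UTC.
deadline = datetime(2025, 1, 1, tzinfo=timezone.utc)
policy = Policy(
    a="https://example.com/party/org",
    t="https://example.com/asset/dataset-1",
    s="https://example.com/party/alice",
    C=frozenset({Constraint(ODRL + "lt", ODRL + "dateTime", deadline)}),
)
```

Calling `satisfied(policy, {ODRL + "dateTime": some_datetime})` then reduces the "is the current dateTime before another" question to a lookup plus one comparison.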

Of course there are things unsolved:

How can refinements fit this definition?
Is the tuple (a, t, s, C) sufficient for all scenarios? I think it may fit, and then monitoring or access-based use cases are more related to how the policies are evaluated (maybe a sequence diagram could shed some light on this issue).
Validation should play an important role in the evaluation. We defined an action as a URI: is this URI syntactically correct? Does it belong to a valid ODRL action (instance or sub-class)? Is this action enforceable, i.e., can it be executed as part of the evaluation?

IMHO it would be great if we could reach a consensus in this thread on how to formalize the policies.

joshcornejo commented 1 week ago

I didn't go a level below the rules - because those seem to have different behaviours:

Rule constraint - focuses on the wide scope of activation of the rule.
Action refinement - this one narrows conditions of when the action is "actionable".
AssetCollection / PartyCollection refinement: creates an applicable "subset".

And I think keeping it layered would make it clear ("C" is a separate section).

I also separated validation and evaluation ("is it correct" vs "does it work") - in practice, validation takes time and has no real connection with the process of evaluation (i.e. only evaluate valid policies).

There are also differences in how the "world" interacts with "the state" and how "the state" is input to an evaluation. For example, I might set the value of an odrl:LeftOperand once and use it in multiple evaluations over time (as it is part of the "state"); alternatively, the operand (like dateTime) is managed by "the state", or it is a parameter (like "actor" in the case of an odrl:recipient).
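The three ways an operand value can reach an evaluation (stored once, managed by the state, or passed as a parameter) could be sketched like this. The `State` class, its method names and the lookup order (parameter first, then managed provider, then stored value) are all assumptions made for illustration.

```python
from datetime import datetime, timezone
from typing import Any, Callable, Dict, Optional


class State:
    """Hypothetical holder for operand values used across evaluations."""

    def __init__(self) -> None:
        self._stored: Dict[str, Any] = {}
        self._providers: Dict[str, Callable[[], Any]] = {}

    def store(self, operand: str, value: Any) -> None:
        """Set a value once; it is reused by later evaluations."""
        self._stored[operand] = value

    def manage(self, operand: str, provider: Callable[[], Any]) -> None:
        """Let the state compute the value on demand (e.g. dateTime)."""
        self._providers[operand] = provider

    def resolve(self, operand: str,
                params: Optional[Dict[str, Any]] = None) -> Any:
        # Assumed precedence: request parameter, managed value, stored value.
        if params is not None and operand in params:
            return params[operand]
        if operand in self._providers:
            return self._providers[operand]()
        return self._stored[operand]


state = State()
state.store("odrl:count", 3)
state.manage("odrl:dateTime", lambda: datetime.now(timezone.utc))
```

For example, `state.resolve("odrl:recipient", {"odrl:recipient": actor})` would take the value from the request, while `state.resolve("odrl:count")` returns the stored value until it is updated.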

joshcornejo commented 1 week ago

I've had it open for ages, but just started reading @simonstey's document If You Can’t Enforce It, Contract It: Enforceability in Policy-Driven (Linked) Data Markets. Algorithm 1 proposes how to build an agreement, but it is probably a higher-level algorithm for Agreement composition rather than Agreement/Policy evaluation.

I've not done that part of the work (it has other moving parts that open Pandora's box, so not worth going there). But what I do is build a "tentative final" RDF based on the merge of all the ancestors of a policy (each rule with each required [maybe different] assigner and a selected assignee, which in my case becomes the "main actor" as part of the evaluation requests; it could have multiple assignees, etc., but that's part of Pandora's box).

AndreaCimminoArriaga commented 4 days ago

Following this thread, I would like to discuss the following elements, in order to later include them in the ODRL standard, either in the formal semantics or in a new working draft document:

An ODRL implementation, which for the moment I will name the ODRL directory: software that is able to receive synchronous and/or asynchronous events/requests and trigger the evaluation of certain policies. This ODRL directory will also be in charge of storing, querying and retrieving ODRL policies. The use cases that the directory should support are:
- Create, Read, Update, Delete ODRL policies
- Discovery/Search policies
- Enforce policies (synchronously, asynchronously or both)
A formalization, from the software point of view, of ODRL policies, and pseudocode that uses such a formalization to evaluate policies. Note that the evaluation should be agnostic of the synchronous or asynchronous paradigm, since the ODRL directory must be the one handling those cases.

To showcase my proposal I suggest the following formalization as a starting point:

ODRL refinement, ß, is a tuple (D, S_µ) where D is a set of RDF triples that describe the refinement and S_µ is a set of ODRL constraints µ.
RDF excerpt, E_RDF, is defined as either an IRI that identifies a resource in RDF or a set of RDF triples.
ODRL excerpt, E_ODRL, is defined as either an IRI that identifies a resource in RDF or a set of ODRL refinements.
ODRL action, A, is defined as an ODRL excerpt (E_ODRL). If the excerpt is an IRI, it must exist in the ODRL ontology or any extension of it. If instead the excerpt is a set of triples, it represents an action with at least one refinement.
ODRL assigner, G_er, is defined as an ODRL excerpt (E_ODRL). If the excerpt is an IRI, it must identify an actor that participates in an agreement expressed with an ODRL policy. If instead the excerpt is a set of triples, it represents an assigner with at least one refinement.
ODRL assignee, G_ee, is defined as an ODRL excerpt (E_ODRL). If the excerpt is an IRI, it must identify an actor that participates in an agreement expressed with an ODRL policy. If instead the excerpt is a set of triples, it represents an assignee with at least one refinement.
ODRL resource target, T, is defined as an ODRL excerpt (E_ODRL). If the excerpt is an IRI, it must identify a resource that an ODRL policy refers to. If instead the excerpt is a set of triples, it represents such a resource with at least one refinement.
ODRL constraint, µ, is a tuple (o, L, R), with o being an IRI from the ODRL ontology defined as odrl:Operator or any extension of it, and L and R being RDF excerpts (E_RDF). If L or R are IRIs, such IRIs must exist in the ODRL ontology and be defined as odrl:LeftOperand or odrl:RightOperand, respectively, or any extension of them. If L or R are sets of RDF triples, it means that either an odrl:Profile is involved or they encode constant values.
ODRL rule, Ω, is defined as a tuple (G_er, G_ee, T, A, S_µ), with G_er being an ODRL assigner, G_ee an ODRL assignee, T an ODRL target resource, A an ODRL action and S_µ a set of constraints.
ODRL Policy, P, is a set of rules S_Ω.
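As a rough illustration, the definitions above translate almost one-to-one into immutable data types. The Python names below (`RdfExcerpt`, `OdrlExcerpt`, `has_refinement`, and so on) are assumptions chosen to mirror the symbols; triples are simplified to tuples of three strings.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple, Union

IRI = str
Triple = Tuple[str, str, str]                    # simplified RDF triple
RdfExcerpt = Union[IRI, FrozenSet[Triple]]       # E_RDF


@dataclass(frozen=True)
class Constraint:                                # µ = (o, L, R)
    o: IRI
    L: RdfExcerpt
    R: RdfExcerpt


@dataclass(frozen=True)
class Refinement:                                # ß = (D, S_µ)
    D: FrozenSet[Triple]
    S_mu: FrozenSet[Constraint]


OdrlExcerpt = Union[IRI, FrozenSet[Refinement]]  # E_ODRL


@dataclass(frozen=True)
class Rule:                                      # Ω = (G_er, G_ee, T, A, S_µ)
    G_er: OdrlExcerpt
    G_ee: OdrlExcerpt
    T: OdrlExcerpt
    A: OdrlExcerpt
    S_mu: FrozenSet[Constraint] = frozenset()


Policy = FrozenSet[Rule]                         # P = S_Ω


def has_refinement(excerpt: OdrlExcerpt) -> bool:
    """An excerpt that is not a bare IRI carries at least one refinement."""
    return not isinstance(excerpt, str)


# Hypothetical rule using bare IRIs only.
rule = Rule(
    G_er="https://example.com/party/org",
    G_ee="https://example.com/party/alice",
    T="https://example.com/asset/dataset-1",
    A="http://www.w3.org/ns/odrl/2/use",
)
policy: Policy = frozenset({rule})
```

Frozen dataclasses and frozensets keep every element hashable, so rules, refinements and constraints can be placed in sets exactly as the definitions require.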

There is a direct transformation from an ODRL policy written in RDF to the aforementioned formalization. I will keep developing and extending the formalization in following posts.

joshcornejo commented 4 days ago

I don't think the part of the directory ecosystem (which most people are now settling to call "a marketplace") works the way you propose. There is a lot of convergence with other standards for assets (DCAT / DPROD), and at least DCAT (which is in version 3) is widely used across the data industry. Almost every implementation today is "asset focused".

A search will start with Assigners (who provide data services) or Assets (the different services provided) AND maybe a defining characteristic of how the services are provided. A general example would look like:

I want to find a list of suppliers (Assigners) that have a catalogue of toys (Assets) that can be used in Germany (a constraint in a rule in a policy).
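The search in that example could be sketched as a simple filter over asset-centred records. Everything here is hypothetical: the `Offer` record, its fields, the example parties and the assumption that the spatial constraint has already been flattened into a set of country codes.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Set


@dataclass(frozen=True)
class Offer:
    """Hypothetical flattened view of an asset plus the policy that governs it."""
    assigner: str               # who supplies the asset
    asset_category: str         # e.g. taken from a catalogue entry
    usable_in: FrozenSet[str]   # countries allowed by a rule constraint


def find_assigners(offers: Iterable[Offer],
                   category: str, country: str) -> Set[str]:
    """Return the assigners offering `category` assets usable in `country`."""
    return {o.assigner for o in offers
            if o.asset_category == category and country in o.usable_in}


offers = [
    Offer("https://example.com/party/toy-co", "toys", frozenset({"DE", "FR"})),
    Offer("https://example.com/party/play-ltd", "toys", frozenset({"GB"})),
    Offer("https://example.com/party/book-co", "books", frozenset({"DE"})),
]
suppliers = find_assigners(offers, "toys", "DE")
```

The query starts from assigners and assets and only consults the policy for the final filter, which is the "asset focused" ordering described above.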

A directory of policies is (in most cases) useful only to those creating policies, and even then there is the Asset dependency (most of these people are "data product owners" and only secondarily behave as "policy product owners").

Your proposal above should encompass the odrl:Party and odrl:Asset as starting points (the policy is secondary).

AndreaCimminoArriaga commented 18 hours ago

My proposal was not to implement a marketplace, which is different from the directory I was mentioning. Also, search will heavily depend on the mechanisms implemented, for which I would suggest having a filtering language (e.g., JSON Path) and a query language, which should be SPARQL.

Regarding "A directory of policies is (in most cases) useful only to those creating policies": I agree, but I will also add that the directory will handle the evaluation of policies (sync. or async.). Therefore it is not a trivial service, and different ones could exist depending on the particular scenario. This is why I think it is also important to define this kind of service. However, to avoid mixing topics, it is maybe better to continue this part of the conversation in issue #67.

Finally, regarding "The above proposal you make should encompass the odrl:Party and odrl:Asset as starting points (the policy is secondary)": if you check the proposed formalization, it actually does support both, unless I missed some element.

joshcornejo commented 18 hours ago

Just 2 comments:

IMHO, a directory is a foundational component of a "marketplace" (it is a repository indexed by relevant terms to the user)
Your formalisation touches on parties and assets but starts from policies, hence why my comment is to "turn it around" to where the process of using a directory starts

AndreaCimminoArriaga commented 18 hours ago

"IMHO, a directory is a foundational component of a "marketplace" (it is a repository indexed by relevant terms to the user)" It can be a foundational component, of course, but since it also handles operations other than CRUD and search, it does not necessarily need to go with a marketplace, and we can use it separately. Also, marketplace is a strong word that may entail certain specific domains, which I do not think we want to narrow down to; directory is rather generic (and different), allowing its adoption in more domain-agnostic use cases. In any case, we can name it differently if you think the name can lead to misunderstandings. For instance, ODRL service? My proposal is to have a service able to offer certain functionalities.

"Your formalisation touches on parties and assets but starts from policies, hence why my comment is to "turn it around" to where the process of using a directory starts" IMHO it has to start from policies, since the ODRL ontology defines policies that have elements, among which are parties, assets, etc. If we say that parties have policies, then we are going outside the ODRL concept. In any case, could you maybe jot down a draft of your proposal so we can discuss it?

The proposed formalization is meant to define pseudocode for evaluating/enforcing policies; the directory will be the service that hosts such an implementation and allows, among other functionalities, evaluating the policies.