jorritspee / openEHRxNuts

A specification for a distributed openEHR data federation using Nuts v6

Design an access policy language for openEHR ‘resources’ based on Nuts verifiable credentials #11

Open joostholslag opened 4 months ago

joostholslag commented 4 months ago

In order to be flexible in creating many use cases where Nuts facilitates trust in sharing openEHR-formatted data, it is useful to define a language for describing access policies for such data based on verifiable credentials (VCs). A specific policy could be: all doctors (proven by a BIG number VC) have access to any COMPOSITION.ACP for any patient. The generic (non-openEHR-specific) requirement was described by @woutslakhorst at the April 2024 community meetup: https://www.youtube.com/watch?v=TXS0I7D2GeM This language must be human readable and should be computer interpretable. Being computer executable is a nice-to-have. The policies must be implementable by different openEHR platform vendors, so that systems from different vendors work together based on a shared understanding of the policies.

One of the key requirements is that the language should concisely identify an openEHR ‘resource’ scope, similar to how HL7 FHIR describes it in SMART App Launch: https://www.hl7.org/fhir/smart-app-launch/2021May/scopes-and-launch-context.html#clinical-scope-syntax

Possible openEHR resource scopes (compartment × resource expression × permission) are listed here: https://specifications.openehr.org/releases/ITS-REST/latest/smart_app_launch.html#_scopes_for_openehr_rest_api E.g. ‘patient/template-.r’ : Permission to read a template with 
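To make the "compartment × resource expression × permission" shape concrete, here is a minimal Python sketch of a scope parser. It assumes the scope has the form `compartment/resource-expression.permissions` and that permissions are single letters from `c`, `r`, `u`, `d`, `s`. Both are assumptions for illustration, so check the linked openEHR specification for the authoritative grammar. The scope string in the usage note is hypothetical as well.

```python
from dataclasses import dataclass

# Assumed permission letters (create, read, update, delete, search).
PERMISSION_LETTERS = frozenset("cruds")


@dataclass(frozen=True)
class ResourceScope:
    compartment: str        # e.g. "patient"
    resource: str           # e.g. "composition"
    expression: str         # e.g. a template id, or "*" for any
    permissions: frozenset  # set of single-letter permissions


def parse_scope(scope: str) -> ResourceScope:
    """Split a scope string into its three dimensions, or raise ValueError."""
    compartment, slash, rest = scope.partition("/")
    if not slash or not compartment:
        raise ValueError(f"missing compartment in scope: {scope!r}")
    # Permissions follow the LAST dot, because the expression
    # (for example a template id) may itself contain dots.
    target, dot, perms = rest.rpartition(".")
    if not dot or not perms or not set(perms) <= PERMISSION_LETTERS:
        raise ValueError(f"missing or unknown permissions in scope: {scope!r}")
    resource, dash, expression = target.partition("-")
    if not dash or not resource or not expression:
        raise ValueError(f"missing resource or expression in scope: {scope!r}")
    return ResourceScope(compartment, resource, expression, frozenset(perms))
```

For example, `parse_scope("patient/composition-ACP.v1.r")` yields compartment `patient`, resource `composition`, expression `ACP.v1` and the single permission `r`.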

The other key requirement is a definition of the VC that gives access to the resource, e.g. “doctor”, “employee relationship with care organisation”, “organisational membership of a network”, “ura”, “uzi”, etc.
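As a rough illustration of how the two requirements combine, a policy can be thought of as a pair of "required credential" and "granted resource scope". The Python sketch below encodes the example policy from the issue text (doctors, proven by a BIG number VC, may read ACP compositions). The credential type name `BigNumberCredential` and the scope string are hypothetical, and the sketch assumes the presented VCs have already been cryptographically verified elsewhere.

```python
from dataclasses import dataclass
from typing import Iterable, Sequence


@dataclass(frozen=True)
class Policy:
    required_credential: str  # VC type the requester must present
    scope: str                # resource scope granted by that VC


# Hypothetical policy set: a BIG number VC grants read access to ACP compositions.
POLICIES = (
    Policy(required_credential="BigNumberCredential",
           scope="patient/composition-ACP.r"),
)


def is_allowed(presented_credential_types: Iterable[str],
               requested_scope: str,
               policies: Sequence[Policy] = POLICIES) -> bool:
    """Allow when some policy grants the requested scope to a presented VC type."""
    presented = set(presented_credential_types)
    return any(
        p.scope == requested_scope and p.required_credential in presented
        for p in policies
    )
```

A requester presenting only an organisation membership credential would be denied here, because no policy ties that credential type to the requested scope.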

In a PDP the following needs to happen:

There are multiple attribute types of trust:

This implies the following checks on the PDP

The question is whether the relational check is done before the call is made (checked by the requester and materialized as proof), or whether the check should be done by the PDP.

Rego, the language used by Open Policy Agent (OPA), is a language for describing (attribute-based) access control policy, but it’s generic and doesn’t natively support identification of openEHR resource scopes or Nuts VCs. The language is also quite low level (‘if’ statements), so its readability for humans other than developers is limited. (Do we need this?) It is interpretable and executable only by the (open source) OPA software.

The only alternative known to this project is XACML. But apparently it’s dead, and it’s not really human readable. So that rules it out as a candidate.

joostholslag commented 4 months ago

Interesting links: https://nuts-foundation.slack.com/archives/C040Y8JCH4K/p1713528072128529

gasperr commented 4 months ago

OPA is in lots of places noted as the REST version of XACML (which is, in fact, dead and not appropriate for this). But then also you read things like this: https://www.reddit.com/r/kubernetes/comments/xjizg5/opa_rego_is_ridiculously_confusing_best_way_to/ , so it may be good to look elsewhere (Kyverno, jsPolicy, ..) i.e. https://opensource.com/article/23/2/kubernetes-policy-engines

I haven't read more about it so I don't have an opinion that would be backed by any real research yet, but just throwing it out there..

joostholslag commented 4 months ago

There doesn't seem to be a GUI editor for Rego. That would have been a nice plus. But there are editors for OpenAPI (https://stoplight.io) and there apparently is a converter from OpenAPI to Rego. That could have the benefit of writing policy in a neutral format, which is one of the concerns at the moment.

But it does feel cumbersome, and it feels like a bad fit to describe policy using a format for api definitions.

joostholslag commented 4 months ago

> OPA is in lots of places noted as the REST version of XACML (which is, in fact, dead and not appropriate for this). But then also you read things like this: https://www.reddit.com/r/kubernetes/comments/xjizg5/opa_rego_is_ridiculously_confusing_best_way_to/ , so it may be good to look elsewhere (Kyverno, jsPolicy, ..) i.e. https://opensource.com/article/23/2/kubernetes-policy-engines
>
> I haven't read more about it so I don't have an opinion that would be backed by any real research yet, but just throwing it out there..

So apparently there are alternatives (they seem focused on policy for a Kubernetes cluster). I still have the impression OPA is the most well-established ecosystem for now. But Rego being difficult to learn is a major obstacle to using it to solve this issue. I'll dive a bit deeper into the alternatives and want to spend some time learning Rego using https://academy.styra.com/courses/opa-rego

joostholslag commented 4 months ago

I dove into the alternatives using this piece, but still feel OPA is the main candidate. One differentiator is the language: I prefer YAML over Rego or JavaScript because it’s easier to edit and is more configuration than programming. But it seems Kubernetes specific. And JavaScript doesn’t seem to be much better, because it’s even more of a programming language rather than a policy syntax.

sidharthramesh commented 4 months ago

Hey @joostholslag - just to add to this conversation, Rego has been used successfully by Medblocks to implement both SMART on FHIR and SMART on openEHR with HAPI JPA FHIR and EHRbase respectively. Along with the OPA policy engine which just looked at the request information, some degree of changes were needed at the backend to facilitate things like Search result filtering and redaction of content.

Having worked with Rego for the past few years, I can definitely say that it does take some time for new developers to understand it and start working with it owing to its declarative "Datalog" like characteristics which are not very familiar to developers of modern and more "imperative" programming languages like JS.

However, when OPA is used with another more extensible orchestration layer like OPAL (https://docs.opal.ac/), the possibilities of what you can achieve with access control and authorization become endless.

sidharthramesh commented 4 months ago

And just to clarify, Rego policies can interpret JSON and YAML as content. So more complex logic can be written in Rego ONCE, and things that change a lot, like access control lists, who has access to what, and users' roles, can be controlled via a YAML file, a JSON file, or even a database lookup. This is the most common pattern we use: write complex logic in Rego once, move all the stuff that changes a lot to the data.json file or a database, and have the OPA engine look things up as and when required.
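To illustrate the "logic once, data changes" pattern without depending on OPA itself, here is a small Python sketch. It is not Rego and not the Medblocks implementation. The JSON layout (`user_roles`, `role_permissions`) and all names in it are made up for the example. The point is that `allow` never changes, while the JSON document can be edited freely.

```python
import json

# Stand-in for a data.json file; in practice this could also come from a database.
DATA_JSON = """
{
  "role_permissions": {
    "doctor": ["read", "write"],
    "nurse": ["read"]
  },
  "user_roles": {
    "alice": ["doctor"],
    "bob": ["nurse"]
  }
}
"""


def allow(data: dict, user: str, action: str) -> bool:
    """Fixed decision logic: some role of the user must permit the action."""
    roles = data.get("user_roles", {}).get(user, [])
    permissions = data.get("role_permissions", {})
    return any(action in permissions.get(role, []) for role in roles)


data = json.loads(DATA_JSON)
```

Granting nurses write access would then be a one-line change in the JSON, with no change to the decision logic.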

joostholslag commented 4 months ago

Thanks a lot Sidharth. So the language that defines the attributes and openehr resources can actually be defined in json (schema)? And the logic in REGO can keep consistent over use cases as long as the json schema stays the same. That would be an amazing compromise!

joostholslag commented 4 months ago

@sidharthramesh would you be able to share some REGO policies and JSON files as examples? Is your current approach something we could standardise on?

woutslakhorst commented 4 months ago

> And just to clarify, Rego policies can interpret JSON and YAML as content. So more complex logic can be written in Rego ONCE, and things that change a lot, like access control lists, who has access to what, and users' roles, can be controlled via a YAML file, a JSON file, or even a database lookup. This is the most common pattern we use: write complex logic in Rego once, move all the stuff that changes a lot to the data.json file or a database, and have the OPA engine look things up as and when required.

We're also looking at it from the Nuts point of view, so we can advise on an nginx/OPA/OAS as PEP/PDP/PIP combination. One of the current assumptions is that a lot of the yes/no questions come down to a simple existence question on a tuple (triple, quadruple, quintuple, etc.):

[scope, role, orgA, orgB, action, resource]: Within SharedCarePlanning, a GP from orgA is allowed to GET /fhir/patient/5 from orgB. The existence of this sextuple reflects the current network for patient 5 at orgB. The PIP could handle this via REST, while Rego just needs to check whether such a sextuple exists. Static rules on resource-role mappings can be defined in Rego/JSON/YAML, while runtime information comes from the PIP via REST.

scope, role, orgA and orgB are returned by token introspection; action and resource are taken from the FHIR HTTP request.
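The existence check on a sextuple can be sketched in a few lines of Python. The in-memory set below stands in for whatever the PIP would return over REST, and the introspection field names (`scope`, `role`, `orgA`, `orgB`) are assumed for illustration rather than taken from an actual Nuts token introspection response.

```python
from typing import NamedTuple


class AccessTuple(NamedTuple):
    scope: str
    role: str
    org_a: str
    org_b: str
    action: str
    resource: str


# Stand-in for the PIP: the tuples that currently exist in the network.
KNOWN_TUPLES = {
    AccessTuple("SharedCarePlanning", "GP", "orgA", "orgB",
                "GET", "/fhir/patient/5"),
}


def build_tuple(introspection: dict, method: str, path: str) -> AccessTuple:
    """Combine token introspection output with the HTTP request."""
    return AccessTuple(
        introspection["scope"],
        introspection["role"],
        introspection["orgA"],
        introspection["orgB"],
        method,
        path,
    )


def is_allowed(candidate: AccessTuple, known=KNOWN_TUPLES) -> bool:
    """The decision is a plain existence check on the sextuple."""
    return candidate in known
```

With this shape the policy logic stays trivial, and all the complexity moves into how the PIP decides which tuples exist at a given moment.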

note: legal basis is a separate API call based on the organizations, organization types and the BSN.

sidharthramesh commented 4 months ago

> @sidharthramesh would you be able to share some REGO policies and JSON files as examples? Is your current approach something we could standardise on?

@joostholslag we're still debating whether or not to release this code publicly. However, if there are any other teams working on a standardised set of Rego policies on FHIR or openEHR we're more than happy to collaborate.

> Thanks a lot Sidharth. So the language that defines the attributes and openehr resources can actually be defined in json (schema)? And the logic in REGO can keep consistent over use cases as long as the json schema stays the same. That would be an amazing compromise!

This is very much possible, and in fact we have all of the HTTP paths and verbs in the configuration JSON and not in the Rego rules themselves. That makes it much easier for developers to just edit the JSON instead of having to understand Rego.
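A configuration-driven route table might look roughly like the Python sketch below. The JSON layout, the route patterns and the scope strings are all hypothetical and are not the Medblocks configuration. The sketch only shows how paths and verbs can live in data while the matching logic stays fixed.

```python
import json
from fnmatch import fnmatchcase
from typing import Optional

# Hypothetical configuration: HTTP verbs and path patterns mapped to a scope.
CONFIG_JSON = """
{
  "routes": [
    {"method": "GET",  "path": "/ehr/*/composition/*",
     "required_scope": "patient/composition-*.r"},
    {"method": "POST", "path": "/ehr/*/composition",
     "required_scope": "patient/composition-*.c"}
  ]
}
"""


def required_scope(config: dict, method: str, path: str) -> Optional[str]:
    """Return the scope needed for a request, or None for an unknown route."""
    for route in config["routes"]:
        # Note: fnmatch's "*" also matches "/", so patterns are deliberately loose.
        if route["method"] == method and fnmatchcase(path, route["path"]):
            return route["required_scope"]
    return None  # unknown route: the caller should deny


config = json.loads(CONFIG_JSON)
```

Returning `None` for unmatched routes lets the caller treat anything not listed in the configuration as denied by default.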

> We're also looking at it from the Nuts point of view, so we can advise on an nginx/OPA/OAS as PEP/PDP/PIP combination. One of the current assumptions is that a lot of the yes/no questions come down to a simple existence question on a tuple (triple, quadruple, quintuple, etc.):

@woutslakhorst I don't understand any of the PEP/PDP/PIP abbreviations (forgive my ignorance), but we use OPA as an External Authorization API for Envoy using the OPA Envoy Plugin. I'm pretty sure Nginx will have a version of this.

However, I want to point out that this alone was not sufficient for our use case, because only authorization based on HTTP request attributes was possible, and we were adding the user information to the HTTP headers as a JWT. If a user made a request to the FHIR endpoint /Patient, for example, we have to first check which patients this user has access to and then filter the results down to only those. Initially, we were looking at something like https://www.ory.sh/keto/ to be used alongside OPA, but then we decided to go with something much simpler and hackier using OPA, Redis and a HAPI FHIR Interceptor afterwards.
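The filtering step described here can be sketched in Python as follows. This is only an illustration of the idea, not the actual OPA/Redis/HAPI interceptor setup. The `ACCESS` dictionary stands in for whatever store records which patients a user may see, and the user and patient ids are invented.

```python
from typing import Dict, List, Set

# Stand-in for the lookup store (e.g. a cache or database): user id -> patient ids.
ACCESS: Dict[str, Set[str]] = {
    "dr-jansen": {"1", "5"},
}


def search_patients(user_id: str,
                    all_patients: List[dict],
                    access: Dict[str, Set[str]] = ACCESS) -> List[dict]:
    """Return only the patients from a search result that the user may see."""
    allowed = access.get(user_id, set())  # unknown users see nothing
    return [p for p in all_patients if p["id"] in allowed]
```

A request-level allow/deny decision cannot express this, which is why the filtering has to happen after the backend has produced its search result.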

joostholslag commented 4 months ago

Some functional requirements inspired by ACP use case recorded here: https://discourse.openehr.org/t/federation-of-persistent-episodic-compositions/5201/9?u=joostholslag

joostholslag commented 4 months ago

Some more requirements (Will translate and integrate with issue text later)

Access policy definitions must:

  1. be recorded unambiguously and precisely
  2. be separate from the application (code)
  3. be recorded transparently, so that the policies can be inspected at any time (by authorised persons)
  4. be enforced at all times (so if the policy server is down, the application does not work)
  5. preferably be machine readable
  6. preferably be machine executable
  7. be human readable
  8. preferably be recorded in a language that is openly specified
  9. or be defined in an open source product
  10. …a knowledgeable domain expert must be able to write the policies
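Requirement 4 in the list above amounts to "fail closed". A minimal Python sketch of that behaviour follows. The `ask_policy_server` callable is a hypothetical placeholder for however the application would query its policy server.

```python
from typing import Callable


def enforce(ask_policy_server: Callable[[dict], bool], request: dict) -> bool:
    """Return True only when the policy server explicitly answers True."""
    try:
        decision = ask_policy_server(request)
    except Exception:
        # Policy server down or erroring: deny rather than fall back to "allow".
        return False
    # Anything other than an explicit True (None, "yes", 1, ...) is a denial.
    return decision is True


def unreachable_server(request: dict) -> bool:
    """Simulates a policy server that cannot be reached."""
    raise ConnectionError("policy server is down")
```

With this wrapper, `enforce(unreachable_server, {})` evaluates to `False`, so an outage blocks access instead of silently opening it.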

joostholslag commented 2 months ago

> > @sidharthramesh would you be able to share some REGO policies and JSON files as examples? Is your current approach something we could standardise on?
>
> @joostholslag we're still debating whether or not to release this code publicly. However, if there are any other teams working on a standardised set of Rego policies on FHIR or openEHR we're more than happy to collaborate.

@sidharthramesh: @hkrutzer and I spent a few hours on a first attempt at defining an access policy for openEHR data in Rego, with the specifics in a JSON file as proposed above. We also defined a JSON Schema to validate that any JSON file with specific rules is compatible with the Rego policy. It is still very rough, but it would be interesting to compare the Rego policy and the JSON Schema with yours. Files here: https://github.com/joostholslag/openEHRxNuts/pull/3 Comments are very welcome in this issue or in that PR.