open-policy-agent / opa

Open Policy Agent (OPA) is an open source, general-purpose policy engine.
https://www.openpolicyagent.org
Apache License 2.0
9.37k stars · 1.3k forks

annotations: Allow applying schemas to rule/function outputs #5201

Open philipaconrad opened 1 year ago

philipaconrad commented 1 year ago

What is the underlying problem you're trying to solve?

Currently, the schemas/annotations feature in Rego can apply schemas to document values at the package, document, rule, and subpackages scopes.

However, the rule scope currently applies only to the input values of a rule; there is no way to enforce schemas over the outputs of a rule.

Describe the ideal solution

I propose extending the rule scope / annotation system a bit further, to allow defining schema constraints for both:

- the variables appearing in a rule head (e.g. the key and value variables of a partial object), and
- the output value of the rule itself.

This would be done by requiring fully-qualified names for each rule-head variable and/or rule name, and would look roughly like the following examples:

Partial Object definition:

package test

# METADATA
# schemas:
#   - test.basic_auth.customer_id: schema.basic_auth.customer_id
#   - test.basic_auth.user_name: schema.basic_auth.user_name
#   - test.basic_auth.password: schema.basic_auth.password
basic_auth[customer_id] := {"user_name": user_name, "password": password} if {
    # ...
}

Rule output value:

package test

# METADATA
# schemas:
#   - test.allow: schema["allow"]
allow {
    # ...
}

Describe a "Good Enough" solution

Supporting schemas on an entire rule's output might be sufficient for many use cases: it would check the final result of the rule at runtime, which subsumes applying schemas to the individual rule-head variables.
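A hypothetical annotation under this "good enough" approach might look like the following; the `test.basic_auth` target (the bare rule name, rather than a per-variable path) is illustrative of the proposal, not existing syntax:

```rego
package test

# METADATA
# schemas:
#   - test.basic_auth: schema.basic_auth  # hypothetical: one schema covering the rule's entire output
basic_auth[customer_id] := {"user_name": user_name, "password": password} if {
    # ...
}
```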

Alternative Approaches Considered

Several other approaches could be considered for implementing this functionality, but most of them have issues, as detailed below:

Additional Context

Possibly relevant JSON Schema issues:

Less relevant, but maybe related issues:

anderseknert commented 1 year ago

Defining a schema for a data.* attribute at the package level, or possibly even the sub-package level, works pretty well for this purpose today already:

# METADATA
# schemas:
#   - data.policy.decoded: schema.claims
package policy

decoded := json.unmarshal(`{"exp": 1, "nbf": 2}`)

allow {
    decoded.exz == 1
}

The above will fail with:

❯ opa eval -f pretty --schema schemas -d policy.rego data.policy.allow
1 error occurred: policy.rego:9: rego_type_error: undefined ref: data.policy.decoded.exz
    data.policy.decoded.exz
                        ^
                        have: "exz"
                        want (one of): ["exp" "nbf"]

So far, so good 🙂 I think the problem with the approach above, and one that we'll need to consider for defining schemas for outputs, is that of schema distribution. If the onus of importing schemas is still on the "end user" of a policy and not on the library author, it matters less whether the schema is defined on an "output" or an "input": the effort and outcome will be pretty much the same either way. Sure, I might not have to annotate my rules for type-checking data.foo.bar that I got from some library, but I'd still need to find the schemas for that library out-of-band, and put them in the folder designated for schemas in my project, if such a folder exists.

In order for schemas defined on outputs in library code to automatically enhance the type checking experience of consumer policy, we'd need better ways of making schemas discoverable, and ideally automatically loaded (possibly with some way of opting out).

The distribution/load mechanism available today is file system-based only. URL-based distribution, as suggested in https://github.com/open-policy-agent/opa/issues/4478, would be interesting with schema-defined "outputs" in mind: a library author could simply point out where to find their schemas, and consumers of that library could have them retrieved automatically.
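For reference, the `schema.claims` file from the example above isn't shown in the thread; a minimal sketch that would produce the `want (one of): ["exp" "nbf"]` error is a closed object schema along these lines (the exact contents are an assumption on my part):

```json
{
  "type": "object",
  "properties": {
    "exp": {"type": "number"},
    "nbf": {"type": "number"}
  },
  "additionalProperties": false
}
```

Setting `additionalProperties` to `false` is what lets the type checker reject the misspelled `decoded.exz` reference statically.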

charlieflowers commented 1 year ago

This would be helpful for us, so I'm eagerly watching this.

We're finding that, in an auth-z context, with developers who are newish to Rego, the risk of coding errors, compounded by the potential for surprises when undefined is encountered, is a significant concern. Obviously an incorrect auth-z decision is a serious problem.

Doubling down on JSON schema is one way to introduce and expand type checking. I'm curious: have other approaches to strong typing been considered? For example, has the possibility of a gradual typing system with first class Rego syntax been considered?

anderseknert commented 1 year ago

I'd say the type system of Rego is pretty strong already, even to the point that people have complained about it 😅 The problem JSON schema solves is really how to type data, whether provided as input or as data from external sources. Perhaps it would be possible to do this in-policy, but since the data is defined outside of policy, it kinda makes sense to me that the type definitions are too. But ideas are certainly welcome 😃

charlieflowers commented 1 year ago

The problem JSON schema solves is really how to type data, whether provided as input or as data from external sources.

Agree. That's the area my comment is focused on. It's the dynamic typing default around input and data that introduces risk of incorrect auth-z decisions.

Perhaps it would be possible to do this in-policy, but since the data is defined outside of policy, it kinda makes sense to me that the type definitions are too.

Here's an argument for doing it in-policy: The policy was written with certain preconditions in mind. So IMHO a contract-oriented mindset makes sense here. It would be very helpful if the policy could be defensive, by saying, "I was written with these assumptions about input and data. You have not adhered to those assumptions. Therefore I refuse to render a decision."

JSON schema plus decision-time schema validation almost gets there. But there's extra work to require that the JSON schema be opinionated enough (because for example a JSON schema could simply say input is an object, and nothing else, which wouldn't apply type checking to enough surface area).
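As a concrete illustration (these schemas are my own, not from the thread): the degenerate schema `{"type": "object"}` accepts any object and gives the type checker nothing to reject, whereas an opinionated schema closes off the surface area the policy actually touches:

```json
{
  "type": "object",
  "properties": {
    "user": {"type": "string"},
    "blocked": {"type": "boolean"}
  },
  "required": ["user"],
  "additionalProperties": false
}
```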

So that line of reasoning leads to the desire to have "strong typing" around all elements of input and data that the policy refers to.

The ultimate goal is soundness around allow decisions in the presence of surprises/errors. Right now, there are many ways for an allow decision to accidentally be true, due to typos in input, unexpected nulls in input (due to mistakes in marshaling and the like), typos in references to data, etc.
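To illustrate the failure mode described above, here is a minimal hypothetical policy where a typo silently changes the decision: the misspelled reference is simply undefined rather than an error, so the negation succeeds and `allow` becomes true for everyone.

```rego
package example

deny {
    input.user.blokced == true  # typo for "blocked": this expression is always undefined, so deny never evaluates to true
}

allow {
    not deny  # succeeds whenever deny is undefined, including due to the typo above
}
```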

anderseknert commented 1 year ago

JSON schema plus decision-time schema validation almost gets there. But there's extra work to require that the JSON schema be opinionated enough (because for example a JSON schema could simply say input is an object, and nothing else, which wouldn't apply type checking to enough surface area).

I'm not sure I understand how this would be any different with a gradual typing system expressed in-policy. Why would you not be able to use that to say "input is an object"?

There have been some discussions in the past about allowing the use of annotations to loosen or tighten the type checks for whatever code is annotated, so one thing we could do is allow "in-line" JSON schema definitions, which would probably be a bit closer to what you want.
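For illustration, an annotation with an in-line schema definition might look roughly like the following (a sketch; treat the exact syntax and semantics as assumptions rather than a committed design):

```rego
package example

# METADATA
# schemas:
#   - input: {"type": "object", "properties": {"user": {"type": "string"}}, "additionalProperties": false}
allow {
    input.user == "alice"
}
```

The schema value is embedded directly in the annotation instead of referencing an external schema file, which keeps the contract next to the policy that assumes it.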

charlieflowers commented 1 year ago

I'm not sure I understand how this would be any different using gradual typing system expressed in-policy. Why would you not be able to use that to say "input is an object"?

Agreed, both gradual typing and json schema share that characteristic.

Was really just curious about the history of thought on adding type "hints" to Rego.

The main features needed to approach the goal I mentioned (strong typing around input and data) are:

(This has probably strayed far enough from @philipaconrad 's thread that the conversation should move elsewhere ... apologies if I hijacked!)

stale[bot] commented 1 year ago

This issue has been automatically marked as inactive because it has not had any activity in the last 30 days.
