open-policy-agent / opa

Open Policy Agent (OPA) is an open source, general-purpose policy engine.
https://www.openpolicyagent.org
Apache License 2.0

Allow parameterized tests #2176

Open anderseknert opened 4 years ago

anderseknert commented 4 years ago

It would be very useful if the unit test framework allowed for some simple kind of "data driven" parameterization of tests, where one of the mocked inputs could be parameterized, to avoid needless verbosity and possible copy-paste errors.

Expected Behavior

test_method_not_allowed[input.method] {
    not allow with input.method as ["POST", "DELETE", "PUT", "PATCH"]
}

(obviously made up example and syntax)

Actual Behavior

test_post_not_allowed {
    not allow with input.method as "POST"
}

test_delete_not_allowed {
    not allow with input.method as "DELETE"
}

test_put_not_allowed {
    not allow with input.method as "PUT"
}

test_patch_not_allowed {
    not allow with input.method as "PATCH"
}
tsandall commented 4 years ago

You could do this with a helper rule or comprehension today. For example:

methods := {"POST", "DELETE", "PUT", "PATCH"}

test_method_not_allowed {
  disallowed_methods == methods
}

disallowed_methods[m] {
  some m
  methods[m]
  not allow with input.method as m
}
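
The comprehension variant mentioned above would look something like this (a sketch reusing the methods set and allow rule assumed above):

test_method_not_allowed_comprehension {
  # collect the methods for which allow is undefined/false, then compare
  # the resulting set against the full set of methods under test
  {m | methods[m]; not allow with input.method as m} == methods
}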

It would be nice to know which methods failed. If the test framework was a bit smarter it could report diffs on the variables inside the test rules. For example:

data.example.test_method_not_allowed: FAIL (1ms)
   {"DELETE", "PUT"} == {"POST", "DELETE", "PUT", "PATCH"}
   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

It's important to use a helper rule or comprehension though--this would not be correct:

# THIS IS NOT CORRECT. The test will pass if SOME `m` satisfies the query (not EVERY `m`)
test_method_not_allowed {
  some m
  methods[m]
  not allow with input.method as m
}

Once we have the every keyword (#2090) we could avoid the helper/comprehension.

anderseknert commented 4 years ago

Better visualization of diffs in assertions would indeed be great, and perhaps a feature request of its own :)

A notable difference between that and parameterization though: in your example you feed data into a single test and iterate over it within the test, whereas in "data driven" parameterized testing the data builds the tests. Given some way of annotating or marking a test as parameterized (in my made-up example it was [input.method] after the test_ rule, to not stray too far from established syntax), a parameterized test of 4 methods would really expand into 4 distinct tests, and be shown as such on success too. This has the added benefit of knowing what is tested without 1) having a test failure and 2) reading the actual Rego tests to see what is asserted within the test cases. So, to carry on the example from before, given:

test_method_not_allowed[input.method] {
    not allow with input.method as ["POST", "DELETE", "PUT", "PATCH"]
}

Running opa test -v . would give something like:

data.http.test_method_not_allowed["POST"]: PASS (686.3µs)
data.http.test_method_not_allowed["DELETE"]: PASS (467.1µs)
data.http.test_method_not_allowed["PUT"]: PASS (322.8µs)
data.http.test_method_not_allowed["PATCH"]: PASS (219.3µs)
--------------------------------------------------------------------------------
PASS: 4/4
anderseknert commented 4 years ago

Another case, albeit quite a different one, I've been looking at recently is piping Kubernetes resource lists into conftest and the like (kubectl get deployments -o yaml | conftest test -), with a corresponding policy like the one below:

deny[reason] {
    input.kind == "List"
    item := input.items[_]
    not item.metadata.labels.app
    reason := sprintf("%v does not have an app label", [item.metadata.name])
}
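
For illustration, a minimal sketch of a unit test for that rule (the resource names and labels here are made up), mocking the piped-in List document as input:

test_deny_missing_app_label {
    # one resource with the required app label, one without it
    resources := {"kind": "List", "items": [
        {"metadata": {"name": "has-label", "labels": {"app": "web"}}},
        {"metadata": {"name": "no-label"}}
    ]}
    # only the unlabeled resource should produce a deny reason
    deny["no-label does not have an app label"] with input as resources
}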

This works in that it reports all failures, but since they all come from a single rule there is no notion of tests that passed:

305 tests, 0 passed, 0 warnings, 305 failures, 0 exceptions

One alternative is of course to call conftest/OPA individually per object, but then we'd miss out on the total count, plus of course the overhead of starting those applications hundreds of times.

If we could parameterize part of the input to be part of the reported rule name, each test could be counted. I should note though that this use case is still more in the "thinking out loud" department than the previous one.

anderseknert commented 2 years ago

Circling back to this, as I recently had a need for something like this again. By the way, the conftest issue mentioned above has since been solved, and what I wanted there is now possible.

I think that a simple extension to the test framework, allowing partial rules to be used, would be the most elegant, Rego-native way of dealing with this:

test_method_not_allowed[method] {
    some method in ["POST", "DELETE", "PUT", "PATCH"]
    not allow with input.method as method
}

All the test runner would need to do is keep track of the method reference and include it in the report on failures:

❯ opa test .
data.test.test_method_not_allowed["DELETE"]: FAIL (58.125µs)
--------------------------------------------------------------------------------
PASS: 3/4
FAIL: 1/4

As this is already how OPA deals with partial rules, it seems like it would be quite a simple addition 🤔
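
For reference, a sketch of the partial-rule behavior being alluded to (the package name and allow rule are assumptions for illustration; OPA versions before v1.0 need the future-keywords import):

package policy

import future.keywords.in

allow {
    input.method == "GET"
}

# Named without the test_ prefix so the test runner ignores it. Queried
# directly, data.policy.method_not_allowed evaluates to the set of methods
# for which the body succeeded: here {"POST", "DELETE", "PUT", "PATCH"}.
method_not_allowed[method] {
    some method in ["POST", "DELETE", "PUT", "PATCH"]
    not allow with input.method as method
}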

c-thiel commented 2 years ago

In case anybody else runs across this issue: this is now possible using the every keyword: https://www.openpolicyagent.org/docs/latest/policy-language/#every-keyword

anderseknert commented 2 years ago

While every is great, it does not help with this issue. Iterating over a list of values was already possible before every, and that alone won't address the rest of the details described here.

tsandall commented 2 years ago

I think every could help a bit... for example:

test_method_not_supported {                     # test_method_not_supported is true if...
  every method in {"PUT", "PATCH", "DELETE"} {  #   for every method in ...
    not allow with input.method as method       #     not allow with ... is true
  }
}
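
As a self-contained version of that sketch (the package name and allow rule are assumptions; OPA versions before v1.0 also need the future-keywords import), runnable with opa test:

package example

import future.keywords.every

allow {
    input.method == "GET"
}

# passes, since allow is undefined for all three methods below
test_method_not_supported {
    every method in {"PUT", "PATCH", "DELETE"} {
        not allow with input.method as method
    }
}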

If the condition attached to the every statement was false for ANY method, the test would fail--which would be identical to how it works in any other rule.

As far as providing better assertion failure reporting goes... I wonder how far we could get by plugging all vars on the failed expression and then eliding elements in large collections... this approach would be nice regardless of whether every is used... we should do a survey of real-world unit tests and see how far we'd get with substituting vars...

data.test.test_method_not_allowed: FAIL
  x.rego:5: not allow with input.method as "PATCH"
  x.rego:5: not allow with input.method as "DELETE"
anderseknert commented 2 years ago

If the condition attached to the every statement was false for ANY method, the test would fail--which would be identical to how it works in any other rule.

Yeah - we shouldn't exit early in that case though, as we'd want to know whether more than one assertion failed. But maybe we don't do that in tests.

The assertion failure reporting would be an improvement for sure!

c-thiel commented 2 years ago

@anderseknert you are right, that part is still open.

I just think that https://github.com/open-policy-agent/opa/issues/2176#issuecomment-1072848196 is already a huge improvement over https://github.com/open-policy-agent/opa/issues/2176#issuecomment-599539791.

anderseknert commented 2 years ago

Right - it's a cosmetic improvement for sure, but I'd argue that the part that is "still open" is the actual parameterization.

In some ways, every even makes this somewhat "worse": the early-exit optimization ensures that evaluation halts at the first failed condition, and there's no way to retrieve all the conditions that failed. If you wanted something like that, you'd currently need to use a comprehension and compare the result to the expected outcome.
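
Such a comprehension-based check might look like the sketch below (names assumed; some ... in needs the future-keywords import before OPA v1.0): collect every method that is wrongly allowed, so a failure diff can show all of them rather than only the first.

test_method_not_allowed {
    # set of methods for which allow unexpectedly succeeds
    wrongly_allowed := {m |
        some m in ["POST", "DELETE", "PUT", "PATCH"]
        allow with input.method as m
    }
    wrongly_allowed == set()
}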

EDIT - I just had another look at the original suggestion, and that too would be subject to early exit, so no difference in that vs. using every.

Using partial rules (as suggested before) for this would avoid early exit, and feels like an "OPA native" way of doing it. The problem with that approach is that we currently lack a way of differentiating successes from failures in a single "partial test": we'd need a way to keep track of what's being iterated over, and to count both failing and succeeding conditions.

With better assertion reporting, and early exit disabled in tests, we'd get somewhere close to that using every as well. It might be confusing to have one early-exit behavior for tests and another for other policy, but oh well, it's at least an alternative to consider :)

marcaurele commented 10 months ago

I'll second what @anderseknert says about parameterized tests: the early exit, and the output of the parameter used for each run, are indeed important to get this feature done. IMHO the every keyword is only a workaround for now.

marcaurele commented 10 months ago

I described our current issue on https://github.com/orgs/open-policy-agent/discussions/514: trying to find a way to test all the rule possibilities to ensure business coverage of them. This parameterization feature is exactly what I was looking for.

marcaurele commented 10 months ago

@tsandall / @c-thiel there is a major problem with using the every keyword in a single test. I have a test case where every iterates over a list of ~300 entries, which easily pushes the test over the 5s execution timeout. This means I have to customize the timeout value, and keep increasing it as we add entries to that list, which makes it hard to keep the test suite healthy without extra maintenance. (A rough sketch of the pattern follows the output below.)

SUMMARY
--------------------------------------------------------------------------------
/test/mab_test.rego:
data.authorization_test.test.test_mab_permitted: ERROR (5.001104217s)
  eval_cancel_error: caller cancelled query execution
data.authorization_test.test.test_mab_not_permitted: PASS (78.127613ms)
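
The shape of such a test is roughly the following (a hypothetical reconstruction: the data layout and the permitted rule are assumptions, not the actual code from this report):

package authorization_test.test

import future.keywords.every

# Hypothetical: data.authorization_test.cases holds the ~300 entries.
# All iterations run inside one test rule, so they share a single timeout.
test_mab_permitted {
    every case in data.authorization_test.cases {
        permitted with input as case  # permitted is an assumed rule under test
    }
}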
anderseknert commented 10 months ago

That seems like a bug @marcaurele 😅 could you provide a snippet showing what that code looks like? (no need to provide all 300 cases, just one or two, and the logic where you run every)

marcaurele commented 10 months ago

@anderseknert I opened a new bug in order to not mix it up with this feature request: https://github.com/open-policy-agent/opa/issues/6400
