Understanding how/why a decision was made using metadata in decision logs

gshively11 commented 4 years ago

This is related to the following issue (I'm happy to close this issue as a duplicate if you would prefer that):

https://github.com/open-policy-agent/opa/issues/2089

We have the following needs:

Understand how/why a decision was made using decision logs, for auditing and debugging purposes.
Keep our policies as simple as possible.
Mask sensitive input properties in our decision logs.

Consider the following policy:

jwt = t { 
  [_, payload, _] = io.jwt.decode(sso_jwt)
  t := {
    # Custom built-in function to verify jwt signature
    "valid": custom_verify_jwt(key_host, sso_jwt), 
    "payload": payload
  }
}

employee = e {
  jwt.valid
  jwt_has_expected_prop_1
  jwt_has_expected_prop_2
  e := jwt.payload
}
employee = e {
  jwt.valid
  jwt_has_expected_prop_2
  jwt_has_expected_prop_3
  e := jwt.payload.sub_prop
}
two_factors {
  count(employee.factors) >= 2
}
jwt_is_fresh {
  (round(time.now_ns() / 1e9) - jwt.payload.iat) < max_ttl
}

can_do_thing {
  employee
  two_factors
  jwt_is_fresh
}

Our apps will hit /v1/data/can_do_thing and receive a true/false authorization response. In our decision logs, we want to be able to understand WHY a decision was made. Which rules inside can_do_thing passed/failed, and in turn, which child rules under those passed/failed. We can't re-run the policy at a later date from the decision log data, because it's masked and time-based.

I don't have any immediate thoughts about what this could/should look like in the decision logs, but I wanted to at least start a conversation about this feature. Is it feasible/desirable in OPA core?

anderseknert commented 4 years ago

While it's not fully what you're asking for I think the common approach to this is to include a "reason" (or "message", etc) attribute in the response, which of course will be included in the decision log as well. So you'd do something like:

reasons[reason] {
    not two_factors
    reason := "MFA authentication required"
}

and then include the reason(s) as part of your response/decision. If you don't want the client to know why they were denied I guess you could do something like provide an error code or some other opaque value known to you, but in most setups I think the reason provides an extra value to the client as it can also use it as part of its enforcement.

gshively11 commented 4 years ago

Thanks for the suggestion @anderseknert. Following from my example above, this is the pattern I've landed on for the moment:

# does input.sso_jwt contain a syntactically valid jwt
default jwt_exists = false
jwt_exists {
  count(jwt_exists_a) > 0
}
jwt_exists_a[r] {
  regex.match(`^[A-Za-z0-9-_=]+\.[A-Za-z0-9-_=]+\.?[A-Za-z0-9-_.+/=]*$`, sso_jwt)
  r := "input.sso_jwt contains a syntactically valid jwt"
}
jwt_exists_d[r] {
  not jwt_exists
  r := "input.sso_jwt did not contain a syntactically valid jwt"
}

# Decode the jwt contained in input.sso_jwt
default jwt = null
jwt = t { 
  jwt_exists
  [_, payload, _] = io.jwt.decode(sso_jwt)
  t := {
    "payload": payload
  }
}

# check if the jwt is signed
default jwt_is_signed = false
jwt_is_signed {
  count(jwt_is_signed_a) > 0
}
jwt_is_signed_a[r] {
  jwt_exists
  custom_builtin_verify_jwt(api_host, sso_jwt)
  r := {"input.sso_jwt is signed"} | jwt_exists_a
}
jwt_is_signed_d[r] {
  not jwt_is_signed
  r := {"input.sso_jwt is not signed"} | jwt_exists_d
}

# get employee information in the jwt payload
default employee = null
employee = e {
  count(employee_a) > 0
  e := {
    "username": jwt.payload.username
  }
}
employee_a[r] {
  jwt_is_signed
  jwt_has_expected_prop_2
  jwt_has_expected_prop_3
  r := {"jwt represents an employee"} | jwt_is_signed_a | jwt_has_expected_prop_2_a | jwt_has_expected_prop_3_a
}
employee_d[r] {
  is_null(employee)
  r := {"jwt does not represent an employee"} | jwt_is_signed_d | jwt_has_expected_prop_2_d | jwt_has_expected_prop_3_d
}

# check if the jwt has not exceeded its ttl
default jwt_is_fresh = false
jwt_is_fresh {
  count(jwt_is_fresh_a) > 0
}
jwt_is_fresh_a[r] {
  (round(time.now_ns() / 1e9) - jwt.payload.iat) < max_ttl
  r := "jwt is fresh"
}
jwt_is_fresh_d[r] {
  not jwt_is_fresh
  r := "jwt is not fresh"
}

# can we do a thing
default can_do_thing = false
can_do_thing {
  count(can_do_thing_a) > 0
}
can_do_thing_a[r] {
  not is_null(employee)
  jwt_is_fresh
  r := employee_a | jwt_is_fresh_a
}
can_do_thing_d[r] {
  not can_do_thing
  r := employee_d | jwt_is_fresh_d
}

It's super verbose and makes it a lot harder to understand the rego...but it allows me to pass all the nested reasons why can_do_thing passes or fails. It would be ideal if I could write the same thing above like this...

default jwt_exists = false
jwt_exists {
  regex.match(`^[A-Za-z0-9-_=]+\.[A-Za-z0-9-_=]+\.?[A-Za-z0-9-_.+/=]*$`, sso_jwt) # input.sso_jwt is syntactically valid
}

# Decode the jwt contained in input.sso_jwt
default jwt = null
jwt = t { 
  jwt_exists
  [_, payload, _] = io.jwt.decode(sso_jwt)
  t := {
    "payload": payload
  }
}

default jwt_is_signed = false
jwt_is_signed {
  jwt_exists
  custom_builtin_verify_jwt(api_host, sso_jwt) # jwt is signed
}

default employee = null
employee = e {
  jwt_is_signed
  jwt_has_expected_prop_2
  jwt_has_expected_prop_3
  e := {
    "username": jwt.payload.username
  }
}

default jwt_is_fresh = false
jwt_is_fresh {
  (round(time.now_ns() / 1e9) - jwt.payload.iat) < max_ttl  # jwt is less than 12 hours old
}

default can_do_thing = false
can_do_thing {
  not is_null(employee) 
  jwt_is_fresh
}

...and then have something read in the AST and then output a new AST that accomplishes the output of the more verbose one.

anderseknert commented 4 years ago

@gshively11 I think you're spot on that JWT validation with provided reasons is super verbose. In fact, we do something very similar to the policy you posted above in a library shared between our services. While not super elegant the "end user" (i.e the application developer) only needs to do something like:

package rules

import data.common.jwt

deny[reason] {
  reason := jwt.deny[_]
}

..and then add their own deny rules based on app specific logic. So while not terribly elegant in the library we try our best to encapsulate that complexity. There's just so many reasons JWT validation might fail so if you want to cover all cases and report back with a reason it pretty much will be verbose.

stale[bot] commented 3 years ago

This issue has been automatically marked as inactive because it has not had any activity in the last 30 days.

open-policy-agent / opa

Understanding how/why a decision was made using metadata in decision logs #2755