StyraInc / regal

Regal is a linter and language server for Rego, bringing your policy development experience to the next level!
https://docs.styra.com/regal
Apache License 2.0
252 stars 34 forks source link

regal.ast.name replaced by regal.ast.ref_to_string #1072

Closed drewcorlin1 closed 2 weeks ago

drewcorlin1 commented 2 weeks ago

I recently upgraded from regal 0.22.0 to 0.26.1 and my custom rule that was using regal.ast.name started throwing the following error

regal lint services/ policies/
error(s) encountered while linting: failed to lint using Rego rules: failed preparing query for linting: 1 error occurred: /<my-dir>/.regal/rules/custom/my_custom_rule.rego:20: rego_type_error: undefined function data.regal.ast.name

I had to replace this with regal.ast.ref_to_string(rule.head.ref)

Saw this was changed in this commit but not called out in any release notes that I saw.

Just creating this issue in the hopes that it saves someone else some time.

Before

import data.regal.ast

report contains violation if {
    some rule in input.rules
        ast.name(rule) in {"my-name"}
}

After

import data.regal.ast

report contains violation if {
    some rule in input.rules
        ast.ref_to_string(rule.head.ref) in {"my-name"}
}
anderseknert commented 2 weeks ago

Hi there!

And thanks for filling this. I was just thinking about the many number of changes we’ve done to the Rego library recently, including even the AST JSON format(!), and wondering what the impact might be. While I do think we need that flexibility still for a while, we could certainly list any changes to the Rego lib in the release notes. We’ll do our best to do that from now on!

drewcorlin1 commented 2 weeks ago

No problem and wanting the flexibility makes total sense. I also had a moment of "why was I using an undocumented function" but the docs do mention to read the code so calling out breaking changes in releaes notes could be useful.

Thanks!

anderseknert commented 1 week ago

No, you're absolutely right! The Rego library is pretty well-documented in code, but not so much when it comes to how / if one should use it. I've updated the docs on custom rules today to better reflect current Regal: https://github.com/StyraInc/regal/pull/1079

Also mentioned what to expect about the Rego library right now. The ambition is definitely to get that stable sometime soon, and to provide proper API docs for custom rule authors. In the meantime, I'd love to learn more about what type of custom rules you have, and if/how we can support you with that.

drewcorlin1 commented 1 week ago

Thanks! For a bit of context on how I'm using it. We use OPA as an external authorizer for Istio. Our typical authz rules look something like this

# Allow users to get their own accounts
allow {
    api_version == "1"
    method == "GET"
    path = ["accounts", account_id] # account_id taken from JWT
    groups["members"]
}

The api_version part of the rule was something we added and easy to miss so we added a custom rule that ensures everyone is including a check for that in their rule.

I could see us adding another couple rules to ensure that we always check the following as a baseline

My custom regal rule looks like the following

# METADATA
# description: Use api_version in rules
# schemas:
# - input: schema.regal.ast
package custom.regal.rules.custom["use-api-version"]

import rego.v1

import data.regal.ast
import data.regal.result

# METADATA
# description: Allow access to a URI if the service rules permit
# entrypoint: true
report contains violation if {
    ast.package_name == "service.authz"

    some rule in input.rules

    ast.ref_to_string(rule.head.ref) == "allow"

    not has_api_version_term(rule.body)

    violation := result.fail(rego.metadata.chain(), object.union(
        result.location(rule),
        {"description": "allow rules should be written using data.http_request.api_version"},
    ))
}

has_api_version_term(statements) if has_api_version_strict_equal_term(statements)

has_api_version_term(statements) if has_api_version_in_set_term(statements)

valid_api_version(version) if {
    regex.match(`^[\d]+$`, version)
}

api_version_imported if {
    # import is a reserved keyword
    some import_statement in input.imports
    import_statement.path.type == "ref"

    count(import_statement.path.value) == 3

    import_statement.path.value[0].type == "var"
    import_statement.path.value[0].value == "data"

    import_statement.path.value[1].type == "string"
    import_statement.path.value[1].value == "http_request"

    import_statement.path.value[2].type == "string"
    import_statement.path.value[2].value == "api_version"
}

valid_api_version_reference(term) if {
    term.type == "var"
    term.value == "api_version"

    # regal ignore:external-reference
    api_version_imported
}

valid_api_version_reference(term) if {
    term.type == "ref"
    count(term.value) == 3
    term.value[0].type == "var"
    term.value[0].value == "data"

    term.value[1].type == "string"
    term.value[1].value == "http_request"

    term.value[2].type == "string"
    term.value[2].value == "api_version"
}

# Tests for api_version == "1" (or any number)
has_api_version_strict_equal_term(statements) if {
    some statement in statements
    count(statement.terms) == 3

    statement.terms[0].type == "ref"
    count(statement.terms[0].value) == 1
    statement.terms[0].value[0].type == "var"
    statement.terms[0].value[0].value == "equal"

    valid_api_version_reference(statement.terms[1])

    statement.terms[2].type == "string"

    valid_api_version(statement.terms[2].value)
}

# Tests for api_version in {"1", "2"} (or any set of numbers)
has_api_version_in_set_term(statements) if {
    some statement in statements
    count(statement.terms) == 3

    statement.terms[0].type == "ref"
    count(statement.terms[0].value) == 2

    statement.terms[0].value[0].type == "var"
    statement.terms[0].value[0].value == "internal"

    statement.terms[0].value[1].type == "string"

    # this value seems to just be named member_2?
    statement.terms[0].value[1].value == "member_2"

    valid_api_version_reference(statement.terms[1])

    statement.terms[2].type == "set"
    count(statement.terms[2].value) > 1
    not any_api_version_invalid(statement.terms[2].value)
}

any_api_version_invalid(api_version_values) if {
    some api_version_value in api_version_values

    not valid_api_version(api_version_value.value)
}

All to say, this rule is in the interest of providing guardrails to most developers who are authorizing new endpoints so that they don't miss something that they should be checking 99% of the time.

This rule was very much me hacking things together so I'd be curious to hear if there are simpler ways to get done what I am going for, and also if things like assuming a node in the AST is always called "member_2" is a totally insane thing to do

anderseknert commented 1 week ago

Awesome! And that's a perfect use case for Regal 👍

Code looks good too. Some suggestions, if you're interested :)

Although I guess I shouldn't recommend our unstable API 😆 I don't see these functions as likely to change though...

api_version_imported if {
    # import is a reserved keyword
    some import_statement in input.imports
    import_statement.path.type == "ref"

    count(import_statement.path.value) == 3

    import_statement.path.value[0].type == "var"
    import_statement.path.value[0].value == "data"

    import_statement.path.value[1].type == "string"
    import_statement.path.value[1].value == "http_request"

    import_statement.path.value[2].type == "string"
    import_statement.path.value[2].value == "api_version"
}

Could be simplified to:

api_version if ast.imports_has_path(input.imports, ["data", "http_request", "api_version"])
valid_api_version_reference(term) if {
    term.type == "ref"
    count(term.value) == 3
    term.value[0].type == "var"
    term.value[0].value == "data"

    term.value[1].type == "string"
    term.value[1].value == "http_request"

    term.value[2].type == "string"
    term.value[2].value == "api_version"
}

Could use the beforementioned ast.ref_to_string here:

valid_api_version_reference(term) if {
    term.type == "ref"
    ast.ref_to_string(term.value) == "data.http_request.api_version"
}

This double negation..

not any_api_version_invalid(statement.terms[2].value)

..could probably be replaced by an every:

every item in statement.terms[2].value {
    valid_api_version(item.value)
}

About these...

# regal ignore:external-reference

Yeah, we have plenty of those ourselves in the Regal codebase. Next version we'll make this rule a little less annoying :) https://github.com/StyraInc/regal/issues/1002#issuecomment-2332642765

If you're not already there, join the #regal channel in our community Slack! Great to have some place other than issues to talk.

drewcorlin1 commented 1 week ago

thank for for the feedback! very happy to replace what I have with a simpler + more concise alternative

anderseknert commented 1 week ago

👍 I saw now that I forgot to answer this

This rule was very much me hacking things together so I'd be curious to hear if there are simpler ways to get done what I am going for, and also if things like assuming a node in the AST is always called "member_2" is a totally insane thing to do

That is a totally sane thing to do :) In the AST, there are no "operators" as they all get rewritten to built-in function calls. So "foo" in ["bar"] gets rewritten to internal.member2("foo", ["bar"]), 1 + 2 to plus(1, 2), and so on ... see the builtins file in the OPA repo for the full list (search for "infix" to see what all operators translate to).