FallenAngel97 / tree-sitter-rego

https://decodeapps.pp.ua/blog/post/rego-treesitter
MIT License
11 stars 6 forks source link

Add complete Rego grammar #2

Closed shaded-enmity closed 1 year ago

shaded-enmity commented 2 years ago

Hi @FallenAngel97 I did a clean implementation of the grammar based on https://www.openpolicyagent.org/docs/latest/policy-reference/#grammar and it should be pretty much complete. I still need to add tests and update queries/scms with the new information. However, before I do that I need to talk with OPA/Rego maintainers to verify that some of the changes that I made to the grammar are valid from their point of view (I left my notes in the code), so I'm opening this draft PR first so that I can reference it against the PR that I'm going to open for OPA with the updated grammar..

shaded-enmity commented 2 years ago

Opened open-policy-agent/opa#4955 for reference.

FallenAngel97 commented 2 years ago

Hi @shaded-enmity . Please, configure .editorconfig, because I see there is an indentation issue.

shaded-enmity commented 2 years ago

Hi @shaded-enmity . Please, configure .editorconfig, because I see there is an indentation issue.

Thanks for taking a look! Whitespace issues should be fixed, I'm trying out a new editor (Helix) and still kinda figuring things out. I've also added support for the in operator and parenthesized expressions. I'm now going to figure out how to deploy tree-sitter playground to GH pages.

shaded-enmity commented 2 years ago

Up and running:

https://shaded-enmity.github.io/tree-sitter-rego/

FallenAngel97 commented 2 years ago

Hi @shaded-enmity Thank you for you work. I have checked and it seems, that your repo is correctly setting up the necessary grammar. Can you rebase on the latest master and try to format it with .editorconfig? I see that there is a two spaces and 4 spaces difference. Also - you haven't provided highlights.scm, will you make it as a separate PR?

shaded-enmity commented 2 years ago

Hi @shaded-enmity Thank you for you work. I have checked and it seems, that your repo is correctly setting up the necessary grammar. Can you rebase on the latest master and try to format it with .editorconfig? I see that there is a two spaces and 4 spaces difference.

Will do, there's still some work left to be done in the Rego upstream, but I hope to have that ironed out soon.

Also - you haven't provided highlights.scm, will you make it as a separate PR?

I'd prefer to provide SCMs in this PR - once the grammar is stable I should be able to have those in a couple of hours.

shaded-enmity commented 2 years ago

@FallenAngel97 With the changes to Rego grammar now merged 🎉 and this PR being up to date and rebased, I'm going to start working on the SCMs highlights and queries. I still may push an update to the grammar to use fields in certain cases to make building the queries easier but that's mostly cosmetic I think.

shaded-enmity commented 2 years ago

Updated with latest bits of the grammar + highlights. The work is almost done, but I still need to resolve an ambiguity in parsing rule_head_comp and rule_body. I'm going to open discussion in the tree-sitter repository and link this PR there.

Discussion link: https://github.com/tree-sitter/tree-sitter/discussions/1877

krzykli commented 2 years ago

Hey folks, how is it going? :) Tried out the master recently and unfortunately it is not ready for use yet, but this PR seems to fix many things so excited for it to be merged 👍

@FallenAngel97 @shaded-enmity would any of you mind adding some dev loop notes to README.md? Some docs to ease people into the project would be much appreciated. I'd like to help out when I find a bit of time.

shaded-enmity commented 2 years ago

Finally got around to work on this a little bit more and I think it's good to go.

@krzykli @FallenAngel97 Can you please try the latest version and let me know of any issues you may find?

krzykli commented 1 year ago

@shaded-enmity can you give me some steps to test it? I tried in neovim with this config

local parser_config = require "nvim-treesitter.parsers".get_parser_configs()
parser_config.rego = {
  install_info = {
    url = "https://github.com/shaded-enmity/tree-sitter-rego", -- local path or git repo
    files = {"src/parser.c"},
    -- optional entries:
    branch = "complete-ebnf-grammar", -- default branch in case of git repo if different from master
    generate_requires_npm = false, -- if stand-alone parser without npm dependencies
    requires_generate_from_grammar = false, -- if folder contains pre-generated src/parser.c
  },
}

with that even a simple

default hello := false

hello if input.message == "world"

results in an error in TSPlayground

operator_check [0, 0] - [0, 22]
  identifier [0, 0] - [0, 7]
  ERROR [0, 8] - [0, 13]
    identifier [0, 8] - [0, 13]
  operator [0, 14] - [0, 16]
  false [0, 17] - [0, 22]
operator_check [2, 0] - [2, 8]
  identifier [2, 0] - [2, 5]
  operator [2, 5] - [2, 5]
  identifier [2, 6] - [2, 8]
operator_check [2, 9] - [2, 33]
  identifier [2, 9] - [2, 22]
  operator [2, 23] - [2, 25]
  string_definition [2, 26] - [2, 33]

I must be doing something wrong. Would you mind pointing me in the right direction? :)

shaded-enmity commented 1 year ago

@krzykli I'm not familiar with neovim but the TSPlayground log looks like it's using the original grammar, not this branch. Are you sure the grammar is built & installed correctly?

krzykli commented 1 year ago

@shaded-enmity I pointed it to my local build instead. Getting this result now

(source_file [0, 0] - [4, 0]
  (module [0, 0] - [3, 33]
    (package [0, 0] - [0, 7])
    (ref [0, 8] - [0, 17]
      (var [0, 8] - [0, 17]))
    (policy [1, 0] - [3, 33]
      (rule [1, 0] - [1, 22]
        (default [1, 0] - [1, 7])
        (rule_head [1, 8] - [1, 13]
          (var [1, 8] - [1, 13]))
        (rule_body [1, 14] - [1, 22]
          (assignment [1, 14] - [1, 16])
          (term [1, 17] - [1, 22]
            (scalar [1, 17] - [1, 22]
              (boolean [1, 17] - [1, 22])))))
      (rule [3, 0] - [3, 33]
        (rule_head [3, 0] - [3, 8]
          (var [3, 0] - [3, 5])
          (if [3, 6] - [3, 8]))
        (rule_body [3, 9] - [3, 33]
          (literal [3, 9] - [3, 33]
            (expr [3, 9] - [3, 33]
              (expr_infix [3, 9] - [3, 33]
                (expr [3, 9] - [3, 22]
                  (term [3, 9] - [3, 22]
                    (ref [3, 9] - [3, 22]
                      (var [3, 9] - [3, 14])
                      (ref_arg [3, 14] - [3, 22]
                        (ref_arg_dot [3, 14] - [3, 22]
                          (var [3, 15] - [3, 22]))))))
                (infix_operator [3, 23] - [3, 25]
                  (bool_operator [3, 23] - [3, 25]))
                (expr [3, 26] - [3, 33]
                  (term [3, 26] - [3, 33]
                    (scalar [3, 26] - [3, 33]
                      (string [3, 26] - [3, 33]))))))))))))

for

package something
default hello := false

hello if input.message == "world" 

which looks much better. It does still cause problems in neovim, so I'll be looking into that.

krzykli commented 1 year ago

@shaded-enmity @FallenAngel97 , some quick questions

Thanks :)

FallenAngel97 commented 1 year ago

@shaded-enmity I see the following in case of two rules nearby - it saying missing in case two rules go nearby

first_rule {
  allow
} {
  1 == 1
}
image

But I think, that your progress is amazing. I will try to fix this case by myself - then it should be fine

shaded-enmity commented 1 year ago

@FallenAngel97 Good point, I just noticed that rule is defined as [ "default" ] rule-head { rule-body } meaning that we need to use repeat1($.rule_body) in rule's definition to cover the use case you mentioned and there's a similar use case in the official documentation for else:

authorize := "allow" if {
    input.user == "superuser"           # allow 'superuser' to perform any operation.
} else := "deny" if {
    input.path[0] == "admin"            # disallow 'admin' operations...
    input.source_network == "external"  # from external networks.
} # ... more rules

where authorize := should be (I think) the rule_head with 2 rule_body, would be interesting to look at how the AST of Rego itself looks like in these cases.

FallenAngel97 commented 1 year ago

@shaded-enmity I added that to the master branch a bit earlier, so the basics should be covered. Also, there is constantly deployed job for the live check of TreeSitter parser here: https://fallenangel97.github.io/tree-sitter-rego/

I would really like to add the possibility to create a small environment for each PR, which can be created on this branch, so it can improve feedback