StyraInc / regal

Regal is a linter and language server for Rego, bringing your policy development experience to the next level!
https://docs.styra.com/regal
Apache License 2.0
264 stars 37 forks source link

Redesign handling of aggregate rules in language server #1133

Open anderseknert opened 2 months ago

anderseknert commented 2 months ago

Working on #1131 I hit some severe performance issues around aggregate rule violations, frequently leading to the language server to either hang, or heavily tax the CPU. The reason for this is that aggregate violations currently result in the entire workspace to be linted. Any change later in that package will again lead to the entire workspace to be linted, and so on... instead, I think we should do the following in the context of LSP (and only in that context):

  1. Separate aggregation from linting. I think we do this already internally, but calling linter.Lint() will obviously do both.
  2. On server startup, collect aggregate data from all aggregate rules, and store in the server cache.
  3. On file changes, collect aggregate data only from that file, and replace existing data for that file.
  4. On file changes, request linting of both normal rules and aggregate rules, using the aggregated data as input for the latter.

CC @charlieegan3 for ideas.

charlieegan3 commented 2 months ago

Yeah this sounds like a good plan to me. As I understand it, the aggregates are already set on the report, just never shown. So when we complete linting, we can stash these in the cache pretty easily.

I guess the core part that needs to be adjusted is

    if len(input.FileNames) > 1 {
        aggregateReport, err := l.lintWithRegoAggregateRules(ctx, regoReport.Aggregates, regoReport.IgnoreDirectives)
        if err != nil {
            return report.Report{}, fmt.Errorf("failed to lint using Rego aggregate rules: %w", err)
        }

        finalReport.Violations = append(finalReport.Violations, aggregateReport.Violations...)
    }

Where we might want to have a new field in the input containing all the cached aggregates?