Open anderseknert opened 2 hours ago
Thinking about it more, I guess we could even provide a way for the aggregate_report
rules to be provided the data in the same format as they're currently served. This would be less efficient, but could be an option to avoid breaking existing policies. The question would be how to best opt in/out of that.. 🤔 or if it's worth maintaining..
Regal evaluates aggregate rules in a 2-step process:
aggregate_report
rules are queried with this data as input.Each file evaluated by the aggregate rule will result in a data structure like this returned:
Each input file will then append a similar object to the array of any given rule. This data is verbose, and while it's easily readable, that's not a huge win here as this isn't meant to be consumed by humans in the first place. Running
regal lint
on its own bundle results in almost one megabyte of this data being sent back and forth, and we don't even have many aggregate rules! As recent work has shown, lots of data equals lots of allocations. Sometimes that's unavoidable, but this is one place where we could be optimizing more.To make this more efficient, we should look into:
rule
attribute from the payload. The outer map is already keyed by category/rule and repeating it in the payload is just a waste of resources.aggregate_data
list for its file rather than the top level one.As aggregate rules are also available to custom rule authors, this would be a breaking change, and one we'll need to document well how to go from the old format to the new, both on the aggregating side (where this would likely not require a change as long as they are using
result.aggregate
and we update that) and on the reporting side (where policies would need to be updated to handled the new format).While we are at it, let's make sure we also handle empty aggregate results correctly and efficiently for both built-in and custom rules.