DataDog / datadog-static-analyzer

Datadog Static Analyzer
https://docs.datadoghq.com/static_analysis/
Apache License 2.0
100 stars 12 forks source link

[STAL-1489] refactor: handle internal rule conversion errors more pragmatically #498

Closed amaanq closed 1 month ago

amaanq commented 1 month ago

What problem are you trying to solve?

Currently, our logic for parsing rules into the internal rule representation is a bit brittle when it comes to processing errors. The reason for this is we convert all errors into anyhow errors; using anyhow means we cannot easily pattern-match or check what kind of error was actually returned. The function Rule::to_rule_internal can return 5 different types of errors:

Although some of these errors can be "grouped" together w.r.t. surfacing a public-facing error, it would not make sense to say, show "error-decoding-base64" when the tree-sitter query failed to build.

What is your solution?

I've refactored the code pertaining to converting a rule to its internal representation by creating an enum that derives thiserror::Error, and has 5 members that each correspond to one of the potential errors mentioned above.

By doing this, we can handle errors more gracefully when we call Rule::to_rule_internal in process_analysis_request, because we can pattern match on the error and return more appropriate user-facing errors depending on the kind of error we encountered. In this specific case, I've grouped the "invalid base 64" and "invalid utf8" errors together, since that indicates a problem with the base64 string, and I've grouped the "invalid rule type", "missing tree-sitter query", and "invalid tree-sitter query" together, since they pertain to parsing and actually building the rule. A new constant has been added to represent this kind of error - ERROR_PARSING_RULE which has the string "error-parsing-rule", and is what will be shown to users.

Alternatives considered

What the reviewer should know

This PR touches a bit on the code in the tree_sitter mod because it has to return the tree-sitter query error itself rather than an anyhow error, so that it can be mapped to the RuleInternalError at the call-site in to_rule_internal.