Support code-version-system reporting through SARIF and/or rdjson output formats

github-charles-murphy commented 1 month ago

To take advantage of a CVS's ability to highlight issues in its UI (e.g. GitHub review comments/annotations), it would be very useful to produce output in a format that standard CVS reporters (e.g. Reviewdog) can ingest.

Two generic output formats for static analysis results are SARIF and rdjson. SARIF is a good choice for reaching as wide a range of CVS reporters as possible, since it is not tied to any particular one. However, rdjson (specific to Reviewdog) is also a very good additional format to support, as Reviewdog can produce very detailed reviews/annotations for the main CVSes from it.

caelean commented 3 weeks ago

Hey @github-charles-murphy! Can you share a little more about how you're using tach right now, and how you'd like to use it with Reviewdog? This is my first time hearing about SARIF.

github-charles-murphy commented 3 weeks ago

I'm running tach from a GitHub workflow, which is (for the moment) a triggered workflow. I prefer this to pre-commit because my philosophy is that the PR (to be squash-merged) is the unit of change, not an individual 'transitory' commit.

Now, when I run a tool within a workflow that flags source files and lines within them for critique, I like to see the results in the PR review or the check annotations tab. This can be handled by CVS reporters like Reviewdog, so long as the output format of the tool matches one of their input formats.

I take it that you already know about Reviewdog and its rdjson input format since you only mentioned not knowing about SARIF.

Static Analysis Results Interchange Format (SARIF) is another standard for representing a critique of source code files and lines within them. Reviewdog accepts SARIF as one of its input formats. SARIF is also produced as an output format by OWASP tools, for instance.
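
To make that concrete, here is a minimal sketch of what a SARIF 2.1.0 log for a single finding might look like when built from Python. The `Violation` class, the `to_sarif` function and the `interface-violation` rule id are hypothetical names for illustration only; they are not part of tach.

```python
import json
from dataclasses import dataclass


@dataclass
class Violation:
    """Hypothetical stand-in for one finding; not tach's real data model."""
    path: str      # file path relative to the repository root
    line: int      # 1-based line number of the offending import
    message: str   # human-readable diagnostic text


def to_sarif(violations: list[Violation], tool_name: str = "tach") -> dict:
    """Build a minimal SARIF 2.1.0 log with one run and one result per violation."""
    return {
        "version": "2.1.0",
        "runs": [
            {
                "tool": {"driver": {"name": tool_name}},
                "results": [
                    {
                        "ruleId": "interface-violation",  # hypothetical rule id
                        "level": "error",
                        "message": {"text": v.message},
                        "locations": [
                            {
                                "physicalLocation": {
                                    "artifactLocation": {"uri": v.path},
                                    "region": {"startLine": v.line},
                                }
                            }
                        ],
                    }
                    for v in violations
                ],
            }
        ],
    }


if __name__ == "__main__":
    example = [Violation("pkg/a.py", 3, "Import of 'pkg.b.internal' is not in the public interface")]
    print(json.dumps(to_sarif(example), indent=2))
```

A reporter only needs the file URI, the start line and the message text to place an annotation, so the sketch leaves out optional parts of the format such as rule metadata.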

In the GitHub Actions Marketplace, there are actions that automatically handle SARIF and perform code reviews/annotations (probably using Reviewdog or similar under the hood).

In terms of work for the tach team, the only requirement is to provide a --format option that selects tach's output format. This is the common pattern seen with other static analysis tools.
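
As a rough sketch of that pattern (assuming an argparse-style CLI, which may not match how tach parses its arguments), the option could look like the following. The format names and the `tach-check-sketch` program name are assumptions for illustration.

```python
import argparse

# Hypothetical format names; the real option could accept different values.
SUPPORTED_FORMATS = ("text", "sarif", "rdjson")


def build_parser() -> argparse.ArgumentParser:
    """Create a parser exposing a --format option restricted to known formats."""
    parser = argparse.ArgumentParser(prog="tach-check-sketch")
    parser.add_argument(
        "--format",
        choices=SUPPORTED_FORMATS,
        default="text",  # keep the existing human-readable output as the default
        help="output format for reported violations",
    )
    return parser


def selected_format(argv: list[str]) -> str:
    """Return the output format chosen on the command line."""
    return build_parser().parse_args(argv).format


if __name__ == "__main__":
    print(selected_format(["--format", "rdjson"]))
```

With something like this in place, a workflow step could pipe the rdjson output into `reviewdog -f=rdjson -reporter=github-pr-review`. Defaulting to the plain text format keeps current behaviour unchanged for anyone who does not pass the option.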

Each interface violation found by tach relates to a specific (import) line of code, which can be the target line in the SARIF/rdjson output. The reasoning behind the violation (e.g. a dunder 'all' line in another location, due to the use of strict in the tach file) can simply be included in the diagnostic text of the SARIF/rdjson output.
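
A minimal sketch of that mapping for rdjson, assuming a hypothetical `ImportViolation` record with a separate `reason` field (these names are mine, not tach's): the import line becomes the diagnostic location and the reasoning is appended to the message.

```python
import json
from dataclasses import dataclass


@dataclass
class ImportViolation:
    """Hypothetical record for one violation; not tach's real data model."""
    path: str     # file containing the offending import
    line: int     # 1-based line number of the import statement
    summary: str  # what is wrong with the import
    reason: str   # why it is flagged, e.g. which __all__ or config entry applies


def to_rdjson(violations: list[ImportViolation], source_name: str = "tach") -> dict:
    """Build a Reviewdog Diagnostic Format (rdjson) document from violations."""
    return {
        "source": {"name": source_name},
        "severity": "ERROR",
        "diagnostics": [
            {
                # The reasoning is folded into the diagnostic text itself.
                "message": f"{v.summary} ({v.reason})",
                "location": {
                    "path": v.path,
                    "range": {"start": {"line": v.line}},
                },
            }
            for v in violations
        ],
    }


if __name__ == "__main__":
    example = [
        ImportViolation(
            path="pkg/a.py",
            line=3,
            summary="'pkg.b.internal' is not part of the public interface of 'pkg.b'",
            reason="'pkg.b' is marked strict and 'internal' is not listed in __all__",
        )
    ]
    print(json.dumps(to_rdjson(example), indent=2))
```

Only the import line is used as the location, so the annotation appears on the line the PR author can actually change; the other location is mentioned in the text rather than annotated separately.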

gauge-sh / tach

Support code-version-system reporting through SARIF and/or rdjson output formats #247