microsoft / sarif-tools

A set of Python command line tools for working with SARIF files produced by code analysis tools
MIT License

Implemented general filtering (replaces blame filtering) #28

Closed abyss638 closed 9 months ago

abyss638 commented 9 months ago

BREAKING CHANGE

Fixes #13

Implemented general filtering for any field in a result object. This replaces blame filtering, which filtered only by author email in the blame details.

The filter file format has changed from plain text to YAML.

Here is an example of a filter file (from updated README.md):

# Lines beginning with # are interpreted as comments and ignored.
# Optional description for the filter.  If no description is specified, the filter file name is used.
description: Example filter from README.md

# Items in `include` list are interpreted as inclusion filtering rules. 
# Items are combined with the OR operator: the filtered results include objects matching any rule.
# Each item can be one rule or a list of rules; in the latter case the rules in the list are combined with the AND operator, so all of them must match.
include:
  # The following line includes issues whose author-mail field contains "@microsoft.com" AND which are found in Java files.
  # Values with special characters `\:;_()$%^@,` must be enclosed in quotes (single or double):
  - author-mail: "@microsoft.com"
    locations[*].physicalLocation.artifactLocation.uri: "*.java"
  # Instead of a substring, a regular expression can be used, enclosed in "/" characters.  Issues whose committer-mail field includes a string matching the regular expression are included.  Use ^ and $ to match the whole committer-mail field.
  - committer-mail: "/^<myname.*\\.com>$/"

# Lines under `exclude` are interpreted as exclusion filtering rules.
exclude:
  # The following line excludes issues whose location is in test Java files with names starting with the "Test" prefix.
  - location: "Test*.java"
  # The value for the field can be empty; in this case only the existence of the field is checked.
  - suppression:

Field names must be specified as JSONPath expressions. Substring and regular-expression matching are supported as before.
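
The matching rules described in the example filter (OR across list items, AND within one item, substring or "/regex/" values, an empty value meaning "field exists") can be sketched roughly as below. This is an illustration only, not the code from this PR: the function names are hypothetical, field values are assumed to be already resolved into a flat dict, wildcard handling is left out, and an empty `include` list is assumed to include everything.

```python
import re


def value_matches(pattern, value):
    """Return True if `value` satisfies one filter pattern.

    Assumed semantics, taken from the comments in the example filter:
    an empty pattern only checks that the field exists, "/.../" is a
    regular expression searched within the value, anything else is a
    plain substring.
    """
    if value is None:
        return False  # field is absent from the result
    if pattern is None or pattern == "":
        return True  # existence check only
    if len(pattern) > 2 and pattern.startswith("/") and pattern.endswith("/"):
        return re.search(pattern[1:-1], value) is not None
    return pattern in value


def rule_matches(rule, fields):
    """AND: every field/pattern pair in one list item must match."""
    return all(value_matches(p, fields.get(name)) for name, p in rule.items())


def keep_result(fields, include, exclude):
    """OR across items: keep if any include item matches and no exclude item does."""
    included = not include or any(rule_matches(r, fields) for r in include)
    excluded = any(rule_matches(r, fields) for r in exclude)
    return included and not excluded


# Rules as they might look after the YAML file has been parsed.
# "uri" is a hypothetical flat field name used only in this sketch.
include = [
    {"author-mail": "@microsoft.com", "uri": ".java"},
    {"committer-mail": "/^<myname.*\\.com>$/"},
]
exclude = [{"suppression": None}]
```

With these rules, a result needs both the author-mail and the URI match to pass the first item, or the committer-mail regular expression to pass the second, and is dropped whenever a suppression field is present.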

The following shortcuts are supported (from updated README.md):

| Shortcut | Full JSONPath |
| --- | --- |
| author | properties.blame.author |
| author-mail | properties.blame.author-mail |
| committer | properties.blame.committer |
| committer-mail | properties.blame.committer-mail |
| location | locations[*].physicalLocation.artifactLocation.uri |
| rule | ruleId |
| suppression | suppressions[*].kind |
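
As a rough illustration of how the shortcuts relate to full paths, the sketch below expands a shortcut and resolves the resulting path against a result object. The mapping is copied from the list above; `expand_field_name` and `resolve` are hypothetical helpers, and the resolver handles only dotted keys and `[*]`, which is far less than a real JSONPath implementation.

```python
# Shortcut names and full paths as listed in the PR description.
SHORTCUTS = {
    "author": "properties.blame.author",
    "author-mail": "properties.blame.author-mail",
    "committer": "properties.blame.committer",
    "committer-mail": "properties.blame.committer-mail",
    "location": "locations[*].physicalLocation.artifactLocation.uri",
    "rule": "ruleId",
    "suppression": "suppressions[*].kind",
}


def expand_field_name(name):
    """Replace a shortcut with its full path; leave other names unchanged."""
    return SHORTCUTS.get(name, name)


def resolve(path, obj):
    """Return every value reached by `path`; only `.key` and `[*]` are handled."""
    current = [obj]
    for part in path.split("."):
        expand = part.endswith("[*]")
        key = part[:-3] if expand else part
        next_values = []
        for item in current:
            if isinstance(item, dict) and key in item:
                child = item[key]
                if expand:
                    if isinstance(child, list):
                        next_values.extend(child)  # fan out over array items
                else:
                    next_values.append(child)
        current = next_values
    return current


# A made-up result object with just enough fields for the example.
result = {
    "ruleId": "CA2101",
    "locations": [
        {"physicalLocation": {"artifactLocation": {"uri": "src/Test1.java"}}},
        {"physicalLocation": {"artifactLocation": {"uri": "src/Main.java"}}},
    ],
    "properties": {"blame": {"author-mail": "<a@microsoft.com>"}},
}

uris = resolve(expand_field_name("location"), result)
```

A missing field, such as `suppression` in this made-up result, resolves to an empty list, which is what an existence check would test.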

For location, which represents a file location, wildcards are supported, as in the "Test*.java" rule in the example above.
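
One way to picture the wildcard behaviour is with Python's standard `fnmatch` module. The sketch below is an assumption about the semantics, not the PR's implementation: `location_matches` is a hypothetical helper, and it assumes the pattern is compared against the file name (the last component of the URI), which fits the "names starting with the Test prefix" wording in the example filter.

```python
import fnmatch
import posixpath


def location_matches(pattern, uri):
    """Match a shell-style wildcard pattern against the file name in `uri`.

    Assumption: only the last path component is compared, case-sensitively.
    """
    file_name = posixpath.basename(uri)
    return fnmatch.fnmatchcase(file_name, pattern)


is_test_file = location_matches("Test*.java", "src/test/TestParser.java")
```

Under this assumption "Test*.java" matches `src/test/TestParser.java` but not `src/main/Parser.java`, while "*.java" matches both.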

Added Pytest unit tests for the filtering code.

FilterStats is extracted to a standalone file and simplified (a few counters that are no longer needed were removed).

GeneralFilter is based on the extracted BlameFilter.

balgillo commented 9 months ago

Looks great, thanks for this!

A few comments to discuss/resolve

balgillo commented 9 months ago

I think we should do something to help ease the transition for users, e.g. write a page about migrating from blame-filter to filter, check the --blame-filter argument and point people to the page. But that might be better as a separate PR to keep this one small.
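
A minimal sketch of the "check the --blame-filter argument and point people to the page" idea, using `argparse`. Everything here is hypothetical: the `--filter` option name, the parser layout, and the hint text are assumptions for illustration, and the migration page itself does not exist yet, so no link is given.

```python
import argparse

# Hypothetical wording; the migration page would be linked here once written.
MIGRATION_HINT = (
    "--blame-filter has been replaced by --filter, which takes a YAML filter "
    "file. See the migration page for how to convert an existing blame filter."
)


def build_parser():
    parser = argparse.ArgumentParser(prog="sarif")
    parser.add_argument("--filter", metavar="FILE", help="YAML filter file")
    # Still accepted so it can be detected, but hidden from --help.
    parser.add_argument("--blame-filter", metavar="FILE", help=argparse.SUPPRESS)
    return parser


def parse_args(argv):
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.blame_filter is not None:
        # parser.error prints the usage and the hint, then exits with status 2.
        parser.error(MIGRATION_HINT)
    return args
```

Keeping the old option parseable lets the tool fail with a pointed message instead of a generic "unrecognized arguments" error.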