yardstick should know about quality gates

anchore / yardstick

Compare vulnerability scanners results (to make them better!)

Apache License 2.0

What would you like to be added:

Right now, yardstick can capture the difference between different tool outputs, but it has no notion of a label provider, and tools that compare yardstick results to known labels implement two mechanisms on their own:

1. A mechanism to fetch some labels to use in comparison. (grype uses git submodules for this mechanism.)
2. A mechanism to run a quality gate, for example so that too great a deviation from known labels fails a job in CI. Grype's gate.py is an example.

The core request is that a yardstick.yaml file should be able to configure two additional things: where the labels come from, and which comparison methods and thresholds distinguish success from failure.

One way we've discussed this is: a label source and a comparator should be sort of like "tools" in yardstick's config model.
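To make the idea concrete, here is a minimal sketch of what such a config model could look like once the YAML has been loaded into a plain mapping. Everything in it is an assumption for illustration: the class names (`LabelSource`, `Comparator`, `QualityGateConfig`), the keys (`label-source`, `comparators`), the `f1-regression` method name, and the example URL are hypothetical and are not part of yardstick today.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class LabelSource:
    # Hypothetical: where labels come from, e.g. a git repository or a local directory.
    kind: str
    location: str


@dataclass(frozen=True)
class Comparator:
    # Hypothetical: a named comparison method plus its pass/fail thresholds.
    method: str
    thresholds: dict[str, float] = field(default_factory=dict)


@dataclass(frozen=True)
class QualityGateConfig:
    label_source: LabelSource
    comparators: list[Comparator]


def parse_quality_gate(raw: dict[str, Any]) -> QualityGateConfig:
    """Build a config from the mapping a YAML loader would return."""
    source = LabelSource(**raw["label-source"])
    comparators = [Comparator(**c) for c in raw.get("comparators", [])]
    return QualityGateConfig(label_source=source, comparators=comparators)


# Stand-in for the parsed contents of a hypothetical section in yardstick.yaml.
example = {
    "label-source": {"kind": "git", "location": "https://example.com/labels.git"},
    "comparators": [
        {"method": "f1-regression", "thresholds": {"max_f1_drop": 0.0}},
    ],
}
config = parse_quality_gate(example)
```

Modeling the label source and comparators as small typed records mirrors how a "tool" entry is just a named, configurable thing, and an unknown key fails loudly with a `TypeError` at parse time instead of being silently ignored.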

Why is this needed:

Right now, we're maintaining a fair amount of Python code in different repos that fetches labels and runs different comparisons. These scripts all fetch labels and run comparisons in similar, but not identical, ways. We should promote this duplicated code into a new configurable section in yardstick.

Additional context:

Currently there are quality gates in grype and vunnel:

grype: https://github.com/anchore/grype/blob/dec563669d683ab4d11e95a28635099673363d80/test/quality/gate.p
vunnel: https://github.com/anchore/vunnel/blob/877469c41e3180ee6bdb03d24dd6debfd43a8a7a/tests/quality/gate.py#L20
https://github.com/anchore/grype-db/blob/main/manager/src/grype_db_manager/db/validation.py functions somewhat like a quality gate

Recently, both gates had a failure mode where they caused unexpected false positives. That is, the gate.py was exiting zero but should not have been. This seems like evidence that the quality gate needs to be better tested, but setting up testing for once-off python scripts in different repos isn't sustainable. The ask here is that the functionality implemented by gate.py in these two repos (and possibly others I missed) should be made into a CLI command in yardstick.
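As a rough sketch of why centralizing this helps, the pass/fail decision can be written as a pure function that is separate from the process exit code, which makes the "exits zero when it should not" case directly unit-testable. The names here (`Scores`, `gate_failures`, `exit_code`) and the specific checks (F1 drop, new false negatives) are assumptions for illustration, not a description of what either gate.py does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scores:
    # Counts derived from comparing one tool's results against known labels.
    true_positives: int
    false_positives: int
    false_negatives: int

    @property
    def f1(self) -> float:
        denom = 2 * self.true_positives + self.false_positives + self.false_negatives
        return 2 * self.true_positives / denom if denom else 0.0


def gate_failures(reference: Scores, candidate: Scores, max_f1_drop: float = 0.0) -> list[str]:
    """Return human-readable reasons the gate should fail (empty list means pass)."""
    reasons = []
    drop = reference.f1 - candidate.f1
    if drop > max_f1_drop:
        reasons.append(f"F1 dropped by {drop:.3f} (allowed: {max_f1_drop:.3f})")
    if candidate.false_negatives > reference.false_negatives:
        reasons.append("candidate introduced new false negatives")
    return reasons


def exit_code(reasons: list[str]) -> int:
    # Non-zero whenever there is at least one failure reason.
    return 1 if reasons else 0


reference = Scores(true_positives=90, false_positives=5, false_negatives=5)
regressed = Scores(true_positives=80, false_positives=5, false_negatives=15)

for reason in gate_failures(reference, regressed):
    print(reason)
```

A CLI command would then only need to call `sys.exit(exit_code(...))`, and the decision logic could be covered by ordinary tests in one place rather than in each repo.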

Right now, the comparisons used are different, but this difficulty seems surmountable for two reasons: first, some of the differences are accidental; second, we could employ a strategy pattern or plugin model for differences that can't be refactored out.
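One possible shape for the strategy approach is a small registry that maps a comparator name from the config to a function. This is only a sketch: the registry, the `register` decorator, and the comparator names `max-drop` and `absolute-minimum` are hypothetical and do not exist in yardstick.

```python
from typing import Callable

# A comparator takes (reference metrics, candidate metrics, thresholds)
# and returns a list of failure reasons; an empty list means it passed.
Metrics = dict[str, float]
ComparatorFn = Callable[[Metrics, Metrics, Metrics], list[str]]

_REGISTRY: dict[str, ComparatorFn] = {}


def register(name: str) -> Callable[[ComparatorFn], ComparatorFn]:
    """Decorator that makes a comparator selectable by name."""
    def decorator(fn: ComparatorFn) -> ComparatorFn:
        _REGISTRY[name] = fn
        return fn
    return decorator


@register("max-drop")
def max_drop(reference: Metrics, candidate: Metrics, thresholds: Metrics) -> list[str]:
    # Fail when a metric falls further below the reference than allowed.
    reasons = []
    for metric, allowed in thresholds.items():
        drop = reference[metric] - candidate[metric]
        if drop > allowed:
            reasons.append(f"{metric} dropped by {drop:.3f} (allowed: {allowed:.3f})")
    return reasons


@register("absolute-minimum")
def absolute_minimum(reference: Metrics, candidate: Metrics, thresholds: Metrics) -> list[str]:
    # Fail when a metric is below a fixed floor, regardless of the reference.
    return [
        f"{metric} is {candidate[metric]:.3f} (minimum: {floor:.3f})"
        for metric, floor in thresholds.items()
        if candidate[metric] < floor
    ]


def run_comparator(name: str, reference: Metrics, candidate: Metrics, thresholds: Metrics) -> list[str]:
    try:
        fn = _REGISTRY[name]
    except KeyError:
        raise ValueError(f"unknown comparator {name!r}; known: {sorted(_REGISTRY)}") from None
    return fn(reference, candidate, thresholds)
```

With something like this, a repo whose comparison genuinely differs would register its own comparator, while the label fetching, threshold handling, and exit-code behavior stay shared.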

yardstick should know about quality gates #126