anchore / vunnel

Tool for collecting vulnerability data from various sources (used to build the grype database)
Apache License 2.0

Improve error handling of deterministic minor errors #544

Open willmurphyscode opened 4 months ago

willmurphyscode commented 4 months ago

Background:

Every now and then, a request for an individual ALAS advisory returns a 403 (e.g. as of this writing, https://alas.aws.amazon.com/AL2/ALAS-2024-2510.html returns 403).

Right now, this causes the entire vunnel run -p amazon operation to exit non-zero, which might not be the behavior we want. Concretely, the first exception raised during the provider run halts execution. HTTP GETs are retried 5 times, but this 403 is deterministic, so the retries don't help.
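
To illustrate why the retries don't help here, below is a minimal sketch (not vunnel's actual HTTP code) of a retry loop that stops early when the status is one that retrying is unlikely to change. The names get_with_retries, FetchError, and DETERMINISTIC_STATUSES are hypothetical, and treating 401/403/404/410 as deterministic is an assumption made for the example.

```python
from typing import Callable, Tuple

# Assumption: statuses for which a retry is not expected to change the outcome.
DETERMINISTIC_STATUSES = {401, 403, 404, 410}


class FetchError(Exception):
    """Raised when a URL could not be retrieved (hypothetical helper type)."""

    def __init__(self, url: str, status: int, attempts: int):
        super().__init__(f"GET {url} returned {status} after {attempts} attempt(s)")
        self.url = url
        self.status = status
        self.attempts = attempts


def get_with_retries(
    url: str,
    fetch: Callable[[str], Tuple[int, str]],
    max_attempts: int = 5,
) -> str:
    """Call fetch(url) up to max_attempts times.

    fetch returns (status_code, body). A deterministic status ends the loop
    immediately instead of spending the remaining attempts.
    """
    status = 0
    attempts = 0
    for _ in range(max_attempts):
        attempts += 1
        status, body = fetch(url)
        if status == 200:
            return body
        if status in DETERMINISTIC_STATUSES:
            break  # retrying would only repeat the same failure
    raise FetchError(url, status, attempts)
```

With a fetch function that always returns 403, this raises FetchError after a single attempt, whereas a persistent 503 would use all five attempts before raising.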

What would you like to be added:

We should be able to configure some continue-on-error semantics for vunnel; right now it's too all-or-nothing. For example, I should be able to write down, "provider X claims that there's a vulnerability we should download from example.com/some-cve, which is unreachable. Ignore this specific error." Or maybe "if you have fewer than 5 records that couldn't be retrieved, still consider the run successful."
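
One possible shape for these semantics, as a rough sketch only: a per-provider policy with an explicit ignore list and a budget of tolerated failures. ErrorPolicy, RunResult, run_provider, and the field names are all hypothetical and are not part of vunnel's configuration or API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Set


@dataclass
class ErrorPolicy:
    """Hypothetical continue-on-error settings for one provider."""

    # URLs whose failures are explicitly ignored.
    ignored_urls: Set[str] = field(default_factory=set)
    # How many other records may fail before the run is considered failed.
    # The default of 0 keeps the all-or-nothing behavior.
    max_failed_records: int = 0


@dataclass
class RunResult:
    records: Dict[str, str] = field(default_factory=dict)
    ignored: List[str] = field(default_factory=list)
    failed: List[str] = field(default_factory=list)


def run_provider(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    policy: ErrorPolicy,
) -> RunResult:
    """Fetch every URL, tolerating failures only as far as the policy allows."""
    result = RunResult()
    for url in urls:
        try:
            result.records[url] = fetch(url)
        except Exception:
            if url in policy.ignored_urls:
                result.ignored.append(url)
                continue
            result.failed.append(url)
            if len(result.failed) > policy.max_failed_records:
                raise  # budget exhausted: fail the run as happens today
    return result
```

Under this sketch, "ignore this specific error" would be an entry in ignored_urls, and "fewer than 5 records that couldn't be retrieved" would be max_failed_records=4. Recording the ignored and failed URLs in the result keeps skipped records visible rather than silently dropped.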

This would allow us to better balance the competing priorities of "use yesterday's data instead of bad data," and "old data is bad."

Additional context:

Example failure: https://github.com/anchore/grype-db/actions/runs/8730962418/job/23970839142#step:6:1440