anchore / vunnel

Tool for collecting vulnerability data from various sources (used to build the grype database)
Apache License 2.0

Allow for missing ALAS files in Amazon provider #564

Closed by wagoodman 4 months ago

wagoodman commented 4 months ago

There appear to be irregularities in the Amazon vulnerability data: the RSS feed that lists ALAS records is not always in sync with the HTML files that hold the advisory details. When this happens, the first request for a missing ALAS record returns an HTTP 403 response code and the provider exits with an error. This has been happening intermittently for weeks, and is most likely when there is no pre-existing state for a given record (as with the vunnel quality gate; it is less likely with nightly grype-db builds).

This PR changes the behavior to tolerate a limited number of HTTP 403 responses across all requests for ALAS HTML files, and prints a summary of the failures at the end of processing:

[WARNING] failed to fetch 9 ALAS entries due to HTTP 403 response code
[WARNING]  - https://alas.aws.amazon.com/AL2022/ALAS-2023-282.html
[WARNING]  - https://alas.aws.amazon.com/AL2022/ALAS-2023-283.html
[WARNING]  - https://alas.aws.amazon.com/AL2022/ALAS-2023-284.html
[WARNING]  - https://alas.aws.amazon.com/AL2022/ALAS-2023-285.html
[WARNING]  - https://alas.aws.amazon.com/AL2022/ALAS-2023-286.html
[WARNING]  - https://alas.aws.amazon.com/AL2022/ALAS-2023-287.html
[WARNING]  - https://alas.aws.amazon.com/AL2022/ALAS-2023-288.html
[WARNING]  - https://alas.aws.amazon.com/AL2022/ALAS-2023-289.html
[WARNING]  - https://alas.aws.amazon.com/AL2022/ALAS-2023-290.html
[INFO ] wrote 2955 entries
[INFO ] recording workspace state
[DEBUG] wrote workspace state to ./data/amazon/metadata.json
# (exit code 0)

If the number of 403 responses exceeds the allowed amount, an exception is raised immediately (the run below uses a limit of 7):

[DEBUG] loading existing ALAS from ./data/amazon/input/2022_html/ALAS-2023-280
[DEBUG] loading existing ALAS from ./data/amazon/input/2022_html/ALAS-2023-281
[DEBUG] http GET https://alas.aws.amazon.com/AL2022/ALAS-2023-282.html
[WARNING] 403 Forbidden: https://alas.aws.amazon.com/AL2022/ALAS-2023-282.html
[WARNING] skipping ALAS-2023-282
[DEBUG] http GET https://alas.aws.amazon.com/AL2022/ALAS-2023-283.html
[WARNING] 403 Forbidden: https://alas.aws.amazon.com/AL2022/ALAS-2023-283.html
[WARNING] skipping ALAS-2023-283
[DEBUG] http GET https://alas.aws.amazon.com/AL2022/ALAS-2023-284.html
[WARNING] 403 Forbidden: https://alas.aws.amazon.com/AL2022/ALAS-2023-284.html
[WARNING] skipping ALAS-2023-284
[DEBUG] http GET https://alas.aws.amazon.com/AL2022/ALAS-2023-285.html
[WARNING] 403 Forbidden: https://alas.aws.amazon.com/AL2022/ALAS-2023-285.html
[WARNING] skipping ALAS-2023-285
[DEBUG] http GET https://alas.aws.amazon.com/AL2022/ALAS-2023-286.html
[WARNING] 403 Forbidden: https://alas.aws.amazon.com/AL2022/ALAS-2023-286.html
[WARNING] skipping ALAS-2023-286
[DEBUG] http GET https://alas.aws.amazon.com/AL2022/ALAS-2023-287.html
[WARNING] 403 Forbidden: https://alas.aws.amazon.com/AL2022/ALAS-2023-287.html
[WARNING] skipping ALAS-2023-287
[DEBUG] http GET https://alas.aws.amazon.com/AL2022/ALAS-2023-288.html
[WARNING] 403 Forbidden: https://alas.aws.amazon.com/AL2022/ALAS-2023-288.html
[WARNING] skipping ALAS-2023-288
[DEBUG] http GET https://alas.aws.amazon.com/AL2022/ALAS-2023-289.html
[WARNING] 403 Forbidden: https://alas.aws.amazon.com/AL2022/ALAS-2023-289.html
[ERROR] error downloading data from https://alas.aws.amazon.com/AL2022/ALAS-2023-289.html
Traceback (most recent call last):
  File "/Users/wagoodman/code/vunnel/src/vunnel/providers/amazon/parser.py", line 118, in _get_alas_html
    raise ValueError(
ValueError: exceeded maximum allowed 403 responses (7) from ALAS requests
[WARNING] failed to fetch 8 ALAS entries due to HTTP 403 response code
[WARNING]  - https://alas.aws.amazon.com/AL2022/ALAS-2023-282.html
[WARNING]  - https://alas.aws.amazon.com/AL2022/ALAS-2023-283.html
[WARNING]  - https://alas.aws.amazon.com/AL2022/ALAS-2023-284.html
[WARNING]  - https://alas.aws.amazon.com/AL2022/ALAS-2023-285.html
[WARNING]  - https://alas.aws.amazon.com/AL2022/ALAS-2023-286.html
[WARNING]  - https://alas.aws.amazon.com/AL2022/ALAS-2023-287.html
[WARNING]  - https://alas.aws.amazon.com/AL2022/ALAS-2023-288.html
[WARNING]  - https://alas.aws.amazon.com/AL2022/ALAS-2023-289.html
[INFO ] wrote 2329 entries
[ERROR] error during update: exceeded maximum allowed 403 responses (7) from ALAS requests
# (exit code 1)

The new configuration option, max_allowed_alas_http_403, applies to the amazon provider only and defaults to 25 (an arbitrary value).
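
The tolerance logic described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: ForbiddenTracker, fetch_all, and the get_status callable are hypothetical names standing in for the provider's real HTTP handling. Only the option name, its default of 25, and the log and error message wording come from the description above.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ForbiddenTracker:
    """Hypothetical helper: counts HTTP 403 responses and remembers their URLs."""

    max_allowed: int = 25  # same default as max_allowed_alas_http_403
    urls: list[str] = field(default_factory=list)

    def record(self, url: str) -> None:
        # Record first, so the failing URL also appears in the final summary.
        self.urls.append(url)
        if len(self.urls) > self.max_allowed:
            raise ValueError(
                f"exceeded maximum allowed 403 responses ({self.max_allowed}) from ALAS requests"
            )

    def summary(self) -> list[str]:
        # Lines to log as warnings at the end of processing.
        if not self.urls:
            return []
        header = f"failed to fetch {len(self.urls)} ALAS entries due to HTTP 403 response code"
        return [header] + [f" - {url}" for url in self.urls]


def fetch_all(
    urls: list[str],
    get_status: Callable[[str], int],  # assumption: stand-in for a real HTTP GET
    tracker: ForbiddenTracker,
) -> list[str]:
    """Return the URLs that were fetched, skipping tolerated 403 responses."""
    fetched = []
    for url in urls:
        if get_status(url) == 403:
            tracker.record(url)  # raises once the limit is exceeded
            continue
        fetched.append(url)
    return fetched
```

With max_allowed=7, the eighth 403 raises, and the summary still lists all eight URLs, which matches the second log above. A caller would typically emit tracker.summary() in a finally block so the warnings are logged whether or not the limit was hit.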