anchore / grype

A vulnerability scanner for container images and filesystems
Apache License 2.0
8.58k stars 559 forks source link

Fix: Fully validate `vulnerability.db` by hash #1976

Open joshuai96 opened 3 months ago

joshuai96 commented 3 months ago

Fixes: #1975 Fixes: #1648

Setting ValidateByHashOnStart to true by default, enables the database curator to execute ValidateByHash in validateIntegrity and report an invalid database.

$ cd $HOME/.cache/grype/db/5/
$ mv vulnerability.db vulnerability.db.bak

# with an invalid db
$ tail -n 50 vulnerability.db.bak > vulnerability.db
$ grype db status
Location:  /home/joshua/.cache/grype/db/5
Built:     2024-07-03 01:30:39 +0000 UTC
Schema:    5
Checksum:  sha256:a7920011e5de7de5e0acafb614f94c38dc4353d34784de07cf16fa765c07904f
Status:    invalid
bad db checksum (/home/joshua/.cache/grype/db/5/vulnerability.db): "sha256:a7920011e5de7de5e0acafb614f94c38dc4353d34784de07cf16fa765c07904f" vs "sha256:d3b1df971b1c7753cb658a2c0cb1046b781bc97f0727682bf6eacada5e92e5c8"

# with missing db
$ rm vulnerability.db
$ grype db status
Location:  /home/joshua/.cache/grype/db/5
Built:     2024-07-03 01:30:39 +0000 UTC
Schema:    5
Checksum:  sha256:a7920011e5de7de5e0acafb614f94c38dc4353d34784de07cf16fa765c07904f
Status:    invalid
failed to open file '/home/joshua/.cache/grype/db/5/vulnerability.db': open /home/joshua/.cache/grype/db/5/vulnerability.db: no such file or directory

Scans with grype now give a better error too:

# with invalid db
$ grype alpine:latest
 ✔ Vulnerability DB                [no update available]  
 ✔ Parsed image                              sha256:a606584aa9aa875552092ec9e1d62cb98d486f51f389609914039aabd9414687
 ✔ Cataloged contents                               dabf91b69c191a1a0a1628fd6bdd029c0c4018041c7f052870bb13c5a222ae76
   ├── ✔ Packages                        [14 packages]  
   ├── ✔ File digests                    [77 files]  
   ├── ✔ File metadata                   [77 locations]  
   └── ✔ Executables                     [17 executables]  
failed to load vulnerability db: vulnerability database is invalid (run db update to correct): bad db checksum (/home/joshua/.cache/grype/db/5/vulnerability.db): "sha256:a7920011e5de7de5e0acafb614f94c38dc4353d34784de07cf16fa765c07904f" vs "sha256:d3b1df971b1c7753cb658a2c0cb1046b781bc97f0727682bf6eacada5e92e5c8"

# with missing db
$ grype alpine:latest
 ✔ Vulnerability DB                [no update available]  
 ✔ Parsed image                              sha256:a606584aa9aa875552092ec9e1d62cb98d486f51f389609914039aabd9414687
 ✔ Cataloged contents                               dabf91b69c191a1a0a1628fd6bdd029c0c4018041c7f052870bb13c5a222ae76
   ├── ✔ Packages                        [14 packages]  
   ├── ✔ File digests                    [77 files]  
   ├── ✔ File metadata                   [77 locations]  
   └── ✔ Executables                     [17 executables]  
failed to load vulnerability db: vulnerability database is invalid (run db update to correct): failed to open file '/home/joshua/.cache/grype/db/5/vulnerability.db': open /home/joshua/.cache/grype/db/5/vulnerability.db: no such file or directory
wagoodman commented 2 months ago

It was intentional to only validate the DB on download and not on start (for performance reasons) since validating every time could lead to startup performance issues. That being said, I can also see the point where getting grype results with a known bad database is also not ideal, since we can no longer trust those results.

I think ultimately it's the right call to turn on validation by default, however, I feel that it should coincide with attempting to improve performance. Right now we're working on grype DB v6, which is very different than DB v1-5. Part of this work may also switch the distributing sha256 digests to xxh64, which is much faster.

One question I have: I have an anecdote from the past that startup was taking an extra second due to hashing the DB -- what extra time at startup are you seeing with this change @joshuai96 ?

joshuai96 commented 2 months ago

In the setup I described in https://github.com/anchore/grype/issues/1975#issuecomment-2208251742 I've seen grype db update stating Vulnerability database updated to latest version! and then having some files missing for example metadata.json, what leads to failing scan runs. So I think validation on download seems a little bit buggy, too. I am positive this is some kind of freak accident based on network hiccups, as our GitLab CI runners are known to have some networking issue.

You mean the startup before a scan runs, @wagoodman ? I didn't notice any inconvenient increase in time, but I can run some tests, to have some numbers backing that up. IMHO a second or two, has no impact on a scan that usually takes a few seconds anyway, but I'm not as sensitive to time, as this scans run over night in a CI/CD setup.

joshuai96 commented 1 month ago

Hello @wagoodman, sorry for the delay.

I've run some simple timing tests for both versions.

Setup:

Test:

Test No. Unmodified Modified
1 3,632 3,628
2 3,730 3,591
3 3,653 3,640
4 3,749 3,661
5 3,526 3,621
6 3,596 3,577
7 3,707 3,740
8 3,568 3,613
9 3,675 3,662
10 3,585 3,600
11 3,724 3,653
12 3,630 3,588
13 3,582 3,680
14 3,647 3,626
15 3,649 3,673
16 3,657 3,607
17 3,695 3,567
18 3,640 3,608
19 3,902 3,616
20 3,686 3,700
Average 3,662 3,633
Standard deviation 0,079 0,043

Result:

Looking at these numbers it may indicate that checking the database on startup decreased scanning time and made it more stable. But the sample size and incrrease are so small, that this can not be stated with confidence.

kzantow commented 1 month ago

FWIW, enabling this option in the latest grype seems to increase the scan time of alpine:latest by ~7 seconds on my x86 mac.

joshuai96 commented 1 month ago

@wagoodman any idea, why I don't see a slowdown of scans?

wagoodman commented 2 weeks ago

I think we'll include this with the grype DB v6 work (as mentioned above) -- so we won't make this change in v5. Keep this PR open and we can rebase on that work and pull it in 👍