Closed by wagoodman 5 days ago
Hi @wagoodman,
are we expecting the scan results to change from v5 to v6?
No, there is no functional change in the match results as currently planned; the match results should be the same. This could change if we fix any matcher bugs while we're rewriting that portion of the search package, but we would not merge changes that made results worse, only better.
will the db be stateful?
It will be more stateful than it is today. Most information will be in the Blobs table and be content addressable. To save on space we won't be keeping the value digests in the DB, but it would be possible to digest all values in that table and compare the digests computed from one DB against those from another DB to get a rough sense of the changes. We cannot use the blob IDs themselves since they cannot be considered stable between DB builds.
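To make the digest comparison concrete, here is a minimal sketch in Go. It assumes the blob values have already been read out of each DB as strings; `blobDigests`, `diffDigests`, and the sample values are hypothetical names and placeholder data for illustration, not part of grype. SHA-256 is also an assumption, since any stable digest would do.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// blobDigests returns the sorted, de-duplicated SHA-256 digests of the blob values.
// Blob IDs are deliberately ignored because they are not stable between DB builds.
func blobDigests(values []string) []string {
	seen := make(map[string]struct{}, len(values))
	for _, v := range values {
		sum := sha256.Sum256([]byte(v))
		seen[hex.EncodeToString(sum[:])] = struct{}{}
	}
	out := make([]string, 0, len(seen))
	for d := range seen {
		out = append(out, d)
	}
	sort.Strings(out)
	return out
}

// diffDigests counts the digests found only in a and only in b.
// Both inputs are expected to be de-duplicated (as returned by blobDigests).
func diffDigests(a, b []string) (onlyA, onlyB int) {
	inA := make(map[string]struct{}, len(a))
	for _, d := range a {
		inA[d] = struct{}{}
	}
	inB := make(map[string]struct{}, len(b))
	for _, d := range b {
		inB[d] = struct{}{}
	}
	for _, d := range a {
		if _, ok := inB[d]; !ok {
			onlyA++
		}
	}
	for _, d := range b {
		if _, ok := inA[d]; !ok {
			onlyB++
		}
	}
	return onlyA, onlyB
}

func main() {
	// Placeholder blob values standing in for the contents of two DB builds.
	oldDB := []string{`{"id":"example-a"}`, `{"id":"example-b"}`}
	newDB := []string{`{"id":"example-b"}`, `{"id":"example-c"}`, `{"id":"example-d"}`}

	removed, added := diffDigests(blobDigests(oldDB), blobDigests(newDB))
	fmt.Printf("removed=%d added=%d\n", removed, added)
}
```

This only gives a rough signal: a blob whose payload changed shows up as one removal plus one addition, since nothing ties the old and new values together.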
how easy will it be to add new tables?
Since there would be new Handle tables for searching and new Blob schemas for entries that are searchable by a handle table, at any point during the v6 lifetime we can add these new elements without it being a breaking change or a paradigm shift. Take your example for EOL: I imagine there would be a new EolBlob that represents a JSON payload with cpe, purl, version, date, etc., as well as a new EolHandle table that is indexed by cpe or purl. It might be that we add new entries into the Packages table for joining on too (optionally, depending on what kind of data we pull in from that dataset).
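As a rough illustration of the EOL example, the sketch below models an EolBlob payload and an EolHandle row in Go, with an in-memory stand-in for the tables. Everything here is an assumption: the field names, the JSON keys, the `eolStore` type, and the sample values are hypothetical and not taken from the prototype models.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// EolBlob is a hypothetical JSON payload carrying end-of-life data.
type EolBlob struct {
	CPE     string `json:"cpe,omitempty"`
	PURL    string `json:"purl,omitempty"`
	Version string `json:"version"`
	Date    string `json:"date"`
}

// EolHandle is a hypothetical handle row: small searchable columns plus a blob reference.
type EolHandle struct {
	CPE    string
	PURL   string
	BlobID int64
}

// eolStore is an in-memory stand-in for the handle table and the blobs table.
type eolStore struct {
	handles []EolHandle
	blobs   map[int64]string
}

// add serializes the blob, stores it, and records a handle pointing at it.
func (s *eolStore) add(b EolBlob) (int64, error) {
	raw, err := json.Marshal(b)
	if err != nil {
		return 0, err
	}
	if s.blobs == nil {
		s.blobs = make(map[int64]string)
	}
	id := int64(len(s.blobs) + 1)
	s.blobs[id] = string(raw)
	s.handles = append(s.handles, EolHandle{CPE: b.CPE, PURL: b.PURL, BlobID: id})
	return id, nil
}

// findByPURL searches the handles first and deserializes only the matching blobs.
func (s *eolStore) findByPURL(purl string) ([]EolBlob, error) {
	var out []EolBlob
	for _, h := range s.handles {
		if h.PURL != purl {
			continue
		}
		var b EolBlob
		if err := json.Unmarshal([]byte(s.blobs[h.BlobID]), &b); err != nil {
			return nil, err
		}
		out = append(out, b)
	}
	return out, nil
}

func main() {
	store := &eolStore{}
	// Placeholder record; the purl, version, and date are made up.
	if _, err := store.add(EolBlob{PURL: "pkg:generic/example", Version: "1.0", Date: "2030-01-01"}); err != nil {
		panic(err)
	}
	found, err := store.findByPURL("pkg:generic/example")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", found)
}
```

The point of the shape is that a new dataset only adds one payload type and one handle table, so existing handle tables and blob schemas are left untouched.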
will it affect grype's schema (v6 = grype 1.0.0)?
grype is still in v0, so we're still making breaking changes on minor releases. However, the v6 schema is a big step towards grype 1.0. We still need to determine whether we want to make the DB schema part of the grype version contract (e.g. should we allow breaking schema changes without needing to bump the major version of grype?). That still hasn't been decided.
DB v6 is meant to cover several use cases (you can safely ignore this link). The high-level goals are:
The high-level design is as follows:
Here is a list of the “Handle” tables to search against:
These are related to two other auxiliary tables:
Here's how the AffectedPackageHandle table would relate to the auxiliary tables:
And the rather simple Blobs table:
Here's how you might craft a search for an affected package for a specific OS:
At this point you can take these blob IDs, query the blob table for the JSON payloads, and deserialize them. This has an advantage over the existing schemas: you can conditionally inflate DB objects based on what you need, rather than inflating entire records that you end up throwing away.
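The two-step flow described here (search the handles, then inflate only the blobs you need) can be sketched in Go as follows. This is an in-memory approximation under stated assumptions: the struct fields, the `searchBlobIDs` and `inflate` helpers, and the sample rows are hypothetical, and the OS and package columns are flattened onto the handle for brevity, whereas the proposal relates the handle to auxiliary tables.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// affectedPackageHandle is a simplified, hypothetical handle row.
type affectedPackageHandle struct {
	OSName      string
	OSVersion   string
	PackageName string
	BlobID      int64
}

// affectedPackageBlob is a hypothetical payload; a real blob would carry more detail.
type affectedPackageBlob struct {
	Fix string `json:"fix"`
}

// searchBlobIDs scans only the lightweight handle rows and returns matching blob IDs.
func searchBlobIDs(handles []affectedPackageHandle, osName, osVersion, pkg string) []int64 {
	var ids []int64
	for _, h := range handles {
		if h.OSName == osName && h.OSVersion == osVersion && h.PackageName == pkg {
			ids = append(ids, h.BlobID)
		}
	}
	return ids
}

// inflate deserializes only the requested blobs, leaving every other payload untouched.
func inflate(blobs map[int64]string, ids []int64) ([]affectedPackageBlob, error) {
	out := make([]affectedPackageBlob, 0, len(ids))
	for _, id := range ids {
		raw, ok := blobs[id]
		if !ok {
			return nil, fmt.Errorf("blob %d not found", id)
		}
		var b affectedPackageBlob
		if err := json.Unmarshal([]byte(raw), &b); err != nil {
			return nil, fmt.Errorf("blob %d: %w", id, err)
		}
		out = append(out, b)
	}
	return out, nil
}

func main() {
	// Placeholder rows; the names, versions, and payloads are made up.
	handles := []affectedPackageHandle{
		{OSName: "os-a", OSVersion: "1", PackageName: "libexample", BlobID: 1},
		{OSName: "os-a", OSVersion: "2", PackageName: "libexample", BlobID: 2},
		{OSName: "os-b", OSVersion: "1", PackageName: "libexample", BlobID: 3},
	}
	blobs := map[int64]string{
		1: `{"fix":"1.0.1"}`,
		2: `{"fix":"2.0.3"}`,
		3: `{"fix":"1.4.0"}`,
	}

	ids := searchBlobIDs(handles, "os-a", "2", "libexample")
	records, err := inflate(blobs, ids)
	if err != nil {
		panic(err)
	}
	fmt.Println(ids, records)
}
```

Only one of the three payloads is parsed in this run; the other two are never deserialized, which is the saving the handle/blob split is meant to provide.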
In v1-5 you'd need to craft the correct namespace, which was a bespoke string; v6 shifts this to per-record relations.
Eventually we’d like to add additional handle tables (out of scope for v6 though):
The proposed blobs are as follows:
Implied changes from this:
Specific changes (see prototype models for reference):
VulnerabilityHandle: gorm model #2243
OperatingSystem: gorm model #2245
Package: gorm model #2245
AffectedPackageHandle: gorm model #2245
Provider: gorm model #2232
DbMetadata: gorm model #2146
Blob: gorm model #2243