stevekuznetsov opened 1 year ago
From five hundred feet up, I think this is a reasonable change. It's more or less what we want to do anyway now that FBC is a well-known format. So a +1 to the concept.
That being said, I think version skew just becomes an issue here over time, and we need to be careful about it. You mentioned the hash generation, but to me that speaks to a larger problem with the way that most folks currently ship catalogs. There is an expectation that catalogs work across cluster versions -- for example, in OpenShift it's a pretty common pattern for disconnected clusters to upgrade their cluster and then re-mirror the latest version of the catalog image. I wonder if we need a way for the CatalogSource API to be smart enough to know whether a given catalog matches its expected format (FBC in the right place, the same schema version, etc.), and if it doesn't, should we fail? Or, if there is a binary in the image, should we attempt to start it and serve content that way? Making the API opt-in seems like a middle ground, but 99% of the time users aren't actually controlling these fields, they're consuming them through a tool that generated the YAML (oc-mirror, for example), as a package (IBM's Cloud Pak CASEs), or they're getting it from OpenShift's CVO.
> can we guarantee that older catalogs are built with older `opm` versions?
I think we have to assume that there's often a difference between the version of `opm` that built the catalog, the schema version of the FBC, and the current version of the cluster. It's been very useful for us to be able to ship catalogs out of band of OpenShift patch releases, and making sure that we maintain that feature is really important. Maybe that just means having a very strict FBC schema version and making the cluster schema-aware? I don't think we've historically been very strict about versioning the FBC so far.
> 99% of the time users aren't actually controlling these fields, they're consuming them through …
Are we OK if the feature we ship is opt-in and we get all of the places we control to opt into it, with the express understanding of what that entails? I'm far more interested in solving these 99% cases than in making this feature generic enough to be turned on by default for every catalog.
> Maybe that just means having a very strict FBC schema version and making the cluster schema-aware? I don't think we've historically been very strict about versioning the FBC so far.
It's either this, or making the server not read the files it's serving to folks, if it's not going to be doing anything to the FBC data on disk before serving anyway.
> Are we OK if the feature we ship is opt-in and we get all of the places we control to opt into it, with the express understanding of what that entails? I'm far more interested in solving these 99% cases than in making this feature generic enough to be turned on by default for every catalog.
That makes sense to me. I'd be interested to hear some other perspectives on how these catalog images are being distributed, but at least for OpenShift's default case it seems like this should work fine.
> It's either this, or making the server not read the files it's serving to folks, if it's not going to be doing anything to the FBC data on disk before serving anyway.
Agreed. Today, though, we don't actually have an explicit versioning scheme for FBC, so it might make sense for us to invent one for this purpose.
> Agreed. Today, though, we don't actually have an explicit versioning scheme for FBC, so it might make sense for us to invent one for this purpose.
FBC is intended to be versioned by component sub-schema, e.g. if we need to extend `olm.package` we can define an `olm.package.v2` schema which replaces it. I'm hand-waving past all of the "order-of-precedence and support between v2 and original versions" in the context of this conversation.
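To make that concrete, here is a minimal sketch of what side-by-side schema versions could look like in a catalog; the `olm.package.v2` blob is hypothetical:

```yaml
# Existing blob, understood by any opm that can serve FBC today.
schema: olm.package
name: example-operator
defaultChannel: stable
---
# Hypothetical successor blob; an older opm would not recognize this schema.
schema: olm.package.v2
name: example-operator
defaultChannel: stable
# ...plus whatever new fields olm.package cannot express...
```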
We could also write a subroutine into `opm` that checks that all the schema versions it finds on disk are known and serve-able and, if not, exits with some well-known status. On that status we can fall back to using the `opm` in the index?
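As a rough sketch, the serving container could run that check first and only fall back to the index image's own `opm` on the well-known status; the `validate-serveable` subcommand, the exit code, and the paths are all hypothetical, and this assumes the index image's `opm` was also copied to `/extracted/bin`:

```yaml
# Illustrative fragment of the serving Pod spec; nothing here is a settled interface.
containers:
  - name: registry-server
    image: quay.io/operator-framework/opm:latest  # the well-known opm
    command:
      - /bin/sh
      - -c
      - |
        # hypothetical subcommand: exits 42 if any schema on disk is unknown
        /bin/opm validate-serveable /extracted/configs
        if [ $? -eq 42 ]; then
          # fall back to the opm bundled in the index image
          exec /extracted/bin/opm serve /extracted/configs
        fi
        exec /bin/opm serve /extracted/configs
```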
IMO, any v1 version of `opm` should be backward compatible wrt FBC schema support. So as long as the `opm serve` version is equal to or newer than the version of `opm` used to render/build the FBC, then that should always work (until we release `opm` v2, at least).
Today, when a `CatalogSource` is created, the `Pod` that serves catalog content uses `opm` to do so; the version of the server that's used is whichever version happens to be bundled into the index image. This design has a couple of drawbacks:
- many different versions of `opm` are used across the fleet of catalogs
- shipping a fix or feature in `opm` requires rebuilding all catalogs
- reproducing a catalog's behavior requires the same `opm`
- users cannot know which `opm` they might encounter in a catalog and what they can assume of it

When the data model was tightly coupled to the binary reading it (at the time from SQLite), bundling both the content and the server made a lot of sense. Today, though, it's possible we're not in that state any longer, as the move to FBC has given us a) a stable format and b) a non-obfuscated storage mechanism.
The `catalog-operator` already has an `--opmImage` flag to consume a well-known image containing `opm` to use, so implementing this feature will only require us to add a new field to the `CatalogSource` to opt catalogs into this:
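A sketch of what that opt-in field could look like (the `grpcPodConfig.extractContent` shape and the paths below are illustrative, not a settled API):

```yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: my-catalog
  namespace: olm
spec:
  sourceType: grpc
  image: quay.io/example/catalog:latest
  grpcPodConfig:
    # hypothetical opt-in: serve this image's FBC with the well-known opm
    extractContent:
      catalogDir: /configs
      cacheDir: /tmp/cache
```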
Then, the `catalog-operator` can create a `Pod` that:

- copies the catalog content and the pre-built cache out of the index image into shared directories
- runs the `opm` from the `--opmImage` in a container, passing the two directories to it
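Roughly, such a `Pod` might look like the following sketch (image names, paths, and the copy mechanics are illustrative, and the init container assumes the index image carries a shell):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-catalog-server
spec:
  volumes:
    - name: extracted
      emptyDir: {}
  initContainers:
    # copy the FBC and the pre-built cache out of the index image
    - name: extract-content
      image: quay.io/example/catalog:latest
      command: ["sh", "-c", "cp -r /configs /tmp/cache /extracted/"]
      volumeMounts:
        - name: extracted
          mountPath: /extracted
  containers:
    # serve the extracted content with the well-known opm from --opmImage
    - name: registry-server
      image: quay.io/operator-framework/opm:latest
      command: ["/bin/opm", "serve", "/extracted/configs", "--cache-dir=/extracted/cache"]
      ports:
        - containerPort: 50051
          name: grpc
      volumeMounts:
        - name: extracted
          mountPath: /extracted
```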
Open Questions:

- `opm` is deserializing the content of the catalog on disk to validate the cache; if new fields are added to FBC, an older `opm` will not be able to understand those fields and will fail to reproduce the hash - what should we do?
- can we guarantee that `opm` faithfully serves cache content without pruning unknown fields, across `opm` versions?