Closed ncdc closed 9 months ago

The current client interactions with catalogd are problematic in a few ways:

- The APIs are not accessible via the main Kubernetes apiserver
- The APIs are not accessible outside the cluster's pod and service networks unless you have some kind of ingress
- The APIs only allow retrieving the entire catalog

Adding an aggregated apiserver solves the first 2 issues. Adding built-in type(s) for catalog content at a level more granular than "the entire catalog" solves the 3rd issue.
I'm not against this, but just thought I'd link the RFC where we decided to go the route of using an HTTP server instead of an aggregated API server: https://docs.google.com/document/d/1exozVIb1dOQ08up5iXz4D6MAtzSyE3W8rV5_FKmds5I/edit?usp=sharing
I'm mainly calling this out because we've been down this path before and decided against it. What makes this time different? How do we plan to solve the pain points of using an aggregated API server that led us down the path of using an HTTP server?
> - The APIs are not accessible via the main Kubernetes apiserver
> - The APIs are not accessible outside the cluster's pod and service networks unless you have some kind of ingress
While I get that the HTTP server results in not being able to use `kubectl` natively to fetch the catalog contents, an alternative is to focus on client-side tooling that makes this a significantly better user experience than exists currently and is similar to using `kubectl`. Just one example is a kubectl plugin I wrote for catalogd: https://github.com/everettraven/kubectl-catalogd
> - The APIs only allow retrieving the entire catalog
This was only meant to be the initial state, until there was user-backed data showing a preference for multiple different endpoints/request-response patterns. Nothing is stopping us from extending the functionality of the HTTP server to handle more nuanced requests and return appropriate responses.
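To make that concrete, here's a rough sketch of what a more granular endpoint on the existing HTTP server could look like. This is only an illustration, not catalogd's actual API: the URL shape, handler name, and the in-memory `catalogContent` lookup are all assumptions, and it relies on Go 1.22's `net/http` pattern routing.

```go
package main

import (
	"encoding/json"
	"net/http"
)

// catalogContent stands in for whatever storage actually backs the served FBC.
var catalogContent = map[string]map[string]any{
	"operatorhubio": {
		"argocd-operator": map[string]any{"schema": "olm.package", "name": "argocd-operator"},
	},
}

// packageHandler serves a single package's content instead of the entire catalog,
// e.g. GET /catalogs/{catalog}/packages/{package}.
func packageHandler(w http.ResponseWriter, r *http.Request) {
	catalog := r.PathValue("catalog")
	pkg := r.PathValue("package")
	blob, ok := catalogContent[catalog][pkg]
	if !ok {
		http.NotFound(w, r)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	_ = json.NewEncoder(w).Encode(blob)
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("GET /catalogs/{catalog}/packages/{package}", packageHandler)
	_ = http.ListenAndServe(":8080", mux)
}
```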
It's a better, more consistent UX if you can `kubectl get packages` and `kubectl get extensions`, vs. `kubectl <some plugin> <some command to get packages>` and `kubectl get extensions`.
Is an aggregated apiserver required? No. But it's the lowest barrier to entry, because it doesn't require port-forwarding, apiserver service proxying, or ingress.
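For comparison, this is roughly what reaching the current HTTP server through the main apiserver's service proxy looks like with client-go; it's the kind of friction an aggregated apiserver would remove. The namespace, service name/port, and content path below are placeholders, not catalogd's real ones.

```go
package main

import (
	"context"
	"fmt"

	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	cs, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	// GET /api/v1/namespaces/<ns>/services/<scheme>:<name>:<port>/proxy/<path>
	body, err := cs.CoreV1().RESTClient().Get().
		Namespace("catalogd-system"). // placeholder namespace
		Resource("services").
		Name("https:catalogd-catalogserver:443"). // placeholder service name/port
		SubResource("proxy").
		Suffix("catalogs/operatorhubio/all.json"). // placeholder content path
		Do(context.TODO()).Raw()
	if err != nil {
		panic(err)
	}
	fmt.Printf("fetched %d bytes of catalog content\n", len(body))
}
```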
Have any of the requirements that led us to choose the HTTP server over an aggregated apiserver been relaxed, namely atomic updates? IIRC, at the time the RFC I linked was written, the decision to use an HTTP server was made because we would likely end up building an apiserver that doesn't meet k8s conformance and also results in a degraded user experience.
We can do atomicity with an aggregated apiserver, when coded properly (I think).
Adding to my previous comments, are we now leaning in the direction that catalogd is opinionated about the content it serves? You mention using something like `kubectl get packages`, but IIRC there was a goal of keeping catalogd unaware of OLM-specific concepts and focused on serving pure FBC (not saying this can't be done; we attempted this with the old `CatalogMetadata` API), and I'm wondering if there are thoughts around changing the originally intended behavior?
If not, then we'll presumably not do this 😄
My 2 cents: let's code what is approachable and matches our needs as closely as possible, rather than starting with generic/opaque things. So yes, let's serve `packages` or whatever makes the most sense.
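If we did go the opinionated route, a served `packages` resource might look something like the sketch below. This is purely hypothetical, not an agreed-upon API: the type name, group/version, and every field are assumptions about how FBC's `olm.package` blobs could be projected into a Kubernetes object.

```go
// Package v1alpha1 is a hypothetical API group version for catalog content.
package v1alpha1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// Package is a read-only projection of an olm.package blob from a catalog.
type Package struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec PackageSpec `json:"spec,omitempty"`
}

// PackageSpec mirrors a subset of the olm.package schema.
type PackageSpec struct {
	// Catalog is the name of the catalog this package came from.
	Catalog string `json:"catalog"`
	// DefaultChannel mirrors the defaultChannel field of the olm.package blob.
	DefaultChannel string `json:"defaultChannel,omitempty"`
	// Channels lists the channel names declared for this package.
	Channels []string `json:"channels,omitempty"`
}

// PackageList is the list type that would back `kubectl get packages`.
type PackageList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata,omitempty"`
	Items           []Package `json:"items"`
}
```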
> We can do atomicity with an aggregated apiserver, when coded properly (I think).
You likely know more than I do in this regard, but there is this comment thread from the RFC where @stevekuznetsov explained some of the anti-patterns we are facing when trying to implement atomicity with an aggregated apiserver: https://docs.google.com/document/d/1exozVIb1dOQ08up5iXz4D6MAtzSyE3W8rV5_FKmds5I/edit?disco=AAAA0tANcl8
Another comment thread regarding the importance of atomicity in this case: https://docs.google.com/document/d/1exozVIb1dOQ08up5iXz4D6MAtzSyE3W8rV5_FKmds5I/edit?disco=AAAA0tANcjo
@ncdc there's no way to allow WATCH on the resources - Kube semantics won't allow more than one resource to change in one `resourceVersion` tick, and if you're building a client-side cache of state in a lister, a WATCH during a catalog upgrade will leave you in some intermediate half-previous, half-upgraded list of possible packages during that process.
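A toy illustration of that failure mode (this doesn't use client-go at all, it just simulates what a lister-style cache observes): because a catalog upgrade arrives as individual watch events, each at its own resourceVersion, the cache passes through states that mix the previous and the upgraded catalog.

```go
package main

import "fmt"

// event is a simplified stand-in for a watch event on a per-package resource.
type event struct {
	kind    string // ADDED, MODIFIED, or DELETED
	name    string
	release string // which catalog release this object belongs to
}

func main() {
	// Cache as a lister would hold it: one entry per package object, all from
	// the v1 release of the catalog.
	cache := map[string]string{"foo": "v1", "bar": "v1", "baz": "v1"}

	// An upgrade to the v2 catalog is delivered one object at a time; there is
	// no single event that swaps the whole catalog atomically.
	upgrade := []event{
		{kind: "MODIFIED", name: "foo", release: "v2"},
		{kind: "MODIFIED", name: "bar", release: "v2"},
		{kind: "DELETED", name: "baz"},
	}

	for i, ev := range upgrade {
		switch ev.kind {
		case "ADDED", "MODIFIED":
			cache[ev.name] = ev.release
		case "DELETED":
			delete(cache, ev.name)
		}
		// Every state printed before the final event mixes v1 and v2 content.
		fmt.Printf("after event %d: %v\n", i+1, cache)
	}
}
```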
Going to close this as the complexity isn't worth it. Thanks for the discussion!