Zorlin opened this issue 1 year ago
I've made huge progress today and think I've pretty much solved this one.
Essentially, it boiled down to a few problems. The first showed up in the `getViewerFromHeader` logs, where the Kubernetes liveness probe (note the `kube-probe/1.22` user agent) was hitting the handler with no token attached:

```
getViewerFromHeader
Mon, Oct 10 2022 10:34:37 am | {
Mon, Oct 10 2022 10:34:37 am |   host: '10.42.217.235:4444',
Mon, Oct 10 2022 10:34:37 am |   'user-agent': 'kube-probe/1.22',
Mon, Oct 10 2022 10:34:37 am |   accept: '*/*',
Mon, Oct 10 2022 10:34:37 am |   connection: 'close'
Mon, Oct 10 2022 10:34:37 am | } ESTUARY_TOKEN
```
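If you want to reproduce what that probe is doing, it's just an unauthenticated GET. The address and port below come from the log above; the path is a guess, since the probe's target is whatever the chart configures:

```sh
# Mimic the kubelet probe: a bare GET with no Authorization header,
# which is exactly the kind of request logged above.
curl -s -A "kube-probe/1.22" http://10.42.217.235:4444/viewer
```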
Many tweaks later, we have... Estuary on Kubernetes, minus shuttles.

---
Hi! Buckle in, this'll be a long one.
I'm experimenting with running Estuary on Kubernetes, basing my work off the estuary-docker repository.
The rough design is that slightly modified Docker images (see my docker repo) are built and pushed to Docker Hub via GitHub Actions. The only modification at this point is replacing the entrypoint for estuary-www with a custom script that ensures the environment variable $ESTUARY_TOKEN is populated from the "standard" location, /usr/estuary/private/token.
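The wrapper itself is only a few lines; a minimal sketch of the idea (the real script in my docker repo may differ slightly):

```sh
#!/bin/sh
# Entrypoint wrapper for estuary-www: populate ESTUARY_TOKEN from the
# shared "private" volume before starting the web server.
TOKEN_FILE=/usr/estuary/private/token

if [ -f "$TOKEN_FILE" ]; then
    export ESTUARY_TOKEN="$(cat "$TOKEN_FILE")"
else
    echo "warning: $TOKEN_FILE not found; ESTUARY_TOKEN left unset" >&2
fi

# Hand off to the original estuary-www start command.
exec npm run dev-docker
```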
From there, I've written some Helm charts which deploy estuary-main and estuary-www to any Kubernetes cluster (as long as it supports RWX-style persistent volumes). Currently those charts are designed around a single copy of each service (i.e., one main service, one shuttle, and one web server), but in theory they could support multiple shuttles running on Kubernetes (possibly each with its own individual database but a shared main data folder, so all shuttles could share all pins and simplify things a bit).
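Deploying is then a standard Helm install, something like the following (chart paths, release names, and namespace are illustrative rather than the exact ones from my repo):

```sh
# Hypothetical install of the two charts into their own namespace.
helm install estuary-main ./charts/estuary-main -n estuary --create-namespace
helm install estuary-www  ./charts/estuary-www  -n estuary
```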
A "secret" called
estuary
contains some environment variables for configuration's sake, which is loaded by each of the pods later:(I've modified the settings slightly from what the Helm chart ships while debugging, but no settings I've found seem to "solve" this problem - B)
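In kubectl terms the secret is nothing exotic; recreated by hand it would look something like this (the key and value below are placeholders; ESTUARY_HOST is the variable referenced later in this issue):

```sh
# Hypothetical recreation of the "estuary" secret consumed by the pods.
kubectl create secret generic estuary -n estuary \
  --from-literal=ESTUARY_HOST=estuary-main:3004
```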
The estuary-main pod has two persistent volumes: one at /usr/src/estuary/data called "data" (imaginative!), and another at /usr/estuary/private called "private". These survive between container restarts. Upon starting for the first time, estuary-main generates all its data as usual, and successfully generates a token and stores it at /usr/estuary/private/token, inside the "private" volume.
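That's easy to verify from outside the pod (assuming the deployment is named estuary-main; adjust for your naming):

```sh
# Confirm the generated token survived into the "private" volume.
kubectl exec -n estuary deploy/estuary-main -- cat /usr/estuary/private/token
```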
As far as I can tell, estuary-main is fully functional inside Kubernetes.
The estuary-www pod mounts the "private" volume read-only at /usr/estuary/private, and is successfully able to read the token. From there, it loads that token and sets it as an environment variable via `export ESTUARY_TOKEN=$(cat /usr/estuary/private/token)`, prints the command it's about to run (token included), then starts the webserver via `npm run dev-docker`, passing the $ESTUARY_HOST and $ESTUARY_TOKEN env vars as appropriate.

The issue I'm running into is that estuary-www doesn't seem to be sending the token it's being fed, despite being pretty explicit about it. It works perfectly in estuary-ansible running as a normal setup outside of Docker and Kubernetes, so I've gotten estuary-www working at least once before 🙂 but in this case it appears to be sending a blank/undefined token string to `estuary-main`, as noted by the logs in estuary-main.

The estuary-www service is exposed via an Ingress object at http://estuary-k8s.windowpa.in:80, but the issue is the same whether I access estuary-www via a port-forward or by browsing to that Ingress (and indeed, the errors are triggered even without hitting it with a browser, as estuary-www tries to load /viewer on its own).
One thing that does work is port-forwarding port 3004 from the container so that it's available as localhost:3004 on my Mac, then running estuary-www on the Mac using the exact same token and visiting that Estuary frontend. This indicates very strongly that the token is fine; there's just some weird mishap happening somewhere in between.
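Concretely, the working workaround looks roughly like this (a sketch; the deployment name and the env-var-driven startup are assumptions based on my chart and the entrypoint described above):

```sh
# Forward estuary-main's API port from the cluster to the local machine.
kubectl port-forward -n estuary deploy/estuary-main 3004:3004 &

# Grab the same token the cluster generated...
export ESTUARY_TOKEN="$(kubectl exec -n estuary deploy/estuary-main -- cat /usr/estuary/private/token)"

# ...and run estuary-www locally against the forwarded API.
export ESTUARY_HOST=localhost:3004
npm run dev-docker
```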
In my case, estuary-main is running on port 3004 on an IP address internal to Kubernetes, and is also exposed as a Service with a DNS name of "estuary-main", again on port 3004. These are two distinct objects, though, as the Service goes through kube-proxy. This makes it accessible to other pods (a pod being a slightly larger object than a container) running on the Kubernetes cluster as either "http://estuary-main:3004" (if running in the same namespace, which is a way of separating pods into different concerns) or globally as "http://estuary-main.estuary:3004" (name of service . namespace : port).
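That's easy to sanity-check from a throwaway pod; if DNS and the Service are healthy, you should get an HTTP response (likely an auth error without a token) rather than a connection failure. The image and namespace here are my choices, not anything from the charts:

```sh
# Confirm the Service's cluster-wide DNS name resolves and the API answers.
kubectl run curl-test -n estuary --rm -it --restart=Never \
  --image=curlimages/curl -- curl -sv http://estuary-main.estuary:3004/viewer
```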
I've confirmed that the container running estuary-www can hit the /viewer endpoint with the stored API key referenced earlier.

I tried using the command line flags and some environment variables, rather than editing estuary-www directly.
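For completeness, the check I ran from inside the estuary-www container amounts to something like this (a sketch; I'm assuming the token is passed as a Bearer Authorization header, which is how the Estuary API is normally called):

```sh
# From the estuary-www pod: read the shared token and call estuary-main
# directly, bypassing the web frontend entirely.
kubectl exec -n estuary deploy/estuary-www -- sh -c \
  'curl -s -H "Authorization: Bearer $(cat /usr/estuary/private/token)" \
       http://estuary-main:3004/viewer'
```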