application-research / estuary-www

https://estuary.tech

Token issues with experimental Estuary-on-Kubernetes setup #75

Open Zorlin opened 1 year ago

Zorlin commented 1 year ago

Hi! Buckle in, this'll be a long one.

I'm experimenting with running Estuary on Kubernetes, basing my work off the estuary-docker repository.

(image attachment)

The rough design: slightly modified Docker images (see my docker repo) are built and pushed to Docker Hub via GitHub Actions. At this point the only modification is replacing the entrypoint for estuary-www with a custom script that ensures the environment variable $ESTUARY_TOKEN is populated from the "standard" location, /usr/estuary/private/token.
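The wrapper script itself isn't shown in this issue; a minimal sketch of what such an entrypoint could look like follows. The token path and the log lines mirror the output further down; the helper function name, the demo file, and the npm flags are illustrative assumptions, not the actual script.

```shell
#!/bin/sh
# Hypothetical sketch of the replacement estuary-www entrypoint: it populates
# ESTUARY_TOKEN from the shared token file before starting the web server.

load_estuary_token() {
    # $1: path to the token file written by estuary-main
    if [ -f "$1" ]; then
        ESTUARY_TOKEN="$(cat "$1")"
        export ESTUARY_TOKEN
        echo "Loading Estuary token from token file"
    else
        echo "WARNING: no token file found at $1" >&2
        return 1
    fi
}

# Demonstrate with a throwaway file instead of /usr/estuary/private/token:
demo_file="$(mktemp)"
printf 'EST-example-token-ARY' > "$demo_file"
load_estuary_token "$demo_file"
rm -f "$demo_file"

echo "DEBUG: npm run dev-docker --estuary-host=${ESTUARY_HOST:-http://localhost:3004} --estuary-api-key=${ESTUARY_TOKEN}"
# A real entrypoint would finish by handing off to the web server, e.g.:
#   exec npm run dev-docker ...
```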

From there, I've written some Helm charts which deploy estuary-main and estuary-www to any Kubernetes cluster that supports RWX-style persistent volumes. Currently those charts are designed around a single copy of each service (i.e. one main service, one shuttle and one web server), but in theory they could support multiple shuttles running on Kubernetes (possibly each with its own individual database, but a shared main data folder, so all shuttles could share all pins and simplify things a bit).
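For context, a hypothetical values.yaml shape for such a chart might look like the following. Every key name here is an illustrative assumption, not the chart's actual schema; only the hostname, API address, and RWX requirement come from this issue.

```yaml
# Illustrative only: a values layout matching the single-copy-of-each-service
# design described above. Key names are assumptions, not the chart's schema.
main:
  replicas: 1                      # exactly one estuary-main
  hostname: http://estuary-k8s.windowpa.in
www:
  replicas: 1
  apiUrl: http://estuary-main.estuary:3004
persistence:
  accessMode: ReadWriteMany        # the RWX-style volume requirement above
```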

A Kubernetes Secret called "estuary" contains some environment variables for configuration, which each of the pods loads later:

(image attachment: the secret's environment variables)

(I've modified the settings slightly from what the Helm chart ships while debugging, but no settings I've found seem to solve this problem.)
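As a sketch, such a Secret could look like the manifest below. The secret name comes from this issue and the two values appear in the estuary-main logs further down; the key names and namespace are illustrative assumptions.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: estuary
  namespace: estuary
type: Opaque
stringData:
  # Key names here are illustrative; the real chart's keys may differ.
  ESTUARY_HOSTNAME: http://estuary-k8s.windowpa.in
  FULLNODE_API_INFO: wss://api.chain.love
```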

The estuary-main pod has two persistent volumes - one at /usr/src/estuary/data called "data" (imaginative!), and another at /usr/estuary/private called "private". These survive between container restarts. Upon starting for the first time, estuary-main generates all its data as usual, and successfully generates a token and stores it at /usr/estuary/private/token, inside the "private" volume.
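The mount layout described above could be sketched like this in the pod specs. The paths and volume names come from this issue (estuary-www's read-only mount is described below); everything else is illustrative.

```yaml
# estuary-main: both volumes mounted read-write
volumeMounts:
  - name: data
    mountPath: /usr/src/estuary/data
  - name: private
    mountPath: /usr/estuary/private

# estuary-www: only the token volume, mounted read-only
volumeMounts:
  - name: private
    mountPath: /usr/estuary/private
    readOnly: true
```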

As far as I can tell, estuary-main is fully functional inside Kubernetes:


FULLNODE_API_INFO is empty, use default value
--
Sun, Oct 9 2022 9:01:17 am | HOSTNAME: http://estuary-k8s.windowpa.in
Sun, Oct 9 2022 9:01:17 am | FULLNODE_API_INFO: wss://api.chain.love
Sun, Oct 9 2022 9:01:17 am | /usr/src/estuary/data/estuary.db exists.
Sun, Oct 9 2022 9:01:17 am | 2022-10-09T01:01:17.111Z INFO estuary estuary/main.go:512 estuary version: v0.1.9-3-g3716c6a {"app_version": "v0.1.9-3-g3716c6a"}
Sun, Oct 9 2022 9:01:19 am | Wallet address is: f1oovhfy5v3mt2eyf4t6i4lufkmy2n7hvo332fgdi
Sun, Oct 9 2022 9:01:19 am | 2022/10/09 01:01:19 failed to sufficiently increase receive buffer size (was: 208 kiB, wanted: 2048 kiB, got: 416 kiB). See https://github.com/lucas-clemente/quic-go/wiki/UDP-Receive-Buffer-Size for details.
Sun, Oct 9 2022 9:01:20 am | 2022-10-09T01:01:20.297Z INFO dt-impl impl/impl.go:145 start data-transfer module
Sun, Oct 9 2022 9:01:20 am | /ip4/10.42.217.162/tcp/6744/p2p/12D3KooWPd1KKKkJ9wkmMCV58QynLfnfkcFNKm1YrZkrp6yKcXon
Sun, Oct 9 2022 9:01:20 am | /ip4/127.0.0.1/tcp/6744/p2p/12D3KooWPd1KKKkJ9wkmMCV58QynLfnfkcFNKm1YrZkrp6yKcXon
Sun, Oct 9 2022 9:01:20 am | 2022-10-09T01:01:20.297Z INFO estuary estuary/replication.go:714 rebuilding staging buckets....... {"app_version": "v0.1.9-3-g3716c6a"}
Sun, Oct 9 2022 9:01:20 am | 2022-10-09T01:01:20.298Z INFO estuary estuary/pinning.go:166 trying to refresh pin queue for local contents {"app_version": "v0.1.9-3-g3716c6a"}
Sun, Oct 9 2022 9:01:20 am | 2022-10-09T01:01:20.299Z INFO dt-impl impl/impl.go:145 start data-transfer module
Sun, Oct 9 2022 9:01:20 am | 2022-10-09T01:01:20.302Z INFO estuary estuary/replication.go:759 rebuilding contents queue ....... {"app_version": "v0.1.9-3-g3716c6a"}
Sun, Oct 9 2022 9:01:20 am |  
Sun, Oct 9 2022 9:01:20 am | ____ __
Sun, Oct 9 2022 9:01:20 am | / __/___/ / ___
Sun, Oct 9 2022 9:01:20 am | / _// __/ _ \/ _ \
Sun, Oct 9 2022 9:01:20 am | /___/\__/_//_/\___/ v4.6.1
Sun, Oct 9 2022 9:01:20 am | High performance, minimalist Go web framework
Sun, Oct 9 2022 9:01:20 am | https://echo.labstack.com
Sun, Oct 9 2022 9:01:20 am | ____________________________________O/_______
Sun, Oct 9 2022 9:01:20 am | O\
Sun, Oct 9 2022 9:01:20 am | ⇨ http server started on [::]:3004

The estuary-www pod mounts the "private" volume read-only at /usr/estuary/private and successfully reads the token. From there, it loads that token into an environment variable via export ESTUARY_TOKEN=$(cat /usr/estuary/private/token), prints the command it's about to run (including the token), then starts the webserver via npm run dev-docker, passing the $ESTUARY_HOST and $ESTUARY_TOKEN env vars as appropriate:

Loading Estuary token from token file
Estuary hostname is http://estuary-k8s.windowpa.in
Estuary API is http://estuary-main.estuary:3004
DEBUG: npm run dev-docker --estuary-host=http://estuary-k8s.windowpa.in --estuary-api-key=EST7SLIGHTLYREDACTEDTOKENARY

> estuary-www@0.0.5 dev-docker /usr/src/estuary-www
> next dev -p 4444

ready - started server on 0.0.0.0:4444, url: http://localhost:4444
event - compiled client and server successfully in 3.6s (174 modules)
wait  - compiling / (client and server)...
event - compiled client and server successfully in 1527 ms (245 modules)
{
  error: {
    code: 401,
    reason: 'ERR_INVALID_TOKEN',
    details: 'api key does not exist'
  }
}

The issue I'm running into is that estuary-www doesn't seem to be sending the token it's being fed, despite being pretty explicit about it. It works perfectly under estuary-ansible in a normal setup outside Docker and Kubernetes, so I have gotten estuary-www working at least once before 🙂. In this case, however, it appears to send a blank/undefined token string to estuary-main, as shown in estuary-main's logs:

2022-10-09T01:01:32.706Z ERROR util util/http.go:149 handler error: ERR_INVALID_TOKEN: api key does not exist
--
Sun, Oct 9 2022 9:01:32 am |  
Sun, Oct 9 2022 9:01:32 am | 2022/10/09 01:01:32 /usr/src/estuary/handlers.go:2737 record not found
Sun, Oct 9 2022 9:01:32 am | [2.775ms] [rows:0] SELECT * FROM `auth_tokens` WHERE token = "undefined" AND `auth_tokens`.`deleted_at` IS NULL ORDER BY `auth_tokens`.`id` LIMIT 1
Sun, Oct 9 2022 9:01:32 am | 2022-10-09T01:01:32.707Z ERROR util util/http.go:149 handler error: ERR_INVALID_TOKEN: api key does not exist
Sun, Oct 9 2022 9:01:32 am |  
Sun, Oct 9 2022 9:01:32 am | 2022/10/09 01:01:32 /usr/src/estuary/handlers.go:2737 record not found
Sun, Oct 9 2022 9:01:32 am | [2.594ms] [rows:0] SELECT * FROM `auth_tokens` WHERE token = "undefined" AND `auth_tokens`.`deleted_at` IS NULL ORDER BY `auth_tokens`.`id` LIMIT 1
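The literal string "undefined" in that SQL query is worth noting. An unset shell variable expands to an empty string, whereas interpolating an undefined value in JavaScript (estuary-www is a Next.js app) produces the literal text "undefined". A quick shell illustration of the difference (the Bearer header format is an assumption about what the frontend sends):

```shell
# An unset shell variable expands to nothing, so a shell-side failure would
# produce `token = ""` in the query above, not `token = "undefined"`:
unset MISSING_TOKEN
header="Bearer ${MISSING_TOKEN}"
echo "shell builds: [$header]"
# JavaScript behaves differently: the template literal `Bearer ${token}` with
# token === undefined yields the string "Bearer undefined", matching the SQL
# log -- which suggests the token goes missing on the Node side, not in the
# shell wrapper.
```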

The estuary-www service is exposed via an Ingress object at http://estuary-k8s.windowpa.in:80, but the issue is the same whether I access estuary-www via a port forward or via that Ingress (and indeed the errors appear without hitting it with a browser at all, since it tries to load /viewer on its own).

One thing that does work is port-forwarding port 3004 from the container so that it's available as "localhost:3004" on my Mac, then running estuary-www on the Mac with the exact same token and visiting that Estuary frontend. This indicates very strongly that the token itself is fine and something else is going wrong.

Via @jimmylee: "Your Estuary node is probably running locally, you have a port for it, and most likely it's running on localhost. The estuary-www configuration is designed to be pointed at a hosted Estuary node, so I assume most people do a cmd+F, find the parts where it is hardcoded, and point it directly at 0.0.0.0:8888 or whatever port their Estuary node is at. It would be nice to get this right from a configuration standpoint, but my question is: did you try that? And did that still not work?"

In my case, it's running on port 3004 on an IP address internal to Kubernetes, and is also exposed as a Service with a DNS name of "estuary-main", also on port 3004 - two distinct objects, though, since the Service goes through kube-proxy.

This makes it accessible to other pods (Kubernetes' scheduling unit, one level above containers) running on the cluster as either "http://estuary-main:3004" (from within the same namespace, Kubernetes' way of separating workloads) or cluster-wide as "http://estuary-main.estuary:3004" (service name . namespace : port).
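The naming rule above generalizes; a small sketch of the forms, with the service and namespace names taken from this issue (the svc.cluster.local suffix is Kubernetes' default cluster domain):

```shell
# Kubernetes service DNS forms for service "estuary-main" in namespace "estuary"
service=estuary-main
namespace=estuary
port=3004

same_ns="http://${service}:${port}"                              # same namespace only
cross_ns="http://${service}.${namespace}:${port}"                # from any namespace
fqdn="http://${service}.${namespace}.svc.cluster.local:${port}"  # fully qualified
echo "$same_ns"
echo "$cross_ns"
echo "$fqdn"
```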

I've confirmed that the container running estuary-www can hit the /viewer endpoint with the stored API key referenced earlier:

root@estuary-www-bdcdbc4fb-m47dr:/usr/src/estuary-www# curl -H "Authorization: Bearer $(cat /usr/estuary/private/token)" http://estuary-main:3004/viewer 
{"username":"admin","perms":100,"id":1,"address":"\u003cempty\u003e","auth_expiry":"2023-10-08T18:38:13.707159109Z","settings":{"replication":6,"verified":true,"dealDuration":1494720,"maxStagingWait":28800000000000,"fileStagingThreshold":3835271577,"contentAddingDisabled":false,"dealMakingDisabled":false,"uploadEndpoints":["http://localhost:3004/content/add"],"flags":0}}

I tried using the command-line flags and some environment variables rather than editing estuary-www directly.

Zorlin commented 1 year ago

I've made huge progress today and think I've pretty much solved this one.

Essentially, it boiled down to a few problems:

Many tweaks later, we have...

(image attachment)

Estuary on Kubernetes, minus shuttles.