Closed izaaklauer closed 2 years ago
Good bug find. This is confusing because there are supposed to be two different config watchers happening. I'll attempt to explain why, but please ask more questions if this is confusing.
The first config watcher ("first" being the one you mention first in your issue, this one) is responsible for runner configuration. These are the variables (environment variables and files) that are specified for the runner to have. For example, you can use this to set AWS access credentials for all jobs a runner may execute. These are set using the waypoint config set -runner
command and flag.
The second config watcher (here) is for the job configuration. These are the variables that are specified in the waypoint.hcl
file or via the API for a specific job. These might overwrite the runner configuration, but are meant to be scoped to a single job.
I purposely made two watchers for a couple reasons:
I think the best long term solution would be to change the RunnerJobStream
stream to send down config source configs (similar to how RunnerConfig
does for the runner). These can be sent perhaps even as part of the assignment
event, or maybe its a new event.
Either way, the important invariant that must be held is that the config sources must be received before the job attempts to load the variables. My recommendation here would be prior to executeJob
is ever called.
Why is this the best?
An alternate solution is to just store the config sources on the runner, and then read them per job and set them up prior to instantiating the appconfig watcher for a job.
This is what you proposed: the appconfig watcher for the runner can be used for jobs.
I would say this is only safe is the runner executes one job at a time. If different jobs have slightly different variable configurations with the same name, itll be tricky to synchronize them. Practically, especially with ODR, we only execute one job per runner so this practically works, but we've always architected runners to be able to take multiple jobs concurrently so theoretically this is unsafe and challenging.
I would recommend against this. I think even from an implementation standpoint, either of the other two proposed solutions I can think of would resolve this more easily with less pitfalls and more future strength.
Describe the bug
Currently, config-sourcer plugin invocations that happen as a result of a dynamic default stanza don't get any of the config set with
waypoint config source-set
.i.e. If you configure a vault config-sourcer:
And then add a var with a dynamic default in your waypoint.hcl:
When the vault config-sourcer runs (in a remote or odr runner), it doesn't use the configured address - it defaults to localhost:
Root cause
Here's my initial read on the root cause:
There's a background process, the appconfig watcher, that's responsible for instantiating and configuring the configsourcer plugins.
We're actually starting two config watchers in the runner - one when we start the runner (here), and one as part of loading dynamic variables (here).
As part of RunnerConfigStream, the server sends the runner configsourcer config. The first watcher gets fed those events here, and updates the plugins to have the right config (i.e. restarts the vault plugin with the right config addr).
The second watcher pathway calls
UpdateSources
.To fix this, we should probably have just one config watcher. It's a long-lived background process anyway. We could store the initial watcher on the runner, and re-use it to render dynamic variable defaults. I haven't dug into if that would cause other weird side-effects though.
We could instead patch this fast by just calling
GetConfigSource
to fetch the config-sourcer config from the server after starting the variables watcher, and immediately callUpdateSources
.