davenquinn opened 1 month ago
At some level this problem is analogous to some of the API design/dependency issues created by the PostgREST APIs. Architectural input is appreciated.
We've tried to automatically apply these fixtures on tileserver spin-up before, but we've had problems with table locks since we've run that SQL synchronously … Figure out how to get the fixtures to run properly on tileserver spin-up.
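One way to keep fixture application from hanging behind a held table lock is to set a session-level `lock_timeout` before running the fixture SQL, so contention fails fast (with a clear error) instead of blocking indefinitely. This is a sketch, not our actual tooling: the helper names and the fixture statements are hypothetical, and the connection is assumed to be a psycopg2-style object.

```python
# Sketch: apply SQL fixtures with a lock_timeout so a conflicting table
# lock aborts the statement quickly rather than stalling spin-up.
# Helper names and fixture statements are hypothetical placeholders.

def with_lock_timeout(statements, timeout="5s"):
    """Prefix a batch of SQL statements with a session-level lock_timeout.

    If another session holds a conflicting lock for longer than `timeout`,
    PostgreSQL aborts the waiting statement instead of blocking forever,
    so the fixture run can be retried or surfaced as a startup error.
    """
    return [f"SET lock_timeout = '{timeout}';", *statements]


def apply_fixtures(conn, fixture_sql, timeout="5s"):
    """Run fixture statements in one transaction, failing fast on locks."""
    with conn, conn.cursor() as cur:  # psycopg2-style context managers
        for stmt in with_lock_timeout(fixture_sql, timeout):
            cur.execute(stmt)
```

A failed run then surfaces as a retryable error rather than a silently stuck process, which matters if the fixtures run at container start.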
I think I need to hear more details on what exactly is contending for the table locks. In the abstract, I'd have two tasks that can be run completely independently of each other:

1. Apply the database fixtures.
2. Run the tileserver itself.
Then, we run these tasks in sequence, one after the other. In a k8s Pod definition, (1) becomes an `initContainer` and (2) becomes a normal container. K8s will guarantee that (1) completes successfully before (2) starts. The process of applying the fixtures shouldn't step on anything else's toes (because the tileserver proper isn't running yet), and the tileserver can simply assume that the database is laid out properly when it starts (so it doesn't have to worry about sequencing start-up tasks before serving "real" requests).
@brianaydemir in this design would it also be possible to apply the fixtures "on demand" while the pods are running? Right now we tend to apply database fixtures from the command line, since they require a fairly long toolchain.
I guess a related question is — what is the right way to run migrations (including the more general ones @mwestphall is working on) in production? Is it in some sort of containerized workflow as you laid out above, ensuring that the "right" state of the database is present at all times? Or do we simply run migrations manually and update containers in coordination with that? I guess it gets complicated because different containers will depend on different "subsystems" being at their most up-to-date schema...
@davenquinn As long as the database is up and running, the only things stopping you from making changes to it are the other services trying to make use of it in its current state.
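One concrete option for the on-demand case (a sketch with placeholder names, not an established workflow here) is a one-off k8s Job that reuses the same fixture image as the init container, so operators don't need the long local toolchain:

```yaml
# Sketch: a one-off Job reusing the fixture image so fixtures can be
# applied on demand while the pods are running. Names are placeholders.
apiVersion: batch/v1
kind: Job
metadata:
  generateName: apply-fixtures-     # `kubectl create -f` appends a unique suffix
spec:
  backoffLimit: 1
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: apply-fixtures
          image: example/fixture-runner   # placeholder image
          command: ["apply-fixtures"]     # placeholder entrypoint
          envFrom:
            - secretRef:
                name: database-credentials
```

Unlike the init-container case, this run contends with live services for locks, so a short `lock_timeout` on the fixture session keeps it from hanging behind live traffic.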
I guess one thing to consider is that, the way we run K8s and services on it, we have a(nother) Git repository that declaratively specifies what the state of the services on the cluster should be. So there's a general expectation that everything needed to get a service up and running is in that YAML — and if a service requires the database to be in a particular state, the service and its definition would, ideally, ensure that.
There are several sets of database fixtures needed for the tileserver to operate correctly. These include a variety of resources, each with a different level of coupling to the tileserver Python code:

- `carto` and `maps` layers: these need to be created for the tileserver to operate properly.

However, there are a few factors to consider with this design:

- `paleogeography` layers are defined in the Macrostrat repository. There is a two-way dependency between the tileserver code and that schema.

It seems like the right approach would be to either