Open thclark opened 4 years ago
This is one of the use-cases that is fairly awkward to support in skaffold currently. I've just implemented a solution for Django migrations so I thought I'd share the hoops I had to jump through for context.
For a long time I've been using initContainers to run my DB migrations in Django. You don't want to use initContainers in your main workload Deployment, because you could have multiple replicas, and you don't want to run multiple migration jobs in parallel. So I created another deployment for my "canary" which has an initContainer running the DB migration, and only ever has one replica.
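The canary pattern described above might look roughly like this — a minimal sketch with hypothetical names (`web-canary`, `myapp`, `db-credentials` are all placeholders, not from the original report):

```yaml
# Sketch of a single-replica "canary" Deployment whose initContainer runs
# the Django migration exactly once per rollout. All names are illustrative.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-canary
spec:
  replicas: 1            # never scale this up; migrations must not run in parallel
  selector:
    matchLabels:
      app: web-canary
  template:
    metadata:
      labels:
        app: web-canary
    spec:
      initContainers:
        - name: migrate
          image: myapp:latest                  # same image as the main workload
          command: ["python", "manage.py", "migrate", "--noinput"]
          envFrom:
            - secretRef:
                name: db-credentials           # shared with the main Deployment
      containers:
        - name: web
          image: myapp:latest
          command: ["python", "manage.py", "runserver", "0.0.0.0:8000"]
```

Because the Deployment rolls out one pod at a time and the main container only starts after the initContainer succeeds, the migration acts as a gate for the rollout.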
An alternative would be to specify a k8s Job that runs the migration, but running repeat jobs in kustomize is messy (see https://github.com/kubernetes-sigs/kustomize/issues/168).
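For reference, the Job alternative would be a one-shot resource along these lines (hypothetical names; this is a sketch, not the reporter's actual config). The messiness referenced above comes from Jobs being immutable once completed, so re-deploying requires a fresh Job name or a manual delete:

```yaml
# One-shot migration Job (illustrative). Completed Jobs cannot be re-applied
# with the same name, which is the kustomize pain point linked above.
apiVersion: batch/v1
kind: Job
metadata:
  name: db-migrate
spec:
  backoffLimit: 3                # retry a few times on transient DB errors
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: myapp:latest
          command: ["python", "manage.py", "migrate", "--noinput"]
```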
Having created a separate deployment that needs to mount the same configmaps/secrets as the main workload, you then have to split the shared config out into an overlay, with app and migration each using the base and config as their bases. This is a little bit hard to reason about, but it's not terrible.
base/kustomization.yaml
overlays/env/config/kustomization.yaml
overlays/env/app/kustomization.yaml
overlays/env/migration/kustomization.yaml
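Wiring the layout listed above together might look something like this — an illustrative guess at the overlay contents, not the reporter's actual files:

```yaml
# overlays/env/app/kustomization.yaml (illustrative)
# The migration overlay would mirror this, swapping in the canary Deployment.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../../base    # shared Deployment/Service definitions
  - ../config        # shared ConfigMaps/Secrets for this environment
```

Both `app` and `migration` pull in the same `base` and `config`, which is the "each using the base and config as their bases" arrangement described above.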
I also set up two skaffold profiles, one for the app kustomization and one for the migration kustomization. It would be nice to just use multiple kustomizations (https://github.com/GoogleContainerTools/skaffold/issues/1872) in a single profile, but in general you need to wait for the first one to roll out completely before you trigger the second (you can't start your API before you've run your DB migrations). So multiple kustomizations in a single profile would only work if you could specify dependencies between the deploy steps, like "kustomization A must complete before we start deploying kustomization B".
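The two-profile setup described here could be sketched as follows (paths, names, and the schema version are assumptions, not taken from the report; `deploy.kustomize.paths` is the relevant field in the v2beta-era schema):

```yaml
# Sketch of a two-profile skaffold.yaml. The ordering constraint has to be
# enforced manually from the shell, e.g.:
#   skaffold run -p migration && skaffold run -p app
apiVersion: skaffold/v2beta12
kind: Config
build:
  artifacts:
    - image: myapp
profiles:
  - name: migration
    deploy:
      kustomize:
        paths:
          - overlays/dev/migration
  - name: app
    deploy:
      kustomize:
        paths:
          - overlays/dev/app
```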
Now duplicate those two profiles for each environment (review apps, dev, staging, prod) and you end up with a lot of ugly boilerplate. To be fair, I'm not sure it would be much nicer in Helm...
It is in principle possible to structure your migrations so that they are safe to run after the API rolls out, but that's a lot of work, and it's undesirable to contort the application / development workflow to satisfy the constraints of the infrastructure.
There might also be fruitful avenues of exploration in the new Ephemeral Containers feature: https://kubernetes.io/docs/concepts/workloads/pods/ephemeral-containers/.
Adding for discussion.
FWIW, I think that adding first-class Skaffold support for running repeated k8s Jobs might be a better/more flexible way of solving this use case than doing a kubectl exec, since in principle your Job might want to run in a different container and mount different secrets/configmaps. But just adding a field to the Skaffold profile to specify which pod:container should be the exec target could work here.
Another lens to consider this through -- a major consumer of Skaffold is Google's Cloud Code, which has VSCode and Jetbrains IDE integrations: https://cloud.google.com/code/docs/intellij/deploying-a-k8-app. A generic way of running commands in the k8s cluster that your dev environment is pointing to would be valuable for Cloud Code as well. In that case, just having a Skaffold profile for the migration job might be sufficient, as long as you could pass in an arbitrary command to that profile. But having some way to run jobs in the context of an existing profile (without needing to define an additional "job profile") would be a better developer experience. This points more towards exec/ephemeral containers as the best solution.
this will be solved by the addition of pre- and post- stage hooks, which we're finalizing a design for right now. we'll send the design out publicly for review as soon as it's ready, and we'll likely target the end of the year for shipping the feature, though this is subject to change.
@nkubala I'm not quite sure how pre/post hooks would fix this. You could use a pre hook to always apply migrations when starting, which is useful... but I don't think this is quite what I'm getting at here, in terms of the need to run a different actual command in the container?
@thclark ok I see, so it seems like you're really trying to override/append the entrypoint of the resource you're creating. do you commonly do this by changing the args in the podspec in the actual k8s yaml? if so you might be able to just use a combination of kustomize and profiles to give yourself different "entrypoints" which are almost like different modes of deploying your app.
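The kustomize-patch approach being suggested here might look roughly like this — an illustrative strategic-merge patch (hypothetical names), referenced from a profile-specific overlay, that swaps the container's command to give a separate "migrate" deploy mode:

```yaml
# overlays/dev/migrate/deployment-patch.yaml (illustrative)
# Overrides the web container's command so this overlay deploys the app
# in "run migrations" mode instead of serving traffic.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  template:
    spec:
      containers:
        - name: web
          command: ["python", "manage.py", "migrate", "--noinput"]
```

Each such "mode" needs its own overlay plus a skaffold profile pointing at it, which is where the boilerplate objection below comes from.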
I also wonder if using a post-deploy hook against a service (rather than a pod) would help here. are these commands something you can do after your "deploy" is finished, or do they need to be part of it?
> @thclark ok I see, so it seems like you're really trying to override/append the entrypoint of the resource you're creating. do you commonly do this by changing the args in the podspec in the actual k8s yaml? if so you might be able to just use a combination of kustomize and profiles to give yourself different "entrypoints" which are almost like different modes of deploying your app.
I did something like that to try and get started, but you need the flexibility of doing it at the command line, or you end up with hundreds and hundreds of profiles (one for each command you might otherwise type in a cli).
> I also wonder if using a post-deploy hook against a service (rather than a pod) would help here. are these commands something you can do after your "deploy" is finished, or do they need to be part of it?
It's not a consistent need... They need to be ad-hoc for this to be a tool for developers, as opposed to a tool to do devops with, where everything needs to be the opposite of ad-hoc! :)
Not sure how helpful the following is, but I'm conscious of being a k8s newbie, so I've tried to write a user workflow (hopefully understandable by non-django/flask/whatever developers) to explain a typical working pattern...
<develop develop develop. Need another column in a db table. Edit some code in my models.py file in django, which is what defines my database schema.>
<run server management command, which needs access to the entire context (e.g. live database services etc) to make migrations files, specifying the folder name as an argument>
<run server management command, which needs access to the entire context (e.g. live database services etc) to apply migrations to the db>
<continue developing. Oh hell, I needed that boolean column to be default false, not true...>
<shut down server, run server management command, which needs access to the entire context (e.g. live database services etc), telling it exactly the migrations I want to roll back>
<delete the migrations file, alter the model> lather, rinse, repeat...
Reducing priority since there hasn't been any activity and the team does not plan to work on it in near future.
Scenario
The (related) possibility of a skaffold exec command has been discussed before (e.g. in #252), but in that case it added little and was mostly discussed in the abstract. I feel like there are some concrete use cases for a run --command or exec style addition to the CLI, which would cause skaffold to run a specified command in place of a container's default one.
Background
I'm trying to redeploy a django app using skaffold. So I have a 'dev' setup which runs postgres, redis, pgadmin, and of course the webserver.
In django, part of the development workflow is to execute utilities with management commands. Running a management command essentially starts the application, does the thing, then stops the application.
The obvious one is runserver, and so my web-deployment.yaml file has a section for it. All I do is skaffold dev --port-forward and the database, cache, everything comes up beautifully.
But another common one is python manage.py makemigrations, which starts the app, checks the database's state, then makes database migration files. Other commands reset the database, flush caches, open up a shell into the django environment, etc etc... These require resources like the database to be up, but not the 'default' server. They may even require that no other server be running, e.g. if doing surgery on the database.
Expected behavior
Something like skaffold run --command web -- python manage.py makemigrations to start the cluster and, instead of running the default command on the web container, use the specified one.
Actual behavior
No option to do so.
Mostly, the workaround is:
skaffold dev