GoogleContainerTools / skaffold

Easy and Repeatable Kubernetes Development
https://skaffold.dev/
Apache License 2.0

Feature request: skaffold run --command (or exec) #4179

Open thclark opened 4 years ago

thclark commented 4 years ago

Scenario

The (related) possibility of a `skaffold exec` command has been discussed before (e.g. in #252), but in that case it added little and was discussed mostly in the abstract.

I feel there are some concrete use cases for a `run --command` or `exec` style addition to the CLI, which would cause skaffold to bring up the cluster and run a specified one-off command in a chosen container in place of its default one.

Background

I'm trying to develop a django app using skaffold, so I have a 'dev' setup which runs postgres, redis, pgadmin, and of course the webserver.

In django, part of the development workflow is to execute utilities with management commands. Running a management command essentially starts the application, does the thing, then stops the application.

The obvious one is `runserver`, so my web-deployment.yaml file has this section:

    metadata:
      labels:
        module: web
    spec:
      containers:
      - name: django
        image: octue/drift
        args:
          - python
          - manage.py
          - runserver
          - 0.0.0.0:8000

All I do is `skaffold dev --port-forward` and the database, cache, everything comes up beautifully.

But another common one is `python manage.py makemigrations`, which starts the app, checks the database's state, then writes database migration files. Other commands reset the database, flush caches, open up a shell into the django environment, etc etc...

These require resources like the database to be up, but not the 'default' server. They may even require that no other server be running at all, e.g. when doing surgery on the database.

Expected behavior

Something like `skaffold run --command web -- python manage.py makemigrations` to bring up the cluster and, instead of the default command on the web container, run the specified one.

Actual behavior

No option to do so.

Mostly, the workaround is:

    skaffold dev
    kubectl get pods
    kubectl exec <pod-name> -- python manage.py makemigrations

But this doesn't work for commands where no other server may be connected to the database... and it is a screaming agony of a workflow, because the pod name changes every time you type anything.

Information / Steps to reproduce the behavior

I can create a repo with a detailed example, if it helps.

A grander vision

This feature could actually propel a whole bunch more use cases for skaffold, because at that point you would have an elegant way of creating command line tools that operate with specific sets of resources locally and are trivially redeployable to the cloud. This could be great for scientific research - scientists tend not to be devops experts at all. They'll hack around some code, run, re-run, re-re-run, re-re-re-run it locally, get some cool analyses running - then face an extraordinary uphill struggle turning that into a deployed webservice. But supplying a template with skaffold in it would allow them to keep monkeying around locally for R&D purposes, while actually working on well defined, stable and portable infrastructure.
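Coming back to the workaround above: a rough shell helper (a sketch, not existing skaffold functionality) can at least paper over the changing pod name, assuming the `module: web` label from the manifest earlier in this issue:

    # Hypothetical helper: resolve the current pod by its label instead of
    # retyping the generated name, then exec the management command in it.
    POD=$(kubectl get pods -l module=web -o jsonpath='{.items[0].metadata.name}')
    kubectl exec -it "$POD" -- python manage.py makemigrations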
paultiplady commented 4 years ago

This is one of the use cases that is fairly awkward to support in skaffold currently. I've just implemented a solution for Django migrations, so I thought I'd share the hoops I had to jump through for context.

For a long time I've been using initContainers to run my DB migrations in Django. You don't want to use initContainers in your main workload Deployment, because you could have multiple replicas, and you don't want to run multiple migration jobs in parallel. So I created another deployment for my "canary" which has an initContainer running the DB migration, and only ever has one replica.
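A minimal sketch of that canary pattern, reusing the image and commands from this issue (the names here are illustrative, not from my actual setup):

    # Hypothetical single-replica "canary" Deployment: the initContainer
    # applies migrations exactly once per rollout before the app starts.
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: web-canary
    spec:
      replicas: 1                # never scaled, so migrations don't race
      selector:
        matchLabels:
          module: web-canary
      template:
        metadata:
          labels:
            module: web-canary
        spec:
          initContainers:
          - name: migrate
            image: octue/drift
            args:
            - python
            - manage.py
            - migrate
          containers:
          - name: django
            image: octue/drift
            args:
            - python
            - manage.py
            - runserver
            - 0.0.0.0:8000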

An alternative would be to specify a k8s Job that runs the migration, but running repeat jobs in kustomize is messy (see https://github.com/kubernetes-sigs/kustomize/issues/168).
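For comparison, a hedged sketch of the Job variant (again with illustrative names):

    # Hypothetical one-shot migration Job (messy to re-run via kustomize,
    # per the linked issue, but shown here for comparison).
    apiVersion: batch/v1
    kind: Job
    metadata:
      name: django-migrate
    spec:
      backoffLimit: 0            # fail fast rather than retry a half-applied migration
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: migrate
            image: octue/drift
            args:
            - python
            - manage.py
            - migrate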

Having created a separate deployment that needs to mount the same configmaps/secrets as the main workload, you'll then have to split out your shared config into an overlay, with app and migration each using the base and config as their bases (one of these kustomizations is sketched after the file list below). This is a little bit hard to reason about, but it's not terrible.

base/kustomization.yaml
overlays/env/config/kustomization.yaml
overlays/env/app/kustomization.yaml
overlays/env/migration/kustomization.yaml
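A hedged sketch of the app overlay's kustomization.yaml under that layout (the paths are assumptions, and recent kustomize versions fold `bases` into `resources`):

    # Hypothetical overlays/env/app/kustomization.yaml
    apiVersion: kustomize.config.k8s.io/v1beta1
    kind: Kustomization
    resources:
    - ../../../base            # shared workload manifests
    - ../config                # shared configmaps/secrets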

I also set up two skaffold profiles, one for the app kustomization and one for the migration kustomization. It would be nice to just use multiple kustomizations (https://github.com/GoogleContainerTools/skaffold/issues/1872) in a single profile, but in general you need to wait for the first one to roll out completely before you trigger the second (you can't start your API before you've run your DB migrations). So multiple kustomizations in a single profile would only work if you could specify dependencies between the deploy steps, like "kustomization A must complete before we start deploying kustomization B".
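A rough sketch of those two profiles, assuming a schema version in which the kustomize deployer takes a `paths` list (names and paths are illustrative):

    # Hypothetical skaffold.yaml excerpt.
    profiles:
    - name: migration
      deploy:
        kustomize:
          paths:
          - overlays/dev/migration
    - name: app
      deploy:
        kustomize:
          paths:
          - overlays/dev/app

You then have to sequence them by hand, e.g. `skaffold run -p migration && skaffold run -p app`, precisely because skaffold can't express the ordering dependency.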

Now duplicate those two profiles for each environment (review-apps, dev, staging, prod) and you have a lot of ugly boilerplate. To be fair, I'm not sure it would be much nicer in Helm...

It is in principle possible to structure your migrations so that they are safe to run after the api rolls out, but that's a lot of work, and it's undesirable to have to contort the application / development workflow to satisfy the constraints of the infrastructure.

There might also be fruitful avenues of exploration in the new Ephemeral Containers feature: https://kubernetes.io/docs/concepts/workloads/pods/ephemeral-containers/.
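For example (hedged: this feature was alpha at the time, and the command shipped as `kubectl alpha debug` before stabilising as `kubectl debug`), an ad-hoc command could run in an ephemeral container alongside the existing pod:

    # Hypothetical: attach an ephemeral container to a running pod and run
    # the management command there (pod name and image are assumed).
    kubectl debug -it web-pod --image=octue/drift -- python manage.py makemigrations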

tstromberg commented 4 years ago

Adding for discussion.

paultiplady commented 4 years ago

FWIW, I think that adding first-class Skaffold support for running repeated k8s Jobs might be a better/more flexible way of solving this use case than doing a kubectl exec, since in principle your Job might want to run in a different container and mount different secrets/configmaps. But just adding a field to the Skaffold profile to specify which pod:container should be the exec target could also work here.

Another lens to consider this through -- a major consumer of Skaffold is Google's Cloud Code, which has VSCode and Jetbrains IDE integrations: https://cloud.google.com/code/docs/intellij/deploying-a-k8-app. A generic way of running commands in the k8s cluster that your dev environment is pointing to would be valuable for Cloud Code as well. In that case, just having a Skaffold profile for the migration job might be sufficient, as long as you could pass in an arbitrary command to that profile. But having some way to run jobs in the context of an existing profile (without needing to define an additional "job profile") would be a better developer experience. This points more towards exec/ephemeral containers as the best solution.

nkubala commented 4 years ago

this will be solved by the addition of pre- and post- stage hooks, which we're finalizing a design for right now. we'll send the design out publicly for review as soon as it's ready, and we'll likely target the end of the year for shipping the feature, though this is subject to change.
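A hedged sketch of what such pre/post deploy hooks might look like in a skaffold.yaml, based on the lifecycle-hooks design that later shipped (field names may differ across schema versions):

    # Hypothetical excerpt: host-side hooks around the deploy stage.
    deploy:
      kubectl:
        manifests: ["k8s/*.yaml"]
        hooks:
          before:
          - host:
              command: ["sh", "-c", "echo about to deploy"]
          after:
          - host:
              # hypothetical: run the migration once the deploy has stabilised
              command: ["sh", "-c", "kubectl exec deploy/web -- python manage.py migrate"]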

thclark commented 4 years ago

@nkubala I'm not quite sure how pre/post hooks would fix this. You could use a pre hook to always apply migrations when starting, which is useful... But I don't think this is quite what I'm getting at here in terms of the need to run a different actual command in the container?

nkubala commented 3 years ago

@thclark ok I see, so it seems like you're really trying to override/append the entrypoint of the resource you're creating. do you commonly do this by changing the args in the podspec in the actual k8s yaml? if so you might be able to just use a combination of kustomize and profiles to give yourself different "entrypoints" which are almost like different modes of deploying your app.
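A hedged sketch of that kustomize-plus-profiles approach: an overlay carrying a strategic-merge patch that swaps the container's args for one management command (the Deployment name `web` is assumed; the manifest in this issue only shows its labels). A skaffold profile would then point at the overlay holding this patch:

    # Hypothetical overlays/dev/makemigrations/patch.yaml
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: web                # assumed name of the web Deployment
    spec:
      template:
        spec:
          containers:
          - name: django
            args:
            - python
            - manage.py
            - makemigrations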

I also wonder if using a post-deploy hook against a service (rather than a pod) would help here. are these commands something you can do after your "deploy" is finished, or do they need to be part of it?

thclark commented 3 years ago

> @thclark ok I see, so it seems like you're really trying to override/append the entrypoint of the resource you're creating. do you commonly do this by changing the args in the podspec in the actual k8s yaml? if so you might be able to just use a combination of kustomize and profiles to give yourself different "entrypoints" which are almost like different modes of deploying your app.

I did something like that to try and get started, but you need the flexibility of doing it at the command line, or you end up with hundreds and hundreds of profiles (one for each command you might otherwise type at a CLI).

> I also wonder if using a post-deploy hook against a service (rather than a pod) would help here. are these commands something you can do after your "deploy" is finished, or do they need to be part of it?

It's not a consistent need... They need to be ad-hoc for this to be a tool for developers, as opposed to a tool to do devops with, where everything needs to be the opposite of ad-hoc! :)

Not sure how helpful the following is, but I'm conscious of being a k8s newbie so have tried to write a user workflow (hopefully understandable by non-django/flask/whatever developers) to explain a typical working pattern...

<develop develop develop. Need another column in a db table. Edit some code in my models.py file in django, which is what defines my database schema.>

<run server management command, which needs access to the entire context (e.g. live database services etc) to make migrations files, specifying the folder name as an argument>

<run server management command, which needs access to the entire context (e.g. live database services etc) to apply migrations to the db>

<continue developing. Oh hell, I needed that boolean column to be default false, not true...>

<shut down server, run server management command, which needs access to the entire context (e.g. live database services etc), telling it exactly the migrations I want to roll back>

<delete the migrations file, alter the model>

lather, rinse, repeat...
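In django terms, the management commands behind those steps are roughly the following (the app label and migration number are placeholders):

    # Concrete commands for the workflow sketched above.
    python manage.py makemigrations myapp    # write migration files from models.py
    python manage.py migrate                 # apply them to the live database
    python manage.py migrate myapp 0041      # roll back to an earlier named migration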

tejal29 commented 2 years ago

Reducing priority since there hasn't been any activity and the team does not plan to work on it in the near future.