Speed up deployments by rendering and deploying concurrently

iosifnicolae2 commented 1 year ago

Most of our time is spent waiting for helm chart render and deployment to run one by one (this is exceptionally problematic when deploying/rendering tasks is taking a few minutes). We could significally reduce deployment time by processing the required skaffold files in parallel.

Expected behavior

We expect skaffold to render and deploy tasks in parallel.

Actual behavior

Each helm chart is rendered and deployed one by one..

Information

Skaffold version: v2.1.0
Operating system: macOS 13.1 (22C65)
Installed via: Homebrew
Contents of skaffold.yaml:

apiVersion: skaffold/v3
kind: Config
requires:
  - path: ./app1/skaffold.yml
  - path: ./app2/skaffold.yml
  - path: ./app3/skaffold.yml
  - path: ./app4/skaffold.yml
  - path: ./app5/skaffold.yml

...

Steps to reproduce the behavior

git clone https://github.com/iosifnicolae2/skaffold-bug
cd skaffold-bug/skaffold
skaffold run --verbosity debug (you migth need to update build.artifacts.image from each skaffold.yml files)

As you can see the Render and Deploy tasks are synchronius.


Running command: [helm --kube-context cluster.local template app2 skaffold-bug/charts/app -
Running command: [helm --kube-context cluster.local template app2 skaffold-bug/charts/app --post-renderer /opt/homebrew/bin/skaffold --set image.repository=bringes/app1 --set image.tag=XXXXX]  subtask=1 task=Render-post-renderer /opt/homebrew/bin/skaffold --set image.repository=bringes/app1 --set image.tag=XXXXX]  subtask=2 task=Render
Running command: [helm --kube-context cluster.local template app2 skaffold-bug/charts/app --post-renderer /opt/homebrew/bin/skaffold --set image.repository=bringes/app1 --set image.tag=XXXXX]  subtask=3 task=Render
...

Running command: [helm --kube-context cluster.local dep build skaffold-bug/charts/app] subtask=1 task=Deploy Running command: [helm --kube-context cluster.local dep build skaffold-bug/charts/app] subtask=2 task=Deploy Running command: [helm --kube-context cluster.local dep build skaffold-bug/charts/app] subtask=3 task=Deploy ...


It might be perfect to have an option to group certain requirements in a group which will be deployed concurrently , something like:
```yaml
apiVersion: skaffold/v3
kind: Config
requires:
  - concurrency: 3
    paths: 
     - ./app1/skaffold.yml
     - ./app2/skaffold.yml
     - ./app3/skaffold.yml
     - ./app4/skaffold.yml
  - path: ./app5/skaffold.yml

Obs! It's pretty important to be able to deploy certain requirements synchronously. Obs! For rendering the helm charts, maybe we could execute multiple tasks in parallel by default..

davidgwps commented 1 year ago

This feature would be greatly appreciated, we have a monorepo with 10+ workloads that we are deploying together to aid local development and avoid deploying ephemeral environments into GKE. The lack of concurrency is a major headache as deployments being sequential results in us waiting 10+ minutes for initial deployment by Skaffold into Minikube and 5+ minutes for iteration.

The majority of the team feedback that this is the blocker to using Skaffold full time.

nickdapper commented 1 year ago

@ericzzzzzzz Is there a way to scope this out? Would like to see if I can do this. We have over 40 microservices and need to deploy quickly.

nickdapper commented 1 year ago

Also related #5417

ericzzzzzzz commented 1 year ago

Hi @nickdapper, glad to hear that you're interested in working on this! I'll post some context later!

ericzzzzzzz commented 1 year ago

Hi @nickdapper , sorry for the late reply. We have implementation for concurrent build https://github.com/GoogleContainerTools/skaffold/blob/88e1c1734ea0f1bd79a0f24783f9565057a0e46a/pkg/skaffold/build/scheduler.go#L146-L157, I think the same approach can be apply to render, deploy as well. We would also like to have the similar flag to build-concurrency for render/deploy https://github.com/GoogleContainerTools/skaffold/blob/c3d15a41a62fbb08096a971795a1a0b5367feaa2/cmd/skaffold/app/cmd/flags.go#L621-L628

One thing a little bit challenging is to build dependency graphs for renderers and deployers, unlike build stage where artifacts dependency relationship is defined under https://github.com/GoogleContainerTools/skaffold/blob/7d346f54c50b37860d03fbebaf32b0180dfcf515/pkg/skaffold/schema/latest/config.go#L976-L977 build.artifacts.dependencies stanza directly, we need to build dependency graph based on https://github.com/GoogleContainerTools/skaffold/blob/7d346f54c50b37860d03fbebaf32b0180dfcf515/pkg/skaffold/schema/latest/config.go#L115-L122, note that the name field is an optional field, this is not reliable for build the graph.

After gathering the data we need, we can start dispatch render jobs and deploy jobs from https://github.com/GoogleContainerTools/skaffold/blob/88e1c1734ea0f1bd79a0f24783f9565057a0e46a/pkg/skaffold/render/renderer/render_mux.go#L61-L78 and https://github.com/GoogleContainerTools/skaffold/blob/88e1c1734ea0f1bd79a0f24783f9565057a0e46a/pkg/skaffold/deploy/deploy_mux.go#L120-L139

Aside the work I mentioned, we may need a little more investigation, I think it's good to start with a design doc. If you feel this may take too much time from you, we can assign this to our team member, but this won't be a top priority in the near future milestones.

aaron-prindle commented 1 year ago

@nickdapper definitely let us know how we can help support you with any work here. The Skaffold team would be super excited to help with any work on this, thanks!

nickdapper commented 1 year ago

@ericzzzzzzz Thanks for the context here. Would we apply this to delete as well?

I agree about the dependency graph. I'm not sure a top level concurrency flag will be enough though.

Example:

apiVersion: skaffold/v3
kind: Config
metadata:
  name: my app
requires:
  - path: deps/postgres/skaffold.yaml
  - path: deps/redis/skaffold.yaml
  - path: cmd/service-1/skaffold.yaml
  - path: cmd/service-2/skaffold.yaml

In this scenario, I would want a way to define that the deps are deployed in parallel however it should wait until those are up and running before proceeding to service-1 and service-2 which those also could be parallel. This means we need a concurrency flag in the skaffold.yaml file for the dependency graph to utilize.

Here's an example of what I'm thinking. Not sure if this should be defined under the profile or somewhere else. I didn't put it inside deploy because this could apply to delete as well.

Here concurrency would be defined as (0 or -1) == max, 1== sequential, >1== only go up to the limit with respecting the graph.

apiVersion: skaffold/v3
kind: Config
metadata:
  name: my-app
requires:
  - path: deps/skaffold.yaml
  - path: cmd/skaffold.yaml    
profiles:
  - name: local
    concurrency: 1

--- 
apiVersion: skaffold/v3
kind: Config
metadata:
  name: deps
requires:
  - path: postgres/skaffold.yaml
  - path: redis/skaffold.yaml
profiles:
  - name: local
    concurrency: 0

--- 
apiVersion: skaffold/v3
kind: Config
metadata:
  name: services
requires:
  - path: service-1/skaffold.yaml
  - path: service-1/skaffold.yaml
profiles:
  - name: local
    concurrency: 0

This would mean, deploy deps and services sequentially. For deps, deploy all in parallel. For services, deploy all in parallel.

Thoughts on this approach?

iosifnicolae2 commented 1 year ago

@nickdapper your solution might also work for us.

gsquared94 commented 1 year ago

I think the concurrency field needs to be set at build, render and deployer stage individually. So something like:

apiVersion: skaffold/v3
kind: Config
build:
  ~~
  concurrency: 0
manifests:
  ~~
  concurrency:1
deploy:
  ~~
  concurrency:0

Also, for build there's an obvious number of artifacts so setting an integer concurrency level makes more sense than in render or deploy where it might not be apparent how many resources are exactly being worked on and what should concurrency control. In that case, should it just be something like:

apiVersion: skaffold/v3
kind: Config
build:
  ~~
  concurrency: 0
manifests:
  ~~
  runParallel: true
deploy:
  ~~
  runParallel: false

iosifnicolae2 commented 1 year ago

Any idea if this functionality will be implemented?

altais commented 1 year ago

Same issue here, would love to see this feature implemented

matake01 commented 9 months ago

+1

netcompany-runeviumsondergaard commented 3 months ago

+1

mepi262 commented 6 days ago

@ericzzzzzzz Any update on this issue?

GoogleContainerTools / skaffold