GoogleContainerTools / skaffold

Easy and Repeatable Kubernetes Development
https://skaffold.dev/
Apache License 2.0
15.04k stars 1.62k forks source link

--iterative-status-check broken in 1.31.0 (modules are always deployed sequentially, and not all at once) #6564

Closed akostic-kostile closed 3 years ago

akostic-kostile commented 3 years ago

Expected behavior

When doing skaffold dev --module one,two,three --iterative-status-check=false I would expect all modules to be deployed at once, and not sequentially.

Actual behavior

Modules are deployed sequentially regardless of value of --iterative-status-check

Information

Can't paste the yaml as it belongs to the company I work for, but the project is composed of 7 modules out of which 4 are being built and deployed, while 3 are third party, and they are only being deployed.

Steps to reproduce the behavior

  1. skaffold dev --module one,two,three
  2. Observe the modules being deployed one after another

This is a regression bug, this worked fine in 1.30.0

aaron-prindle commented 3 years ago

I was able to reproduce this regression using skaffold/examples/multi-config-microservices comparing v1.31.0 and v1.29.0. I saw the same regression in v1.30.0 which is different that what the bug states which is that this started in v1.31.0 so I'm not sure what might be different in the environments, etc.:

Repro steps:

See parallel deploy in v1.29.0 (works as intended):

$ cd skaffold/examples/multi-config-microservices
$ skaffold-v1.29.0 dev --module app-config,web-config --iterative-status-check=false
...<dev output>
Tags used in deployment:
 - base -> base:5463cd546c093eb8af26dc6794a339efe54d3b3a6b04e7433f4b7ab9cedf1380
 - leeroy-app -> leeroy-app:e22b15e750340ae6504e808b7d7197cbe121d315ddb0a6519ae014e40e8327e7
 - leeroy-web -> leeroy-web:e9c010ecf5fc434475fdd43f4b67ee740742d2bec68532f4001a382d82775897
Starting deploy...
 - service/leeroy-app created
 - deployment.apps/leeroy-app created
 - deployment.apps/leeroy-web created
Waiting for deployments to stabilize...
 - deployment/leeroy-app is ready. [1/2 deployment(s) still pending]
 - deployment/leeroy-web is ready.
Deployments stabilized in 2.107 seconds

See iterative deploy in v1.30.0 and v1.31.0 (regression):

$ skaffold-v1.30.0 dev --module app-config,web-config --iterative-status-check=false
...<dev output>
Tags used in deployment:
 - base -> base:fd993570f7ccf2605c30056384cf58d2f39bcaaca52b9fce58ecff975e4a93fe
 - leeroy-app -> leeroy-app:8de748e494e1b87094a18fce38723f879bac383249ff9934f26ccb57167c0430
 - leeroy-web -> leeroy-web:752d2f9b5aaeb9b46270bb87a40ba5157e8e1fb9ab90012309e6335aba01cbfb
Starting deploy...
 - service/leeroy-app created
 - deployment.apps/leeroy-app created
Waiting for deployments to stabilize...
 - deployment/leeroy-app is ready.
Deployments stabilized in 2.111 seconds
 - deployment.apps/leeroy-web created
Waiting for deployments to stabilize...
 - deployment/leeroy-web is ready.
Deployments stabilized in 2.098 seconds
Waiting for deployments to stabilize...
Deployments stabilized in 6.594398ms
$ skaffold-v1.31.0 dev --module app-config,web-config --iterative-status-check=false
...<dev output>
Tags used in deployment:
 - base -> base:76ff994c2aee2882845e2dda1b24e79e9bce8758e010fb6aa0b2cec48c5879dc
 - leeroy-app -> leeroy-app:b484c2779e6f27f159f621ca8e3dfd6c423cee6096668e037b320ebc70da7a3e
 - leeroy-web -> leeroy-web:d4aa291e4144dc54aa225d9d32fce70f69fea82cb89da9985a2f157f79d6d3ee
Starting deploy...
 - service/leeroy-app created
 - deployment.apps/leeroy-app created
Waiting for deployments to stabilize...
 - deployment/leeroy-app is ready.
Deployments stabilized in 2.117 seconds
 - deployment.apps/leeroy-web created
Waiting for deployments to stabilize...
 - deployment/leeroy-web is ready.
Deployments stabilized in 2.096 seconds
Waiting for deployments to stabilize...
Deployments stabilized in 6.776904ms

In poking around this PR may be the culprit, still trying to identify if/why though: https://github.com/GoogleContainerTools/skaffold/pull/6376

This change specifically touches the iterativeStatusCheck logic here: https://github.com/GoogleContainerTools/skaffold/commit/84367d9a5a84552015f649260145d7e0fd315ecb#diff-e2e345820b1fa62f4570a4540a6832aa621e5c90216151e1f8cc283d46919405R125

@gsquared94 can you tell if the change above may have caused this regression?

Kelvin-M commented 3 years ago

Thank you @akostic-kostile for the issue and @aaron-prindle for the investigation ! In my setup I have something like :

deploy:
  helm:
    releases:
      - name: release-name
        chartPath: charts/path/
        valuesFiles: [ examples/values.yaml ]
  kubectl:
    manifests:
      - examples/serviceAccounts.yaml
      - examples/secrets.yaml

With skaffold 1.31.0, I'm not able to do a skaffold run properly as it does iterative status check even if the value is set to false. The helm release deployed depends on kubectl manifests to stabilize. Consequently I have to ignore all status-checks to be able to deploy (skaffold run --status-check=false) It worked previously with 1.29, but I don't remember if there was any issue with 1.30. Hope it will help for your investigation and troubleshooting :)

gsquared94 commented 3 years ago

In poking around this PR may be the culprit, still trying to identify if/why though:

6376

This change specifically touches the iterativeStatusCheck logic here: 84367d9#diff-e2e345820b1fa62f4570a4540a6832aa621e5c90216151e1f8cc283d46919405R125

@gsquared94 can you tell if the change above may have caused this regression?

Yes, this is a regression caused by #6376. I'll send out a fix shortly

akostic-kostile commented 3 years ago

I double checked my findings and I can confirm 1.30.0 works correctly for me. The only meaningful difference I can find with configuration that @aaron-prindle was using was the fact that my config has these 2 set: deploy: statusCheck: true statusCheckDeadlineSeconds: 300 In any case I'm glad we've isolated the issue and @gsquared94 will fix it up. :)