Shippable / support

Shippable SaaS customers can report issues and feature requests in this repository

Job runs triggered by resource updates stuck indefinitely in waiting state #5123

Closed redgoat650 closed 4 years ago

redgoat650 commented 4 years ago

We have a job that is triggered by an update to a resource. It had been working as expected until yesterday. Now, any run triggered by that resource update sits in the queue in the "waiting" state indefinitely, even when nodes are idle (BYON).

Triggering the job manually works fine; the manual job executes immediately even if there are other runs waiting in the queue from the resource trigger. Clearing the queued "waiting" jobs did not resolve the issue; new triggers still aren't getting picked up for execution.

The YAML for the problematic job looks something like this:

name: problematic-job
type: runCLI
steps:
  - IN: setup-params
  - IN: our_repo
    switch: 'off'
  - IN: resource-1
    switch: 'off'
  - IN: resource-2
    switch: 'off'
  - IN: resource-3
    switch: 'on'
  - TASK:
      - script: ...

where we intend that the job should be triggered by a change in resource-3.

Strangely, we have two other jobs that trigger in parallel off the same resource that are being picked up and executed as expected. They look like this:

name: working-as-expected-job
type: runCLI
dependencyMode: strict
steps:
  - IN: cluster-params
  - IN: setup-params
  - IN: our_repo
    switch: 'off'
  - IN: resource-3
    switch: 'on'
  - TASK:
      - script: ...

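The major differences visible in the YAML are that the working jobs set dependencyMode: strict, take an additional cluster-params input, and do not take resource-1 or resource-2 as inputs.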

Any suggestions on how to remedy this issue or what might be causing it?

redgoat650 commented 4 years ago

Update: found the reason. A downstream job was stuck processing. Once I stopped the downstream job, this job kicked off. The problematic job was connected to that downstream job; the working-as-expected jobs were not.

http://docs.shippable.com/platform/workflow/job/overview/#dependencymode-chrono

However, the job won't run at the same time as any of its directly connected upstream or downstream jobs.
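
For anyone hitting the same symptom, the rule quoted above can be modelled in a few lines of Python. This is only an illustrative sketch and not Shippable's actual scheduler: the `EDGES` list, the `downstream-job` name, and the `neighbours` and `can_start` functions are all hypothetical, and it assumes the problematic job is the only one of the two connected to the stuck downstream job, as described in the update.

```python
# Illustrative model only; this is NOT Shippable's real scheduling code.
# Each pair is (upstream job, downstream job). "downstream-job" is a
# hypothetical name for the job that was stuck processing.
EDGES = [("problematic-job", "downstream-job")]


def neighbours(job, edges):
    """Return the jobs directly connected to `job`, upstream or downstream."""
    connected = set()
    for upstream, downstream in edges:
        if upstream == job:
            connected.add(downstream)
        elif downstream == job:
            connected.add(upstream)
    return connected


def can_start(job, processing, edges=EDGES):
    """A queued run may start only if no directly connected job is processing."""
    return not (neighbours(job, edges) & set(processing))


if __name__ == "__main__":
    stuck = {"downstream-job"}
    # The connected job stays in "waiting" while its neighbour is processing.
    print(can_start("problematic-job", stuck))
    # A job with no connection to the stuck job is free to run.
    print(can_start("working-as-expected-job", stuck))
    # Once the downstream job is stopped, the waiting run can start.
    print(can_start("problematic-job", set()))
```

Under this model, a run that waits forever on idle nodes points to a directly connected job that never left the processing state, so checking the status of the job's immediate neighbours is a reasonable first step.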