concourse / pool-resource

atomically manages the state of the world (e.g. external environments)
Apache License 2.0

Error releasing pool resource if build is cancelled before it acquires lock #20

Closed xtremerui closed 8 years ago

xtremerui commented 8 years ago

Hi, we have been observing this behaviour for a while. Our use case: sometimes we have many builds pending, and each newly triggered build waits to acquire a lock on a resource until the previous build finishes. Since our build takes about 45 minutes, we sometimes want to manually cancel the older builds (to free up resources) so that the latest build can run. We then found that those older builds, even though Concourse shows them as cancelled, still put a lock on the next available resource and fail to release it correctly. The resource ends up locked forever, until we go to GitHub and manually move it from claimed to unclaimed.

The following screenshot shows that build 31 was cancelled while it was waiting for a resource, and that it then showed an error while releasing it. Ideally this would be fine, since build 31 should not have interacted with any resource. But as the next screenshot shows, that was not the case.

[Screenshot: screen shot 2016-07-06 at 1 21 26 pm]

This shows that build 31 actually put a lock on one of our resources (named skor) and never released it. We had to release it manually in this case.

[Screenshot: screen shot 2016-07-06 at 1 22 18 pm]
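For reference, the manual release described above (moving the lock from claimed to unclaimed) can be sketched roughly as follows. This is only an illustration, not our actual tooling. It assumes a local checkout of the lock repository laid out as `<pool>/claimed/<lock>` and `<pool>/unclaimed/<lock>`. The pool name `environments` and the helper `release_lock` are hypothetical; only the lock name `skor` comes from our setup.

```python
import tempfile
from pathlib import Path


def release_lock(repo_root, pool, lock):
    """Move <pool>/claimed/<lock> to <pool>/unclaimed/<lock> in a local checkout.

    Hypothetical helper: it only touches the working tree. The change still
    has to be committed and pushed to the lock repository afterwards.
    """
    claimed = Path(repo_root) / pool / "claimed" / lock
    unclaimed = Path(repo_root) / pool / "unclaimed" / lock
    if not claimed.is_file():
        raise FileNotFoundError(f"lock {lock!r} is not claimed in pool {pool!r}")
    unclaimed.parent.mkdir(parents=True, exist_ok=True)
    claimed.rename(unclaimed)
    return unclaimed


def demo():
    """Build a throwaway directory mimicking the assumed layout and release 'skor'."""
    with tempfile.TemporaryDirectory() as tmp:
        pool_dir = Path(tmp) / "environments"  # hypothetical pool name
        (pool_dir / "claimed").mkdir(parents=True)
        (pool_dir / "unclaimed").mkdir(parents=True)
        (pool_dir / "claimed" / "skor").write_text("")  # lock stuck as claimed
        release_lock(tmp, "environments", "skor")
        still_claimed = sorted(p.name for p in (pool_dir / "claimed").iterdir())
        now_unclaimed = sorted(p.name for p in (pool_dir / "unclaimed").iterdir())
        return still_claimed, now_unclaimed


if __name__ == "__main__":
    print(demo())
```

In a real checkout the move would be followed by a commit and push, so that the pool resource sees the lock as unclaimed again.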

The following is our job configuration:

  - name: deploy-push-tile-internal-17
    plan:
      - do:
        - aggregate:
          - get: pcf-automation
          - get: environment-configuration
          - get: push-pivotal-package
            trigger: true
          - get: gem-credentials
          - get: push-smoke-tests
          - get: push-stemcell-latest
          - put: env-17
            params:
              acquire: true
          - get: push-service-ci
          - get: toronto-ci-credentials
        - task: deploy-env
          file: push-service-ci/pipelines/jobs/tile-deploy.yml
          params:
            CF_VERSION: "1.7"
            PUSH_FILENAME: push-pivotal-package/*.pivotal
            PUSH_STEMCELL_PATH: push-stemcell-latest
        - task: smoke-tests
          file: push-service-ci/pipelines/jobs/run-smoke-tests.yml
        ensure:
          task: delete-tile
          file: push-service-ci/pipelines/jobs/tile-delete.yml
          params:
            CF_VERSION: "1.7"
          ensure:
            put: env-17
            params:
              release: env-17

We are using Concourse 1.3.1 and are not sure how cancelling a build affects the pool resource. Please take a look. Thanks!

concourse-bot commented 8 years ago

Hi there!

We use Pivotal Tracker to provide visibility into what our team is working on. A story for this issue has been automatically created.

The current status is as follows:

This comment, as well as the labels on the issue, will be automatically updated as the status in Tracker changes.

Pirolf commented 8 years ago

@xtremerui seems related to https://www.pivotaltracker.com/story/show/121378389

xtremerui commented 8 years ago

@Pirolf thanks for pointing it out. It looks like this fix is in Concourse 1.4. I will update and check whether it works.

xtremerui commented 8 years ago

After updating to Concourse 1.4, this issue is gone. Cancelled builds no longer put a lock on the resource. Closing this issue. Thank you!