Open ohkyle opened 2 years ago
A potential fix for this bug
.*${job_name}.*[[:space:]]not[[:space:]]monitored[[:space:]]
I have attached a patch as well (since I do not have push rights to the repo) 0001-monit_utils-Add-support-for-more-than-10-cc-workers.txt
Thanks for submitting an issue to
capi-release
. We are always trying to improve! To help us, please fill out the following template.Issue
cloud_controller_worker pre-backup-lock hangs when there are 10 or more cloud_controller_workers
Context
We ran into issues when trying use bbr to backup our cf deployment.
We deployed cf with this ops file:
We observed these logs
On the vm we see
On the vm we also see
Steps to Reproduce
type: replace path: /instance_groups/name=cc-worker/jobs/name=cloud_controller_worker/properties/cc/jobs?/generic?/number_of_workers? value: 11
type: replace path: /instance_groups/name=cc-worker/vm_type value: medium
type: replace path: /releases/- value: name: backup-and-restore-sdk sha1: 238c36f2229f303ebf96f6b24b29799232195e38 url: https://bosh.io/d/github.com/cloudfoundry-incubator/backup-and-restore-sdk-release?v=1.18.52 version: 1.18.52
type: replace path: /instance_groups/- value: azs:
type: replace path: /instance_groups/name=api/jobs/name=routing-api/properties/release_level_backup? value: true
$ bbr deployment --deployment cf backup
... [bbr] 2022/09/13 15:31:55 INFO - Finished locking cloud_controller_clock on scheduler/b468a5cd-130a-442a-bcab-bfbc40cb4ab5 for backup. [bbr] 2022/09/13 15:32:03 INFO - Finished locking cloud_controller_ng on api/dc671209-452f-43c9-8077-3927b614ffad for backup. [bbr] 2022/09/13 15:32:06 INFO - Finished locking cloud_controller_ng on api/5859cb84-76ed-492f-b03f-023532aa352e for backup.
Expected result
We expected the command in step 2 to succeed.
Current result
Currently the command is failing.
Possible Fix
This piece of code
seems to assume that there will always be less than 10 cloud_controller_workers.
On our vm with 11 workers