cloudfoundry-incubator / bosh-oracle-cpi-release

Other
2 stars 2 forks source link

attachVolumeConflict error while updating concourse deployment #33

Open dmutreja opened 6 years ago

dmutreja commented 6 years ago

From @dmutreja on February 27, 2018 19:13

Encountered "Error: CPI error 'Bosh::Clouds::CloudError' with message 'Error attaching volume: [POST /volumeAttachments/][409] attachVolumeConflict &{Code:0xc4204418e0 Message:0xc420441900}' in 'attach_disk' CPI method"

Logging this issue for further investigation.

Copied from original issue: oracle/bosh-oracle-cpi#23

dmutreja commented 6 years ago
dmutreja commented 6 years ago

@dmutreja Commented

ubuntu@bosh-cli:~/bosh/deployments/concourse$ bosh2 deploy -d concourse concourse.yml  -l ../bosh-deployment/director-env-vars.yml --vars-store ../bosh-deployment/creds.yml -o credhub-tls-cert-auth.yml
Using environment '10.0.3.2' as client 'admin'

Using deployment 'concourse'

  instance_groups:
  - name: web
    jobs:
    - name: atc
      properties:
+       credhub:
+         client_id: "<redacted>"
+         client_secret: "<redacted>"
+         tls:
+           ca_cert: "<redacted>"
+         url: "<redacted>"

Continue? [yN]: y

Task 191

Task 191 | 19:02:16 | Preparing deployment: Preparing deployment (00:00:00)
Task 191 | 19:02:16 | Preparing package compilation: Finding packages to compile (00:00:00)
Task 191 | 19:02:16 | Updating instance db: db/46565c97-89ba-42bd-a958-f137f87c0491 (0) (canary)
Task 191 | 19:02:16 | Updating instance worker: worker/9fea972c-c77e-4bda-a2ef-ea92dbf963d0 (0) (canary)
Task 191 | 19:02:16 | Updating instance web: web/67d0e0a7-375b-482e-9082-58c1a1c480e6 (0) (canary) (00:02:11)
Task 191 | 19:05:47 | Updating instance db: db/46565c97-89ba-42bd-a958-f137f87c0491 (0) (canary) (00:03:31)
                    L Error: CPI error 'Bosh::Clouds::CloudError' with message 'Error attaching volume: [POST /volumeAttachments/][409] attachVolumeConflict  &{Code:0xc4204418e0 Message:0xc420441900}' in 'attach_disk' CPI method
Task 191 | 19:09:12 | Updating instance worker: worker/9fea972c-c77e-4bda-a2ef-ea92dbf963d0 (0) (canary) (00:06:56)
Task 191 | 19:09:12 | Error: CPI error 'Bosh::Clouds::CloudError' with message 'Error attaching volume: [POST /volumeAttachments/][409] attachVolumeConflict  &{Code:0xc4204418e0 Message:0xc420441900}' in 'attach_disk' CPI method

Task 191 Started  Tue Feb 27 19:02:16 UTC 2018
Task 191 Finished Tue Feb 27 19:09:12 UTC 2018
Task 191 Duration 00:06:56
Task 191 error

Updating deployment:
  Expected task '191' to succeed but state is 'error'

Exit code 1
dmutreja commented 6 years ago

@dmutreja Commented bosh cck after the failure reports missing db vm as expected. Recreated the vm (option 4 in bosh cck output) and noticed in OCI console that the new VM is successfully attached to the existing storage volume, yet the CPI reports the same attachment error.

dmutreja commented 6 years ago

@dmutreja Commented Looks like director issued two calls to attach_disk for the same vm and disk in the same task. The first one succeeded but the second one failed because the disk(storage volume) was already attached. Debug output from task 207 (created by bosh cck) has:

External CPI sending request: {"method":"create_vm","arguments":["cda0c2dc-21e1-46ba-aa54-ff7f2e3c4e06","ocid1.image.oc1.phx.aaaaaaaacstmldcbolhbdin65itukjnhpactsq36y3xl6bvc2wv3zaif5roq",{"availability_domain":"<redacted>","instance_shape":"<redacted>"},{"default":{"ip":"10.0.11.3","netmask":"255.255.255.0","cloud_properties":{"subnet_name":"concourse-subnet-ad3","vcn":"bosh_vcn"},"default":["dns","gateway"],"dns":["8.8.8.8"],"gateway":"10.0.11.1"}},["ocid1.volume.oc1.phx.abyhqljrnbqwzyavm7jc27ekqbd4rovkyk2d47lsricmx3y6udtobyums5ta"],{"bosh":{"password":"<redacted>","group":"bosh-director-concourse-db","groups":["bosh-director","concourse","db","bosh-director-concourse","concourse-db","bosh-director-concourse-db"]}}],"context":{"director_uuid":"bb607c46-f076-4ea9-a68e-a97a2b1d7bc8","request_id":"576658"}} with command: /var/vcap/jobs/oracle_cpi/bin/cpi
D, [2018-02-27 19:28:27 #19103] [task:207] DEBUG -- DirectorJobRunner: External CPI got response: {"result":"ocid1.instance.oc1.phx.abyhqljrxhckyqbbhchiu77nni5vzc3tyomx6vxwllb43calq2r4josn43pq","error":null,"log":""}, err: , exit_status: pid 19129 exit 0
D, [2018-02-27 19:28:27 #19103] [task:207] DEBUG -- DirectorJobRunner: External CPI sending request: {"method":"set_vm_metadata","arguments":["ocid1.instance.oc1.phx.abyhqljrxhckyqbbhchiu77nni5vzc3tyomx6vxwllb43calq2r4josn43pq",{"director":"bosh-director","deployment":"concourse","id":"46565c97-89ba-42bd-a958-f137f87c0491","job":"db","index":"0","name":"db/46565c97-89ba-42bd-a958-f137f87c0491","created_at":"2018-02-27T19:28:27Z"}],"context":{"director_uuid":"bb607c46-f076-4ea9-a68e-a97a2b1d7bc8","request_id":"468940"}} with command: /var/vcap/jobs/oracle_cpi/bin/cpi
D, [2018-02-27 19:28:58 #19103] [task:207] DEBUG -- DirectorJobRunner: External CPI got response: {"result":null,"error":null,"log":""}, err: , exit_status: pid 19170 exit 0
D, [2018-02-27 19:29:33 #19103] [task:207] DEBUG -- DirectorJobRunner: External CPI sending request: {"method":"attach_disk","arguments":["ocid1.instance.oc1.phx.abyhqljrxhckyqbbhchiu77nni5vzc3tyomx6vxwllb43calq2r4josn43pq","ocid1.volume.oc1.phx.abyhqljrnbqwzyavm7jc27ekqbd4rovkyk2d47lsricmx3y6udtobyums5ta"],"context":{"director_uuid":"bb607c46-f076-4ea9-a68e-a97a2b1d7bc8","request_id":"259075"}} with command: /var/vcap/jobs/oracle_cpi/bin/cpi
D, [2018-02-27 19:29:41 #19103] [task:207] DEBUG -- DirectorJobRunner: External CPI got response: {"result":null,"error":null,"log":""}, err: , exit_status: pid 20500 exit 0
D, [2018-02-27 19:29:41 #19103] [task:207] DEBUG -- DirectorJobRunner: External CPI sending request: {"method":"attach_disk","arguments":["ocid1.instance.oc1.phx.abyhqljrxhckyqbbhchiu77nni5vzc3tyomx6vxwllb43calq2r4josn43pq","ocid1.volume.oc1.phx.abyhqljrnbqwzyavm7jc27ekqbd4rovkyk2d47lsricmx3y6udtobyums5ta"],"context":{"director_uuid":"bb607c46-f076-4ea9-a68e-a97a2b1d7bc8","request_id":"677187","apikey":"<redacted>","compartment":"<redacted>","fingerprint":"<redacted>","region":"<redacted>","tenancy":"<redacted>","user":"<redacted>"}} with command: /var/vcap/jobs/oracle_cpi/bin/cpi
D, [2018-02-27 19:29:42 #19103] [task:207] DEBUG -- DirectorJobRunner: External CPI got response: {"result":null,"error":{"type":"Bosh::Clouds::CloudError","message":"Error attaching volume: [POST /volumeAttachments/][409] attachVolumeConflict  \u0026{Code:0xc4203641c0 Message:0xc4203641d0}","ok_to_retry":false}, [snip]
D, [2018-02-27 19:29:42 #19103] [task:207] DEBUG -- DirectorJobRunner: External CPI sending request: {"method":"delete_vm","arguments":["ocid1.instance.oc1.phx.abyhqljrxhckyqbbhchiu77nni5vzc3tyomx6vxwllb43calq2r4josn43pq"],"context":{"director_uuid":"bb607c46-f076-4ea9-a68e-a97a2b1d7bc8","request_id":"979665"}} with command: /var/vcap/jobs/oracle_cpi/bin/cpi
D, [2018-02-27 19:30:39 #19103] [task:207] DEBUG -- DirectorJobRunner: External CPI got response: {"result":null,"error":null,"log":""}, err: , exit_status: pid 20711 exit 0
D, [2018-02-27 19:30:39 #19103] [task:207] DEBUG -- DirectorJobRunner: External CPI sending request: {"method":"delete_vm","arguments":["ocid1.instance.oc1.phx.abyhqljrxhckyqbbhchiu77nni5vzc3tyomx6vxwllb43calq2r4josn43pq"],"context":{"director_uuid":"bb607c46-f076-4ea9-a68e-a97a2b1d7bc8","request_id":"931913","apikey":"<redacted>","compartment":"<redacted>","fingerprint":"<redacted>","region":"<redacted>","tenancy":"<redacted>","user":"<redacted>"}} with command: /var/vcap/jobs/oracle_cpi/bin/cpi
D, [2018-02-27 19:30:40 #19103] [task:207] DEBUG -- DirectorJobRunner: External CPI got response: {"result":null,"error":null,"log":""}, err: , exit_status: pid 22057 exit 0

Notice two calls to attach_disk right after one another The first one succeeds and the second one fails.