cloudfoundry-incubator / bosh-oracle-cpi-release

Intermittent failure in ocitests suite running in release-pipeline #25

Closed dmutreja closed 6 years ago

dmutreja commented 6 years ago

--- FAIL: Test_VmOpsUpdateMultipleInstancesConcurrently (449.87s)
        assertions.go:34: Unexpected failure in updateInstance no consumer: "text/html"

dmutreja commented 6 years ago

The error seems to be coming from https://github.com/oracle/bosh-oracle-cpi-release/blob/cbdb473da8f12bb1de916e0e14d8b4de5896b9ad/src/github.com/oracle/bosh-oracle-cpi/vendor/github.com/go-openapi/runtime/client/runtime.go#L311, which reads the content type from the HTTP response header (for the updateInstance request) and doesn't know how to handle content type "text/html".

It is unclear why the compute service intermittently returns that content type. The issue has shown up twice in the last 7 days or so. Perhaps we could set up the suite to run with the openapi client Debug enabled, which would dump full headers for further investigation.
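For illustration, the content-type dispatch that produces this kind of error can be sketched roughly as below. This is a simplified stand-in using only the standard library, not the vendored go-openapi code: the `consumer` type, the `pickConsumer` function, and the registry contents are hypothetical names chosen for the example. The point is only that a client which has registered a JSON decoder has nothing to hand an HTML body to.

```go
package main

import (
	"fmt"
	"mime"
)

// consumer is a hypothetical stand-in for a response-body decoder.
type consumer func(body []byte) (string, error)

// pickConsumer selects a decoder by the media type found in the
// Content-Type header value. It fails when no decoder is registered
// for that media type, which is the situation seen in the test log.
func pickConsumer(consumers map[string]consumer, contentType string) (consumer, error) {
	// Strip parameters such as "; charset=utf-8" before the lookup.
	mediaType, _, err := mime.ParseMediaType(contentType)
	if err != nil {
		return nil, fmt.Errorf("parse content type %q: %w", contentType, err)
	}
	c, ok := consumers[mediaType]
	if !ok {
		return nil, fmt.Errorf("no consumer: %q", contentType)
	}
	return c, nil
}

func main() {
	// Assumption for the example: only a JSON decoder is registered.
	consumers := map[string]consumer{
		"application/json": func(b []byte) (string, error) { return string(b), nil },
	}

	_, err := pickConsumer(consumers, "text/html")
	fmt.Println(err) // prints: no consumer: "text/html"
}
```

An HTML body typically comes from something in front of the API (a proxy or an error page) rather than from the API itself, which is why dumping the full response headers is a reasonable next step.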

dmutreja commented 6 years ago

The problem actually appears to be in the test code.

Commit https://github.com/oracle/bosh-oracle-cpi-release/commit/57541e126712e337c3f5ab7c05d27996a471e18f cleaned up the TerminateInstance logic, which results in cleaner debug output.

This error occurred again in build 297 of the develop branch, but this time the test debug output is much cleaner. The test debug log has:

1324 [OCIVMOperations] 2018/03/19 05:52:25 INFO - Deleting VM ocid1.instance.oc1.phx.abyhqljt2x4faoazvflzdlts5yv5in2hneyxcnpd3vokx3ua6cbko3qap56a...
1325 [OCIVMOperations] 2018/03/19 05:52:25 DEBUG - Waiting for VNIC attachment ocid1.vnicattachment.oc1.phx.abyhqljtud6bfyajspzurnnfutkrd4k2btj5gt2b5me7aznm7nvvgysfkcta to be detached...

[snip]

1883 --- FAIL: Test_VmOpsUpdateMultipleInstancesConcurrently (453.07s)
1884         assertions.go:34: Unexpected failure when updating instance ocid1.instance.oc1.phx.abyhqljt2x4faoazvflzdlts5yv5in2hneyxcnpd3vokx3ua6cbko3qap56a.  Error = [no consumer: "text/html"]

which suggests that a delete instance request was issued while the instance was being updated.

The test code is at https://github.com/oracle/bosh-oracle-cpi-release/blob/cbdb473da8f12bb1de916e0e14d8b4de5896b9ad/src/github.com/oracle/bosh-oracle-cpi/oci/test/vm_instance_test.go#L192-L197. There is a race between the deferred state.TearDown (which deletes all instances) and the updateInstance goroutines that are still updating those instances.
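One common way to close this kind of race is to have the test wait for every update goroutine before it returns, so the deferred teardown cannot overlap with an update. The sketch below shows that pattern with `sync.WaitGroup`; it is not taken from the repository, and `fakeState`, `tearDown`, `updateInstance`, `updateAll`, and `runTest` are hypothetical stand-ins for the real fixture and test.

```go
package main

import (
	"fmt"
	"sync"
)

// fakeState is a hypothetical stand-in for the test fixture that owns the instances.
type fakeState struct {
	mu      sync.Mutex
	deleted bool
	updates int
}

// tearDown marks all instances as deleted.
func (s *fakeState) tearDown() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.deleted = true
}

// updateInstance fails if the instance was already deleted.
func (s *fakeState) updateInstance(id int) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.deleted {
		return fmt.Errorf("instance %d already deleted", id)
	}
	s.updates++
	return nil
}

// updateAll starts one goroutine per instance and blocks until all of them finish.
func updateAll(s *fakeState, n int) []error {
	var wg sync.WaitGroup
	errs := make([]error, n) // each goroutine writes only its own slot
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			errs[i] = s.updateInstance(i)
		}(i)
	}
	wg.Wait() // without this, the caller's deferred tearDown could run mid-update
	return errs
}

// runTest mirrors the shape of the test: deferred teardown, concurrent updates.
func runTest() (int, []error) {
	s := &fakeState{}
	defer s.tearDown() // runs only after updateAll has returned
	errs := updateAll(s, 3)
	return s.updates, errs
}

func main() {
	updates, errs := runTest()
	failed := 0
	for _, err := range errs {
		if err != nil {
			failed++
		}
	}
	fmt.Printf("updates=%d failed=%d\n", updates, failed)
}
```

Collecting errors into a slice and asserting on them after `wg.Wait()` also keeps test assertions out of the goroutines themselves.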

dmutreja commented 6 years ago

https://github.com/oracle/bosh-oracle-cpi-release/commit/46d0e0b4c60e61cfab638c7d4f8e6ad9c25b4227 should fix it. Will reopen if it shows up again.