cloudfoundry / cloud_controller_ng

Cloud Foundry Cloud Controller
Apache License 2.0

Cloud Controller retries service instance creation, which results in duplicate service instances being provisioned (BOSH deployments for that instance) #975

Closed kamath-prasad closed 6 years ago

kamath-prasad commented 7 years ago

Issue

We are observing that a single "cf create-service" command via the CF CLI results in multiple PUT /v2/service_instances/:instance_guid calls to the service broker.

Steps to Reproduce

We are not exactly clear on how to reproduce this. On first analysis, however, we observe that the issue occurs whenever the broker does not return an HTTP 202 to the PUT /v2/service_instances/:instance_guid call.

Expected result

Cloud Controller should not retry automatically.

Current result

Cloud Controller retries automatically within a short span of time, which results in duplicate BOSH deployments being provisioned on the IaaS for the same service instance in CC.

Possible Fix

Provide a configuration option in CC with which we can disable the retries, or increase the gap between retries.

[Screenshot: 2 calls from CC to broker for create]

[Screenshot: 3 calls from CF to service broker for create]

Let us know if you need more details.

cf-gitbot commented 7 years ago

We have created an issue in Pivotal Tracker to manage this:

https://www.pivotaltracker.com/story/show/152636793

The labels on this github issue will be updated when the story is started.

Gerg commented 7 years ago

Hello @kamath-prasad,

What is the broker returning for the first request? In the logs I see HTTP/1.1" - -, which doesn't include a response code. Here is the documentation for valid response codes for a service broker provision: https://github.com/openservicebrokerapi/servicebroker/blob/v2.13/spec.md#response-2

Note that brokers are expected to return a 200 OK or 409 Conflict if a service instance with that guid already exists.
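To illustrate the point about 200 OK and 409 Conflict, here is a minimal Python sketch of a broker that deduplicates provision requests by instance guid, so that a repeated PUT does not start a second deployment. Everything in it is hypothetical (the InMemoryBroker class, its fields, and the mark_succeeded helper are not part of any real broker or of Cloud Controller), and returning 202 for a repeated request while the first is still in progress is an assumption based on my reading of the spec linked above.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class InMemoryBroker:
    """Hypothetical broker that deduplicates provision requests by guid."""

    instances: Dict[str, dict] = field(default_factory=dict)
    deployments_started: int = 0

    def provision(self, instance_guid: str, request: dict) -> Tuple[int, dict]:
        """Handle PUT /v2/service_instances/:instance_guid idempotently."""
        existing = self.instances.get(instance_guid)
        if existing is not None:
            if existing["request"] != request:
                # Same guid but different attributes: report a conflict.
                return 409, {}
            # Same guid and same attributes: report the existing instance
            # instead of starting another deployment.
            # Assumption: 202 while the first provision is still running.
            status = 202 if existing["state"] == "in progress" else 200
            return status, {}

        # First request for this guid: record it and start one deployment.
        self.instances[instance_guid] = {
            "request": request,
            "state": "in progress",
        }
        self.deployments_started += 1
        return 202, {}

    def mark_succeeded(self, instance_guid: str) -> None:
        """Hypothetical hook called when the deployment finishes."""
        self.instances[instance_guid]["state"] = "succeeded"
```

With a check like this, a second PUT for the same guid after a timeout would be answered from the stored record, and deployments_started would stay at 1.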

kamath-prasad commented 7 years ago

@Gerg: From our initial analysis, we suspect that the call from CC to the broker times out, which is why a blank HTTP code is sent back to CC.

Also, in the 2nd example I mentioned above, we see that there are 2 PUT calls from CC, to which the broker returned 202 after the first timeout.

However, does a timeout result in CC retrying the creation of the same instance?

mattmcneeney commented 6 years ago

@kamath-prasad I can see that you seem to have updated your broker to not time out since raising this issue. Is this change still of interest to you?

kamath-prasad commented 6 years ago

Yes, we can close this now.