kubernetes-retired / service-catalog

Consume services in Kubernetes using the Open Service Broker API
https://svc-cat.io
Apache License 2.0
1.05k stars 385 forks source link

Ensure no other operations can start during async processes. #781

Closed vaikas closed 7 years ago

vaikas commented 7 years ago

According to the OSB spec, when there is an outstanding asynchronous request for a given Service Instance, no other operations (though looks like Provision might, wording is a bit tricky) can start against that service instance. We need to make sure this is the case and implement checks if some are missing.

pmorie commented 7 years ago

no other operations (though looks like Provision might, wording is a bit tricky) can start.

I understood this as being scoped to a particular instance. Do you understand differently? AFAIK async operation only prevents you from being able to do anything else that involves a given instance.

vaikas commented 7 years ago

No, that's what I was trying to say with the for a given service instance. I'll try to clarify.

pmorie commented 7 years ago

Ok, cool. Just wanted to make sure I hadn't read it wrong :)

On Mon, May 1, 2017 at 2:12 PM, Ville Aikas notifications@github.com wrote:

No, that's what I was trying to say with the for a given service instance. I'll try to clarify.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/kubernetes-incubator/service-catalog/issues/781#issuecomment-298391240, or mute the thread https://github.com/notifications/unsubscribe-auth/AAWXmEUT2gg_bieg7GOw01TslscPPJKEks5r1iB6gaJpZM4NNPGR .

vaikas commented 7 years ago

Nope, lock is per service instance. :+1:

nilebox commented 7 years ago

Might be a bit offtopic, but I think it's still relevant to the issue.

I can understand the requirement for per-instance lock to make the result more predictable (to prevent concurrent conflicting changes which could lead to different result due to race condition, for example).

On the other hand, I don't get why OSB spec has even stronger requirements to the marketplace (i.e. Service Catalog in our case) instead of requiring Service Broker to provide idempotency guarantees.

For example (from the Orphans section):

To mitigate orphan instances and bindings, the marketplace SHOULD attempt to delete resources it cannot be sure were successfully created, and SHOULD keep trying to delete them until the broker responds with a success.

Why marketplace doesn't retry provisioning a resource (or creating a binding) instead of deleting a potential orphan? Marketplace does provide instanceId to Service Broker, so it's easy to implement idempotent request handling in Service Broker without any changes in API.

I think this principle of deleting orphans is against the Kubernetes pattern of reconciliation loop, i.e. continuously retrying to make the actual state conform to the declared desired state.

Also, what is the proper implementation of this requirement in Service Catalog? Let's say that provisioning request to Service Broker fails (times out). According to OSB spec, we have to delete the potential orphan. But what's next? Do we update the status of Service Catalog Instance object to "failed" and let the user deal with it, or do we continuously retry to CREATE -> DELETE -> CREATE -> ... until we succeed?

duglin commented 7 years ago

@nilebox can you open an issue in the OSB API spec ( https://github.com/openservicebrokerapi/servicebroker ) for this. I think you're correct and that orphan section should be reworked. IMO, a timeout should not be interpreted the same as a 5xx error. Replaying a createInstance() request upon a timeout should be able to return a 2xx just fine and things should keep working.

arschles commented 7 years ago

Pushing this to 0.1.0, since it is not addressed and has an external dependency

pmorie commented 7 years ago

I believe this is at least partially addressed. For bindings, we check that the instance is ready before doing anything that contacts the broker.

duglin commented 7 years ago

@vaikas-google what's the status of this? I seem to remember discussing async ops in OSB and how the spec might be totally correct today and we might change it, but I can't recall if we ever resolved it. Is there a design point we need to discuss in svc-cat for this?

pmorie commented 7 years ago

I am pretty certain at this point that this is already implemented - do you agree @vaikas-google ?

vaikas commented 7 years ago

Yes, this should be closed until the spec changes or async bindings need to get implemented. Reopen if you disagree :)