Closed poblin-orange closed 4 years ago
Symptom when concurrent provisionning request is sent by K8S
2020-02-20T17:28:00.44+0100 [APP/PROC/WEB/0] OUT org.cloudfoundry.client.v2.ClientV2Exception: CF-ServiceInstanceNameTaken(60002): The service instance name is taken: 95a4a8e5-52df-4810-82f0-d04366efbd5c
2020-02-20T17:28:00.44+0100 [APP/PROC/WEB/0] OUT at org.cloudfoundry.reactor.util.ErrorPayloadMapper.lambda$null$0(ErrorPayloadMapper.java:46) ~[cloudfoundry-client-reactor-3.16.0.RELEASE.jar!/:na]
2020-02-20T17:28:00.44+0100 [APP/PROC/WEB/0] OUT at org.cloudfoundry.reactor.util.ErrorPayloadMapper.lambda$null$9(ErrorPayloadMapper.java:98) ~[cloudfoundry-client-reactor-3.16.0.RELEASE.jar!/:na]
2020-02-20T17:28:00.44+0100 [APP/PROC/WEB/0] OUT at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:118) ~[reactor-core-3.2.12.RELEASE.jar!/:3.2.12.RELEASE]
2020-02-20T17:28:00.44+0100 [APP/PROC/WEB/0] OUT at reactor.core.publisher.FluxOnAssembly$OnAssemblySubscriber.onNext(FluxOnAssembly.java:345) ~[reactor-core-3.2.12.RELEASE.jar!/:3.2.12.RELEASE]
2020-02-20T17:28:00.44+0100 [APP/PROC/WEB/0] OUT at reactor.core.publisher.FluxSwitchIfEmpty$SwitchIfEmptySubscriber.onNext(FluxSwitchIfEmpty.java:67) ~[reactor-core-3.2.12.RELEASE.jar!/:3.2.12.RELEA
Desired approach:
This is costly to do because the StateRepository is not currently accessible to CloudFoundryDeployer but only the WorkflowServiceInstanceService
Alternative more affordable degraded approach
=> we sacrifice the cases where
Pb with initial implementation in https://github.com/orange-cloudfoundry/osb-cmdb-spike/pull/27/commits/ec163e71aa20cc0eaaad570c87cbb1392d93b909: race condition between servlet threads (reading the state repository) and reactor threads (writing to state repository async)
Alternatives:
Option 1) was successfully tested on openshift
Same problem arises with service bindings:
org.cloudfoundry.client.v2.ClientV2Exception: CF-ServiceKeyNameTaken(360001): The service key name is taken: a6c72b10-9749-4395-b257-bfab24c49d53
Log pollution of level ERROR:
19:02:33.230: [APP/PROC/WEB.0] 2020-02-26 18:02:33.229 ERROR 27 --- [ry-client-nio-3] o.s.c.appbroker.deployer.DeployerClient : Error creating backing service key a6c72b10-9749-4395-b257-bfab24c49d53 for service 6a9aac92-a6e4-4141-a1c9-15dc0cf577bb with error 'CF-ServiceKeyNameTaken(360001): The service key name is taken: a6c72b10-9749-4395-b257-bfab24c49d53'
[...]
19:02:33.238: [APP/PROC/WEB.0] 2020-02-26 18:02:33.235 ERROR 27 --- [ry-client-nio-3] d.DefaultBackingServicesProvisionService : Error creating backing services keys [BackingServiceKey{serviceInstanceName='6a9aac92-a6e4-4141-a1c9-15dc0cf577bb', name='a6c72b10-9749-4395-b257-bfab24c49d53', plan='null', parameters={}, properties={keepTargetSpaceOnDelete=true, target=p-mysql}, parametersTransformers=[], rebindOnUpdate=false}] with error 'CF-ServiceKeyNameTaken(360001): The service key name is taken: a6c72b10-9749-4395-b257-bfab24c49d53'
19:02:33.238: [APP/PROC/WEB.0] 2020-02-26 18:02:33.237 ERROR 27 --- [ry-client-nio-3] ppDeploymentCreateServiceBindingWorkflow : Error creating backing services keysfor p-mysql/10mb with error 'CF-ServiceKeyNameTaken(360001): The service key name is taken: a6c72b10-9749-4395-b257-bfab24c49d53'
19:02:33.245: [APP/PROC/WEB.0] 2020-02-26 18:02:33.244 ERROR 27 --- [ry-client-nio-3] s.c.ServiceBrokerWebFluxExceptionHandler : Unknown exception handled:
The service binding specs mentions at https://github.com/openservicebrokerapi/servicebroker/blob/master/spec.md#binding in the response-6 section:
Status Code Description 200 OK SHOULD be returned if the Service Binding already exists and the requested parameters are identical to the existing Service Binding. The expected response body is below. 201 Created MUST be returned if the Service Binding was created as a result of this request. The expected response body is below. 202 Accepted MUST be returned if the binding is in progress. The operation string MUST match that returned for the original request. This triggers the Platform to poll the Polling Last Operation for Service Bindings endpoint for operation status. Information regarding the Service Binding (i.e. credentials) MUST NOT be returned in this response. Note that a re-sent PUT request MUST return a 202 Accepted, not a 200 OK, if the Service Binding is not yet fully created.
However, the osb-cmdb service binding are not yet asynchronous (nor is there a scab support for async service bindings), hence returning a 202 for dup binding requests isn't an option.
Possible alternatives:
org.cloudfoundry.client.v2.ClientV2Exception: CF-ServiceKeyNameTaken(360001)
Opted to 2) Keep returning a 500 error and ignore log pollution
See related openshift 3.9 documentation at https://docs.openshift.com/container-platform/3.9/architecture/service_catalog/index.html
Pointers and procedure to refresh service catalog in Openshift:
svcat completion --help
Procedure to register a service broker with basic auth (see inspiration)
# enable oc completion
$ source <(oc completion bash)
$ kubectl create secret generic osb-cmdb-0-auth \
--from-literal username=redacted-user \
--from-literal password=redacted-password
# Alternatively in a spec file as documented into https://kubernetes.io/docs/concepts/configuration/secret/#creating-a-secret-manually
$ cat secret.yaml
apiVersion: v1
kind: Secret
metadata:
name: osb-cmdb-0-auth
type: Opaque
data:
username: *****=
password: ********
$ kubectl apply -f ./secret.yaml
secret "osb-cmdb-0-auth" created
# Seems to fail to make the broker appear in openshift UI
$ svcat register osb-cmdb-0-broker \
--url https://ocb-cmdb-0.redacted-domain \
--scope cluster \
--basic-secret osb-cmdb-0-auth
--skip-tls
# Alternatively
$ cat broker.yml
apiVersion: servicecatalog.k8s.io/v1beta1
kind: ClusterServiceBroker
metadata:
name: osbcmdb-broker
spec:
url: https://osb-cmdb-broker-0.redacted-domain
insecureSkipTLSVerify: true
authInfo:
basic:
secretRef:
namespace: interco-kermit-fp
name: osb-cmdb-0-auth
$ oc create -f broker.yml
clusterservicebroker "osbcmdb-broker" created
Other diagnostic commands
$ svcat get brokers
$ svcat describe broker osb-cmdb-0-broker
[...]
Status: Ready - Successfully fetched catalog entries from broker @ 2020-04-09 15:35:47 +0000 UTC
$ svcat get classes
# Errors
$ svcat get class p-mysql-osb --scope all
Error: class 'p-mysql-osb' not found in cluster scope
# works, filter by date
$ oc get ClusterServiceClass
[...]
e1c17bcf-011d-4a34-9ee3-357a28cd7c77 15h
f8fdfaa9-fdc1-4b68-8791-e55a1419a360 15h
# corresponds to p-mysql-osb"
$ oc get ClusterServiceClass --output json f8fdfaa9-fdc1-4b68-8791-e55a1419a360
[...]
"externalName": "p-mysql-osb",
svcat get plans
svcat get plans -o json
# troubleshoot invalid auth in secrets
$ oc get Secret osbcmdb-broker-5-auth-secret --namespace=interco-kermit-fp-demo --output=yaml | grep password
password: redacted-base64-password
$ echo -n "redacted-base64-password" | base64 --decode | hexdump
Cleanup/recovery/purge commands: See related https://github.com/kubernetes-sigs/service-catalog/issues/2268
$ svcat get instances
NAME NAMESPACE CLASS PLAN STATUS
+---------------+--------------------------------+---------+------+-----------------------------------------+
p-mysql-ffkv6 cloudfoundry-service-instances p-mysql 10mb DeprovisionBlockedByExistingCredentials
p-mysql-wbv8r cloudfoundry-service-instances p-mysql 10mb DeprovisionBlockedByExistingCredentials
$ svcat deprovision p-mysql-ffkv6 --abandon
This action is not reversible and may cause you to be charged for the broker resources that are abandoned. If you have any bindings for this instance, please delete them manually with svcat unbind --abandon --name bindingName
Are you sure? [y|n]:
y
Error: deprovision request failed (serviceinstances.servicecatalog.k8s.io "p-mysql-ffkv6" not found)
$ svcat get instances
NAME NAMESPACE CLASS PLAN STATUS
+---------------+--------------------------------+---------+------+-----------------------------------------+
p-mysql-wbv8r cloudfoundry-service-instances p-mysql 10mb DeprovisionBlockedByExistingCredentials
Started an upstream PR in https://github.com/spring-cloud/spring-cloud-app-broker/pull/343 but there is remaining effort needed to diagnose test failure. Pausing for now as I'm out of budget for this upstream PR.
Expected behavior
Expecting Openshift service catalog to be able to consume services fronted by OSBCMDB.
Observed behavior
Openshift issues multiple PUT to create a service instance. Seems compliant with OSB spec.
.https://github.com/kubernetes-sigs/service-catalog/issues/1639
The broker rejects the duplicate PUT, giving provisionning error on OpenShift v3.9.51 ( Kubernetes v1.9.1+a0ce1bc657 )
cc @pdechamboux