Closed trevorlinton closed 5 years ago
Thanks for the writeup.
Another way to reduce the problem of the broker being 'over polled': the broker could return a timestamp in the last operation response, after which the platform should poll the broker again.
IBM to see if they have any need for this
Would the broker telling the platform when to poll next help?
Returning a timestamp for when to poll next does help with congestion, but it doesn't resolve the problem of the client having to hold state (and potentially orphaning resources if the client crashes).
As the broker must already store state (e.g., what is being provisioned), it's fairly trivial for the broker to also store a callback url (and secret) and call that url when finished. Adding this functionality would greatly ease the burden of developing clients, and I believe it would increase adoption, although I'm programming a client, so I'm a bit biased in my assessment.
At a very minimum adding a timestamp is a great addition regardless, but a reactive approach (one that allows the broker to communicate back with the client after extended periods of time) would be helpful.
I checked with our product guys and we definitely want the "when to poll next timestamp" feature. As for the call-back, they like that one as well.
The callback has come up a few times with the Automation Broker (aka Ansible Service Broker), but our broker doesn't know "where" the platform is to return the call, i.e. the url would have to be on a system that is accessible to the broker. If that's the case then I don't see why it couldn't work. I would definitely make it optional.
From a broker author's perspective, we would assume that the uri given is accessible; otherwise messages will effectively get dropped. If you are losing messages, fix the firewall to allow the broker to access the uri.
This isn't a feature we are desperate to have, but I see the value in it. +1 to webhook
As far as alleviating the polling, timestamp might be useful especially with longer provisions. On a different project we had these long provisions that we were polling so we started with a long poll interval to allow the provision to get as far as possible, then did shorter and shorter intervals as we knew it would be close. This helped with the bombardment of requests.
So +1 to the timestamp as well.
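The shrinking-interval polling described above could be sketched roughly as follows. This is only an illustration: the function names (`poll_delays`, `next_wait`), the starting interval, the decay factor and the floor are all hypothetical, and the `broker_hint` argument assumes the broker's "poll again after" value has already been converted to a number of seconds by the caller.

```python
from typing import Iterator, Optional


def poll_delays(first: float = 300.0, factor: float = 0.5,
                floor: float = 10.0) -> Iterator[float]:
    """Yield waits in seconds: one long initial wait, then shorter and
    shorter waits, never dropping below `floor`. All defaults are
    illustrative values, not taken from any spec."""
    delay = first
    while True:
        yield delay
        delay = max(floor, delay * factor)


def next_wait(schedule: Iterator[float],
              broker_hint: Optional[float] = None) -> float:
    """Return how long to wait before the next last_operation call.

    If the broker supplied a hint (assumed here to be seconds until the
    next poll), prefer it; otherwise fall back to the local schedule.
    """
    if broker_hint is not None:
        return max(0.0, broker_hint)
    return next(schedule)
```

With the defaults above the client would wait 300, 150, 75, 37.5, 18.75 and then 10 seconds between polls, unless the broker says otherwise.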
@tinygrasshopper is going to have a go at putting a PR together that solves this problem by allowing service brokers to return a timestamp indicating when a Platform should next request the status of an async operation.
We believe #621 should resolve this issue, so will track progress over there
Closing as we believe #621 will help here and the desire for having a webhook seems to be low. Please reopen if I'm wrong though!
Purpose
When making a provision or bind call to an OSB API that is asynchronous, clients must then call the last_operation endpoint in a polling fashion to know when the request has succeeded or failed. This creates a few unnecessary problems.
The first is that the client must retain state about what it was doing and what to do next. Should the client making the provision request not persist this state and then crash, restart or encounter a redeploy, the provisioned resource is orphaned. Considering that windows for asynchronous provisioning can sometimes be up to 30 minutes, this places a large burden on clients to retain considerable state.
Second, if the broker service receives a large number of client requests for provisions, it may be overwhelmed with polling operations to last_operation endpoints.
This could be alleviated (or at least given alternate workflows for clients) by supporting the concept of an optional url callback, where the client, during an async provision or async create binding request, provides a url where the results of this operation should be sent. When the operation completes (successfully or not), the url is called with the information that would normally have been returned had the caller made the provision or create binding call synchronously (NOT the last_operation endpoint).
Rationale
Not all systems or brokers are truly platform aware. Consider a broker designed to issue databases via a REST interface: it may not perform the binding operations itself (but internally keeps track of who is using the provisioned database). A client may be the calling platform that asks the broker to create a new resource, then captures the credentials during the binding phase and uses them to support applications on its platform. A broker may in fact support multiple platforms (think an Azure/AWS/GCloud type system that may have a generic OSB database provisioner).
Since brokers in these scenarios (during binding or provision requests) have no real knowledge of the application, they cannot reliably clean up an operation that may have been abandoned due to an intermediate failure by the client during the provision or binding window. They also rely on the client to continue to poll, which can lead to wasteful operations, or to significant delays in provisioning should a client's poll interval be misconfigured and considerably too long.
Design
On PUT /v2/service_instances/:instance_id or PUT /v2/service_instances/:instance_id/service_bindings/:binding_id, if accepts_incomplete=true is passed, an optional webhook and secret may be provided. The webhook parameter MUST be a valid URI-encoded http or https URI. The secret MUST be provided if the webhook parameter is provided and MUST be text of no more than 32 bytes, used by the client to validate requests coming from the broker to the client. The secret SHOULD be unique to each provision or create binding call.
Brokers would be required to perform an http or https POST with the results of the completed operation (error or not) to the URI in the webhook parameter. The http operation MUST have the content-type: application/json header.
To allow the client to validate that the request came from the broker (and to prevent replay attacks), the secret is NEVER passed (in any form) to the webhook url. An http header x-osb-signature MUST be provided in the webhook http or https POST, the value of which is the SHA-256 HMAC of the serialized payload. The raw binary HMAC is encoded in base64 and NOT hexadecimal representation, nor should it be a base64 encoding of a hexadecimal string.
The result of the webhook operation (http status code, headers, etc) is ignored by the broker, regardless of whether an error is returned by the webhook destination. The body of the webhook response is also ignored by the broker, and it is highly recommended that it not be processed or downloaded.
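As a rough illustration of the rules above, the parameter checks and the x-osb-signature computation might look like the sketch below. It uses only the Python standard library. The function names, the error messages, the choice of UTF-8 for turning the secret into key bytes, and the example payload are assumptions made for this sketch; the proposal itself does not specify them.

```python
import base64
import hashlib
import hmac
import json
from urllib.parse import urlparse

MAX_SECRET_BYTES = 32  # limit stated in the proposal


def validate_webhook_params(webhook: str, secret: str) -> None:
    """Broker-side check of the optional webhook/secret pair."""
    if urlparse(webhook).scheme not in ("http", "https"):
        raise ValueError("webhook must be an http or https URI")
    if not secret:
        raise ValueError("secret is required when webhook is provided")
    if len(secret.encode("utf-8")) > MAX_SECRET_BYTES:
        raise ValueError("secret must be no more than 32 bytes")


def sign_payload(secret: str, body: bytes) -> str:
    """Base64 of the raw binary SHA-256 HMAC (not of a hex string)."""
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


def verify_signature(secret: str, body: bytes, header_value: str) -> bool:
    """Client-side check of x-osb-signature against the raw request body."""
    expected = sign_payload(secret, body).encode("ascii")
    # Compare as bytes, in constant time, to avoid leaking timing information.
    return hmac.compare_digest(expected, header_value.encode("utf-8"))


if __name__ == "__main__":
    # Hypothetical payload; the real body would be whatever the synchronous
    # provision or create binding call would have returned.
    body = json.dumps({"dashboard_url": "https://example.com/db/1"}).encode("utf-8")
    signature = sign_payload("per-request-secret", body)
    print(verify_signature("per-request-secret", body, signature))
```

Note that the client has to verify against the exact bytes it received; re-serializing the parsed JSON before computing the HMAC can change whitespace or key order and make a valid signature fail.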
Considerations
secret and get rid of it for simplicity. This does create some complexity though for brokers as they may have to keep track of the authorization. A more complex "hash" of the authorization could be kept instead, but this then significantly complicates the verification by the client.