ietf-wg-gnap / gnap-core-protocol

141 stars 26 forks source link

Interaction Finish Methods - suggestion for a possible enhancement #489

Closed blue-ringed-octopus-guy closed 1 year ago

blue-ringed-octopus-guy commented 1 year ago

As defined in section 2.5.2. of the document, the client is only able to tell the AS that it supports one particular interaction finish method (even if the client supports multiple). So in terms of the interaction finish methods that are currently defined, the client must choose to either tell the AS that it supports either "redirect" or "push" (or omit sending this field if the client doesn't support any interactions).

Section 3.3.5. of the document explains that if the client has provided an interaction finish method then the AS can use this method, as along as: "the AS supports this mode for the client instance's request".

The issue I foresee here is that there might be scenarios where a client may support multiple interaction finish methods, but the client may end up sending a method to the AS that the AS doesn't support (or that the AS doesn't support for this particular request, for whatever reason). If this scenario occurs then the client must fallback to using polling to find out when the request has completed (unless the client has the ability to identify that this finish method isn't support for this request, and is able to send a PATCH request to change it). Polling might not be desirable for the client (or the AS) for a number of reasons, and it also results in the security issues highlighted in section 13.22. of the document.

If the client was able to pass multiple interaction finish methods to the AS, the AS could then select which specific method it wants to use, and could then tell the client which one it intends to use. This enhancement would increase interoperability between the client and the AS, and it would be less likely that a fallback to using polling would be required. At the moment there are only two specific interaction finish methods defined, but there could be more in the future - so this enhancement may be even more important in the future if there are more interaction finish methods to select between.

I do appreciate that making this change would introduce additional complexity to the protocol, but in this instance I think the extra complexity would be worth it as it would increase interoperability and it would increase security (by the reducing the need for polling to be required).

jricher commented 1 year ago

We discussed this as an option early on in the process, and the consensus at the time was that there could be multiple ways to start interaction but only one to finish. If there are multiple, is the client expected to be able to set up all of them? There is overhead and setup required on the client side for the finish methods -- listening to a URL for either a browser call or a direct push, saving a nonce value, etc. The security ramifications of the two methods defined in the core are very different, and who's to say what an extension would bring to the table. This gets really messy when the AS says "yes" to multiple finish methods -- which one should the client expect a response on? Is it now a race condition of whoever gets an answer first? What if the client gets an answer on more than one channel? This level of ambiguity and overhead was considered to lead to potential security issues with the client potentially setting up an endpoint that the AS will never use (because it's going to pick the other one), and thus opening itself to an injection attack of some kind via the open endpoint. In other words, the client's expecting something to happen, the AS says "no", but the attacker says "yes" and takes advantage of the opening to leverage an attack.

Additionally, if the client needs to pre-optimize the call to the AS it can do a discovery call first and get the interaction_finish_methods_supported list back. Otherwise, the client can do a PATCH on the grant or create a new grant to propose a different interaction package.

But even so, let's follow this suggestion: if we let the client send multiple methods, then the AS would need to pick exactly one (or zero) from the list and return it. The client would then need to only activate any "listening" on these endpoints once it gets back the positive response, which is the really case anyway with redirect/push because the AS could decide to return nothing in the finish section of the response. I do like how this fits the "offer a set of options and pick what works" transaction negotiation model that GNAP is engineered around, so that's a possibility. I'm not sure what this does for implementations though -- and what guidance, if any, does the AS need to pick the "right" response type?

This also brings up another old idea that was dropped -- sending multiple packages of interaction methods in a single request. This would be another way to solve the same problem, but with the larger effect of being able to send a bundle like "redirect in and redirect out OR user-code in and push out". This did not achieve support because of the much larger complexity: #59 (but could be an optional extension)

blue-ringed-octopus-guy commented 1 year ago

I've managed to somehow fat-finger the issue as "closed" when I was replying. No idea how I managed to do that! Please could you re-open it again.

blue-ringed-octopus-guy commented 1 year ago

Thanks for the response Justin.

As I mentioned in my original message, I do appreciate that this will introduce additional complexity to the protocol, so we need to consider whether the trade-offs are worth the potential benefits or not.

I want to clarify that I'm certainly not proposing that the AS can say "yes" to more than one Finish Method. This proposal would only be viable if the AS is forced to select one specific Finish Method, and it tells the client which one it has selected (or of course there could be scenarios where there is no Finish Method selected, or no Finish Method requested by the client).

Regarding the comment about the client being able to mitigate this issue by performing a Discovery call first to identify which Finish Methods the AS supports - I had considered this, however, I have the following concerns with us relying on this happening:

  1. An AS may choose not to fully support the Discovery functionality, and so might not enable the client to identify an accurate list of Finish Methods that it supports.

  2. It puts additional burden on the client, rather than just allowing the client to tell the AS which Finish Method(s) it can support.

  3. The AS might support a particular Finish Method in general, but the AS might decide that this Finish Method is not appropriate in the context of a specific request - a scenario that is hinted at within the document. And so there is no mechanism for the client to identify that this is the case by making a Discovery call.

jricher commented 1 year ago

Thank you for the suggestion for an enhancement. While this would possibly be a useful feature, the editors believe that it would fit better as an extension after more implementation experience. If you would like to submit text to that effect, the editors would be glad to help pull that together with you.

blue-ringed-octopus-guy commented 1 year ago

Ok, thanks Justin.

I'm happy for this to be considered a possible future extension if the editors think that's the best approach. Can you clarify what text is needed on this initially? Do you need a summary of intent for this particular extension, or do you need the actual draft specification wording?

Are there any other examples of extensions to GNAP that you can link me to, or is this work that can only commence once the core GNAP protocol has been finalised and published?

jricher commented 1 year ago

I don't know of any formal extensions to GNAP in the wild yet -- there are a few groups that I've personally spoken with who are looking to build out extensions, mostly in the interactions space, but I haven't seen any submitted documents from them yet.

In particular here though we'd be looking for actual specification text, preferably in the form of an I-D. This would be an extension to the interaction section, either defining a different "finish" method that takes in multiple methods as sub-items, or defining a means of sending an array as the value to the "finish" parameter on the request and what the AS is expected to do with this.

blue-ringed-octopus-guy commented 1 year ago

Ok, thanks for clarifying. I'll see if I can find time to make a start on this over the coming weeks.