Closed annatisch closed 6 years ago
Hi @annatisch
I'll look at this right after fixing the issue of links being inadvertently destroyed when one link is destroyed on the same session.
There were several prior discussions on the need to add redirect so we were fully aware this will pop up at some point. Now seems the time to do it :-).
Thanks! /Dan
That sounds great - thanks! Let me know if there's something I can do :)
@dcristoloveanu - just wondering what you think the timeline might look like for this. I'm hoping to put a new release of the Python bindings out beginning of next week (week of the 28th) and would like to hold off until this fix is included if possible. Or do you believe this will likely be quite involved?
Thanks!
@annatisch I'll have a look today at how to plug this in and will come back with the timeline once I understood the details.
Hi @annatisch
I had a quick look at what I think should be done:
Redirect at connection level: The information with the hostname/port has to be bubbled up to the user code that created the IO used by connection. Connection itself has no knowledge of how to setup the IO and what its parameters are. So the way to go is to indicate to the user the fact that a redirect is needed and a new connection to a different server/port should be made. To achieve this:
The Python SDK would need to shutdown everything upon this event and recreate the IO/connection/session/link/etc. with the new configuration. I assume that would be a matter of running the same code as today, but with a different host/server.
Adding the event mechanism should be doable in a couple of days, I will target early next week, but don't hold me accountable if it is not 100% complete by then :-).
Redirect at link level: Link level redirect seems to follow a similar pattern. It looks like link would require the same mechanism of notifying the user a redirect is needed for that link and then it is in uAMQP user's hands to figure out if a new stack of connection/session/link needs to be spawn for the new server (or maybe it is actually the same connection that is used, but a new link needs creating)
Which of the 2 do you need first? I assume connection, but would be good to know :-).
/Dan
@dcristoloveanu - awesome, thanks for the investigation! That event mechanism sounds mint. Definitely looking forward to that (both short and long term versions)
Regarding which of the two redirects is a priority is a bit of a gray area that I'm hoping @digimaun can clarify for me. My understanding is that IotHub actually raises a link:redirect, however the contents of the redirect is a new hostname. So by behavior it's actually a connection:redirect, but by implementation it's a link:redirect (as it's not actually received until after a successful connection has been opened).
Thanks for bringing me in @annatisch.
Yes for certain operations, IoTHub accepts the connection but raises an amqp: link: redirect with payload containing the proper eventhub namespace host (and an amqps endpoint) backing the IoTHub.
Providing a mechanism that allows the uamqp user to catch a redirect event with the critical details would be enough to fulfill my use case. I would then handle it in the application code.
Even better would be uamqp automatically handling the redirect (with some configuration argument enabling/disabling auto redirect). I would be happy either way!
@annatisch master now has a way to bubble up the information of a connection close received which contains the redirect information. I was oscillating between this generic connection_close_received handler and a dedicated connection_redirect event handler. I think it would be nice to have both probably and the user can choose which one to use ultimately. We might get there after I have the information also bubbled up for link.
I am now doing the same for link detach received so that the information reaches the uAMQP user and a redirect can then be coded.
Thanks, /Dan
@annatisch With the latest changes I believe the redirect information is bubbled up to the user for connection and link. Can you give this a try and let me know if it works for you?
Thanks, /Dan
Thanks so much @dcristoloveanu! I've started exposing it in the Python bindings and will test it out today :)
Thanks @dcristoloveanu - so I've been trying to test this but am running into some curious behaviour. I don't know whether it's expected, or whether I have done something incorrectly and I'm hoping @digimaun might be able to clarify.
Effectively, I send an ATTACH frame, then receive one back in response - however it does not have the source/target properties populated. The client detects this as an amqp-decode error and sends an END frame before I can receive the DETACH redirect from the service. Therefore the session has already started shutting down when the DETACH frame is received and the callback is never entered into.
My network trace looks like this:
'-> [ATTACH]* {receiver-link-39ecea96-0fa4-413f-bc26-e62ec3136faa,0,true,0,0,* {amqps://pyot.azure-devices.net/messages/events/},* {12313ac1-0854-4725-ac3d-0d5def1d6510},NULL,NULL,NULL,262144}'
'<- [ATTACH]* {receiver-link-39ecea96-0fa4-413f-bc26-e62ec3136faa,0,false,0,0,NULL,NULL,NULL,NULL,0,262144,NULL,NULL,NULL}'
'-> [END]* {* {amqp:decode-error,Cannot get link source from ATTACH frame}}'
'<- [DETACH]* {0,true,* {amqp:link:redirect,NULL,{[hostname:iothub-ns-pyot-500748-8fa45cb72e.servicebus.windows.net],[network-host:iothub-ns-pyot-500748-8fa45cb72e.servicebus.windows.net],[port:5671],[address:amqps://iothub-ns-pyot-500748-8fa45cb72e.servicebus.windows.net:5671/pyot/]}}}'
'-> [END]* {* {amqp:session:unattached-handle,}}'
'-> [CLOSE]* {}'
So @digimaun - do you know whether the response ATTACH frame is supposed to be missing some parameters and this therefore needs to be handled? Or do you think I've set up a malformed link which is failing for other reasons?
Thanks guys!!
@annatisch After some investigation I believe I found the problem, you need to append '$management' to the operation path so 'amqps://pyot.azure-devices.net/messages/events/$management'
Edit: Actually nevermind, it looks like appending $management prolongs frames being sent back and forth but the end state is the same where a DETACH containing the redirect data is incoming after END is sent by the client.
Here is an example of how the event hub node client does it. The pattern for to build the connection string is to hit the /messages/events/$management endpoint, expect an error, parse out the redirect info and return a new connection string.
Thanks @digimaun! Though unfortunately it didn't seem to help:
'-> [ATTACH]* {receiver-link-5cef528a-1645-45fa-b476-25e9937a8539,0,true,0,0,* {amqps://pyot.azure-devices.net/messages/events/$management},* {7f20b484-cb97-4380-b694-535c545ced90},NULL,NULL,NULL,262144}'
'<- [ATTACH]* {receiver-link-5cef528a-1645-45fa-b476-25e9937a8539,0,false,0,0,NULL,NULL,NULL,NULL,0,262144,NULL,NULL,NULL}'
'-> [END]* {* {amqp:decode-error,Cannot get link source from ATTACH frame}}'
'<- [DETACH]* {0,true,* {amqp:link:redirect,NULL,{[hostname:iothub-ns-pyot-500748-8fa45cb72e.servicebus.windows.net],[network-host:iothub-ns-pyot-500748-8fa45cb72e.servicebus.windows.net],[port:5671],[address:amqps://iothub-ns-pyot-500748-8fa45cb72e.servicebus.windows.net:5671/pyot/$management]}}}'
'-> [END]* {* {amqp:session:unattached-handle,}}'
'-> [CLOSE]* {}'
Is it possible that the Node.JS SDK doesn't enforce a source/target on the return ATTACH frame?
Good question. Looks like the Node JS SDK is using rhea for the AMQP implementation. I am not knowledgeable about it, we'll need to investigate how it works.
At least from an error perspective in the Event Hub SDK all the necessary details are in the error that is thrown so the handling seems pretty straightforward .
Transform the error: https://github.com/Azure/azure-event-hubs-node/blob/master/client/lib/errors.ts#L355
Parse redirect infos: https://github.com/Azure/azure-event-hubs-node/blob/master/client/lib/iothub/iothubClient.ts#L91
But yea maybe the configuration or interaction rules are different than uamqp.
This might be an issue in uamqp.
Basically the ATTACH with NULL endpoint indicates that the peer(server) does not want to create a terminus.
(1) The link endpoint is created, but no target is created. (2) At this point the link is established, but it is to a nonexistent target. Figure 2.33: Refusing a Link
Thus, I think there is a bug in uAMQP here where the session should not be ended. Let me have a look and fix this.
Thanks, /Dan
Thanks @dcristoloveanu
I played around with simply ignoring the missing source/target from the link ATTACH frame
in the if_attached_type_by_descriptor
section in session.c and it seems to run fine for both my existing tests and the IoThub scenario (though I haven't yet run the uAMQP-C tests):
if (attach_get_name(attach_handle, &name) != 0)
{
end_session_with_error(session_instance, "amqp:decode-error", "Cannot get link name from ATTACH frame");
}
else if (attach_get_role(attach_handle, &role) != 0)
{
end_session_with_error(session_instance, "amqp:decode-error", "Cannot get link role from ATTACH frame");
}
else
{
if (attach_get_source(attach_handle, &source) != 0)
{
source = NULL;
//end_session_with_error(session_instance, "amqp:decode-error", "Cannot get link source from ATTACH frame");
}
else if (attach_get_target(attach_handle, &target) != 0)
{
target = NULL;
//end_session_with_error(session_instance, "amqp:decode-error", "Cannot get link target from ATTACH frame");
}
LINK_ENDPOINT_INSTANCE* link_endpoint = find_link_endpoint_by_name(session_instance, name);
if (link_endpoint == NULL)
{
/* new link attach */
/* .... */
}
else
{
/* .... */
}
}
attach_destroy(attach_handle);
}
@annatisch Yep, that is the basic idea. I need to run through some more cases of what happens to the link and then I'll merge the change in.
Thanks for the patience on this! /Dan
@annatisch The fix should be in, I assume it will work for you as it is the same fix you posted in the thread :-)
Thanks! /Dan
Yes - it's working! Thank you so much :)
We have a need to support AMQP redirect. It seems a redirect response can occur either at the connection level or the link level, as documented here: http://docs.oasis-open.org/amqp-bindmap/amqp-wsb/v1.0/cs01/amqp-wsb-v1.0-cs01.html#_Toc461801619 http://docs.oasis-open.org/amqp-bindmap/amqp-wsb/v1.0/cs01/amqp-wsb-v1.0-cs01.html#_Toc461801620
I'm not sure what the best approach is here, either for this to be entirely handed at the connection/link layer within uamqp-c, or whether the redirect information should be surfaced so the consuming client can then handle as needed. The latter is probably a quicker option that provides a bit more flexibility.
Any thoughts on this?
I'm happy to look at implementing this, with some advice :)