Azure / azure-relay-aspnetserver

ASP.NET Core Hosting for Azure Relay

Listener does not reconnect after a network outage #12

Open mark-vw opened 5 years ago

mark-vw commented 5 years ago

After an on-prem network outage, the registration with the Azure Relay service is broken and the listener does not reconnect. The listener should be able to detect when the connection is broken so that it can attempt to reconnect and raise events related to the service interruption.

coolhome commented 5 years ago

I have experienced this a few times before when using the Azure Relay SDK directly (not the ASP.NET Core integration).

I have been having a hard time reproducing it or pinpointing where in the code this happens.

mark-vw commented 5 years ago

The ASP.NET Core package includes an internal AzureRelayListener class, which uses a HybridConnectionListener to connect. Since the HybridConnectionListener exposes connectivity events (Connecting, Online, and Offline), these events should be exposed so that action can be taken when a listener goes offline. Otherwise, consumers of the package will not know that their listener has been disconnected and are unable to respond.
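
To make the request concrete, here is a rough sketch of the kind of pass-through I have in mind. `ListenerStatusForwarder` is a hypothetical name and is not part of the Microsoft.Azure.Relay.AspNetCore package; the sketch only assumes the public `Connecting`, `Online`, `Offline`, and `LastError` members on `HybridConnectionListener`.

```csharp
using System;
using Microsoft.Azure.Relay;

// Hypothetical wrapper (not part of the package): re-raises the inner
// listener's connectivity events so that callers can react to them.
public sealed class ListenerStatusForwarder
{
    private readonly HybridConnectionListener inner;

    public event EventHandler Connecting;
    public event EventHandler Online;
    public event EventHandler Offline;

    public ListenerStatusForwarder(HybridConnectionListener inner)
    {
        this.inner = inner ?? throw new ArgumentNullException(nameof(inner));

        // Forward each event, reporting the wrapper as the sender.
        inner.Connecting += (sender, e) => Connecting?.Invoke(this, e);
        inner.Online += (sender, e) => Online?.Invoke(this, e);
        inner.Offline += (sender, e) => Offline?.Invoke(this, e);
    }

    // Most recent error reported by the inner listener, if any.
    public Exception LastError => inner.LastError;
}
```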

coolhome commented 5 years ago

In my case we are using the HybridConnectionListener directly and have subscribed to those events. During a network event our listener tried to reconnect several times but eventually stopped trying altogether until we forced a service restart.

We have started to log the LastError that is exposed. If it happens again, hopefully I will have enough information to report back.

Edit: I have not yet tried upgrading to 2.0.1, which was released recently.

dlstucki commented 5 years ago

There have been some issues similar to this which were fixed in 2.0.0+. I'd absolutely recommend getting onto 2.0.1 before spending any more time on this issue to get all known bug fixes and the best tracing.

If you're dealing with HybridConnectionListener directly, make sure you subscribe to the Connecting, Online, and Offline events. These are raised when the connection status changes, and that's when LastError actually means something.

        /// <summary>
        /// Raised when the Listener is attempting to reconnect with ServiceBus after a connection loss. 
        /// Check LastError for more details.
        /// </summary>
        public event EventHandler Connecting;

        /// <summary>
        /// Raised when the Listener has successfully connected or reconnected with ServiceBus.
        /// LastError will be null at this point.
        /// </summary>
        public event EventHandler Online;

        /// <summary>
        /// Raised when the Listener will no longer be attempting to reconnect with ServiceBus.
        /// Reasons include user-initiated listener close or the HybridConnection management object
        /// was deleted (e.g. via portal or ARM).
        /// Check LastError for more details when this event is raised unexpectedly.
        /// </summary>
        public event EventHandler Offline;
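
A minimal sketch of subscribing to these events and logging LastError could look like the following. The `ListenerStatusLogging` class, the `OpenWithLoggingAsync` method, and the `connectionString` parameter are placeholder names for illustration, and `Console.WriteLine` stands in for whatever logging you use. It assumes the connection string includes the EntityPath of the hybrid connection. The handlers are attached before `OpenAsync` so that the first status change is not missed.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Azure.Relay;

public static class ListenerStatusLogging
{
    // connectionString is a placeholder supplied by the caller.
    public static async Task<HybridConnectionListener> OpenWithLoggingAsync(string connectionString)
    {
        var listener = new HybridConnectionListener(connectionString);

        // Raised while the listener is trying to reconnect after a connection loss.
        listener.Connecting += (sender, e) =>
            Console.WriteLine($"Connecting. LastError: {listener.LastError}");

        // Raised once the listener has connected or reconnected.
        listener.Online += (sender, e) =>
            Console.WriteLine("Online.");

        // Raised when the listener will make no further reconnect attempts.
        listener.Offline += (sender, e) =>
            Console.WriteLine($"Offline. LastError: {listener.LastError}");

        await listener.OpenAsync();
        return listener;
    }
}
```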

mark-vw commented 5 years ago

The Microsoft.Azure.Relay.AspNetCore package is still in preview (1.1.0-preview-20180522-1) and uses Microsoft.Azure.Relay (1.1.0-preview). Are there plans to update the Microsoft.Azure.Relay.AspNetCore package to use Microsoft.Azure.Relay (2.0.1) and bring it out of preview?

dlstucki commented 4 years ago

> Microsoft.Azure.Relay.AspNetCore package is still in preview (1.1.0-preview-20180522-1) and uses Microsoft.Azure.Relay (1.1.0-preview). Are there plans to update the Microsoft.Azure.Relay.AspNetCore package to use Microsoft.Azure.Relay (2.0.1) and bring it out of preview?

This is now done. https://www.nuget.org/packages/Microsoft.Azure.Relay.AspNetCore/1.2.10592

Can this issue be closed? If not, please describe where things are at.

adrosa commented 3 years ago

Is this issue still active?