Azure / azure-signalr

Azure SignalR Service SDK for .NET
https://aka.ms/signalr-service
MIT License
425 stars 100 forks source link

Azure SingalR Service connection ended without reason on Blazor Server App #1463

Open monderino opened 3 years ago

monderino commented 3 years ago

We have a Blazor Server App which is connected to the Azure SignalR Service. More or less randomly we get a disconnect and our clients need to reload to establish the connection again. I added logs to the client and server side to find the reason, but the logs of the SignalR Service tells me only 'Connection ended. Reason: '. Have anyone an idea when this can happen?

All my configurations should avoid that the server close the connection. It should only happen when the client is really long offline. Server Config:

services
     .AddSignalR(e =>
     {
         e.MaximumReceiveMessageSize = 10240000;                  
         e.HandshakeTimeout = TimeSpan.FromSeconds(30);
         e.ClientTimeoutInterval = TimeSpan.FromMinutes(10);
     })
     .AddAzureSignalR(o =>
     {
         o.ConnectionString = $"Endpoint=https://{signalRHostName};AccessKey={signalRAccessKey};Version=1.0;";
         o.ServerStickyMode = ServerStickyMode.Required;
     });

services
    .AddServerSideBlazor(options =>
    {
        options.DetailedErrors = true;
        options.DisconnectedCircuitMaxRetained = 500;
        options.DisconnectedCircuitRetentionPeriod = TimeSpan.FromHours(1);
    })
    .AddHubOptions(options =>
    {
        options.EnableDetailedErrors = true;
        options.MaximumReceiveMessageSize = 10240000;
        options.HandshakeTimeout = TimeSpan.FromSeconds(30);
        options.ClientTimeoutInterval = TimeSpan.FromMinutes(10);
    })
    .AddCircuitOptions(options =>
    {
        options.DetailedErrors = true;
        options.DisconnectedCircuitMaxRetained = 500;
        options.DisconnectedCircuitRetentionPeriod = TimeSpan.FromHours(1);
    });

I use dotnet core 5.0 and Microsoft.AspNetCore.SignalR.Client Version=5.0.10 Microsoft.Azure.SignalR Version=1.11.0

The app is running in a kubernetes cluster on the mcr.microsoft.com/dotnet/sdk:5.0 image.

When a disconnect happen, the client show the following logs:

blazor.server.js:21 [2021-09-28T09:45:56.471Z] Information: Starting up blazor server-side application.
blazor.server.js:1 [2021-09-28T09:45:56.474Z] Information: Normalizing '_blazor' to 'https://my_url/_blazor'.
blazor.server.js:1 [2021-09-28T09:45:56.475Z] Debug: Starting HubConnection.
blazor.server.js:1 [2021-09-28T09:45:56.476Z] Debug: Starting connection with transfer format 'Binary'.
blazor.server.js:1 [2021-09-28T09:45:56.477Z] Debug: Sending negotiation request: https://my_url/_blazor/negotiate?negotiateVersion=1.
blazor.server.js:1 [2021-09-28T09:45:56.543Z] Debug: Sending negotiation request: https://signalr-production.service.signalr.net/client/negotiate?hub=componenthub&asrs.op=%2F_blazor&negotiateVersion=1&asrs_request_id=DAHRmI%2B1AAA%3D&asrs_lang=de.
blazor.server.js:1 [2021-09-28T09:45:56.770Z] Debug: Selecting transport 'WebSockets'.
blazor.server.js:1 [2021-09-28T09:45:56.889Z] Information: WebSocket connected to wss://signalr-production.service.signalr.net/client/?hub=componenthub&asrs.op=%2F_blazor&negotiateVersion=1&asrs_request_id=DAHRmI%2B1AAA%3D&asrs_lang=de&id=Q86RxsprcCJfeqXYmHYn-w16caeea61&access_token=eyJhbGciOiJIUzI1NiIsImtpZCI6IjY3MzUzOTUyNCIsInR5cCI6IkpXVCJ9.eyJhc3JzLnMuc24iOiJocS1ibGF6b3ItcHJvZHVjdGlvbi02NTY0NTg5NDk4LWh3Mjc4XzcxMzU1ODYyMWQ5ZjRiNzNiNjgxYzExZGM4Y2RiZDdlIiwiYXNycy5zLnNzdGlja3kiOiJSZXF1aXJlZCIsImFzcnMucy5jaHQiOiIzMCIsIm5iZiI6MTYzMjgyMjM1NiwiZXhwIjoxNjMyODI1OTU2LCJpYXQiOjE2MzI4MjIzNTYsImF1ZCI6Imh0dHBzOi8vaHEtc2lnbmFsci1wcm9kdWN0aW9uLnNlcnZpY2Uuc2lnbmFsci5uZXQvY2xpZW50Lz9odWI9Y29tcG9uZW50aHViIn0.HyJ6gtHT-KpGmSF5RWYzYg-mjHWBxbMG3XWwk-KJG1g.
blazor.server.js:1 [2021-09-28T09:45:56.889Z] Debug: The HttpConnection connected successfully.
blazor.server.js:1 [2021-09-28T09:45:56.890Z] Debug: Sending handshake request.
blazor.server.js:1 [2021-09-28T09:45:56.890Z] Information: Using HubProtocol 'blazorpack'.
blazor.server.js:1 [2021-09-28T09:45:56.947Z] Debug: Server handshake complete.
blazor.server.js:1 [2021-09-28T09:45:56.947Z] Debug: HubConnection connected successfully.
blazor.server.js:21 [2021-09-28T09:45:56.997Z] Debug: Received render batch with id 2 and 3470 bytes.
blazor.server.js:21 [2021-09-28T09:45:56.998Z] Debug: Applying batch 2.
blazor.server.js:21 [2021-09-28T09:45:57.002Z] Information: Blazor server-side application started.
blazor.server.js:21 [2021-09-28T09:45:57.125Z] Debug: Received render batch with id 3 and 84 bytes.
blazor.server.js:21 [2021-09-28T09:45:57.125Z] Debug: Applying batch 3.
...
blazor.server.js:21 [2021-09-28T09:45:59.204Z] Debug: Received render batch with id 35 and 72 bytes.
blazor.server.js:21 [2021-09-28T09:45:59.204Z] Debug: Applying batch 35.
blazor.server.js:1 [2021-09-28T09:52:27.231Z] Debug: HttpConnection.stopConnection(Error: WebSocket closed with status code: 1006 ().) called while in state Connected.
blazor.server.js:1 [2021-09-28T09:52:27.231Z] Error: Connection disconnected with error 'Error: WebSocket closed with status code: 1006 ().'.
blazor.server.js:1 [2021-09-28T09:52:27.231Z] Debug: HubConnection.connectionClosed(Error: WebSocket closed with status code: 1006 ().) called while in state Connected.
blazor.server.js:1 [2021-09-28T09:52:28.530Z] Information: Normalizing '_blazor' to 'https://my_url/_blazor'.
blazor.server.js:1 [2021-09-28T09:52:28.531Z] Debug: Starting HubConnection.
blazor.server.js:1 [2021-09-28T09:52:28.532Z] Debug: Starting connection with transfer format 'Binary'.
blazor.server.js:1 [2021-09-28T09:52:28.533Z] Debug: Sending negotiation request: https://my_url/_blazor/negotiate?negotiateVersion=1.
blazor.server.js:1 [2021-09-28T09:52:28.665Z] Debug: Sending negotiation request: https://signalr-production.service.signalr.net/client/negotiate?hub=componenthub&asrs.op=%2F_blazor&negotiateVersion=1&asrs_request_id=eF3f5uq1AAA%3D&asrs_lang=de.
blazor.server.js:1 [2021-09-28T09:52:28.859Z] Debug: Selecting transport 'WebSockets'.
blazor.server.js:1 [2021-09-28T09:52:28.943Z] Information: WebSocket connected to wss://signalr-production.service.signalr.net/client/?hub=componenthub&asrs.op=%2F_blazor&negotiateVersion=1&asrs_request_id=eF3f5uq1AAA%3D&asrs_lang=de&id=DMjWjXZKU6NyGnM8O2EiyQdf92f9231&access_token=eyJhbGciOiJIUzI1NiIsImtpZCI6IjY3MzUzOTUyNCIsInR5cCI6IkpXVCJ9.eyJhc3JzLnMuc24iOiJocS1ibGF6b3ItcHJvZHVjdGlvbi02NTY0NTg5NDk4LWh3Mjc4XzcxMzU1ODYyMWQ5ZjRiNzNiNjgxYzExZGM4Y2RiZDdlIiwiYXNycy5zLnNzdGlja3kiOiJSZXF1aXJlZCIsImFzcnMucy5jaHQiOiIzMCIsIm5iZiI6MTYzMjgyMjc0OCwiZXhwIjoxNjMyODI2MzQ4LCJpYXQiOjE2MzI4MjI3NDgsImF1ZCI6Imh0dHBzOi8vaHEtc2lnbmFsci1wcm9kdWN0aW9uLnNlcnZpY2Uuc2lnbmFsci5uZXQvY2xpZW50Lz9odWI9Y29tcG9uZW50aHViIn0.tkoeJaQVyFgMrmfJiuIPQN282738RPeM7jM5gmiVuGM.
blazor.server.js:1 [2021-09-28T09:52:28.943Z] Debug: The HttpConnection connected successfully.
blazor.server.js:1 [2021-09-28T09:52:28.943Z] Debug: Sending handshake request.
blazor.server.js:1 [2021-09-28T09:52:28.944Z] Information: Using HubProtocol 'blazorpack'.
blazor.server.js:1 [2021-09-28T09:52:28.973Z] Debug: Server handshake complete.
blazor.server.js:1 [2021-09-28T09:52:28.974Z] Debug: HubConnection connected successfully.
blazor.server.js:1 [2021-09-28T09:55:48.794Z] Debug: HttpConnection.stopConnection(undefined) called while in state Disconnecting.
blazor.server.js:1 [2021-09-28T09:55:48.795Z] Error: Connection disconnected with error 'Error: Server timeout elapsed without receiving a message from the server.'.
blazor.server.js:1 [2021-09-28T09:55:48.795Z] Debug: HubConnection.connectionClosed(Error: Server timeout elapsed without receiving a message from the server.) called while in state Connected.
blazor.server.js:1 [2021-09-28T09:55:50.927Z] Information: Normalizing '_blazor' to 'https://my_url/_blazor'.
blazor.server.js:1 [2021-09-28T09:55:50.928Z] Debug: Starting HubConnection.
blazor.server.js:1 [2021-09-28T09:55:50.928Z] Debug: Starting connection with transfer format 'Binary'.
blazor.server.js:1 [2021-09-28T09:55:50.928Z] Debug: Sending negotiation request: https://my_url/_blazor/negotiate?negotiateVersion=1.
blazor.server.js:1 [2021-09-28T09:55:51.010Z] Debug: Sending negotiation request: https://signalr-production.service.signalr.net/client/negotiate?hub=componenthub&asrs.op=%2F_blazor&negotiateVersion=1&asrs_request_id=bVppNieNAQA%3D&asrs_lang=de.
blazor.server.js:1 [2021-09-28T09:55:51.178Z] Debug: Selecting transport 'WebSockets'.
blazor.server.js:1 [2021-09-28T09:55:51.268Z] Information: WebSocket connected to wss://signalr-production.service.signalr.net/client/?hub=componenthub&asrs.op=%2F_blazor&negotiateVersion=1&asrs_request_id=bVppNieNAQA%3D&asrs_lang=de&id=2mCRE3qHKZ-FnDMck9-LSAba427abb1&access_token=eyJhbGciOiJIUzI1NiIsImtpZCI6Ii02OTYwODgxMzciLCJ0eXAiOiJKV1QifQ.eyJhc3JzLnMuc24iOiJocS1ibGF6b3ItcHJvZHVjdGlvbi02NTY0NTg5NDk4LThjZHM3XzQyODJhNDNjNWQ2YTQyYTQ5ZGFkMzE3ZTFjNzhkMGY0IiwiYXNycy5zLnNzdGlja3kiOiJSZXF1aXJlZCIsImFzcnMucy5jaHQiOiIzMCIsIm5iZiI6MTYzMjgyMjk1MSwiZXhwIjoxNjMyODI2NTUxLCJpYXQiOjE2MzI4MjI5NTEsImF1ZCI6Imh0dHBzOi8vaHEtc2lnbmFsci1wcm9kdWN0aW9uLnNlcnZpY2Uuc2lnbmFsci5uZXQvY2xpZW50Lz9odWI9Y29tcG9uZW50aHViIn0.uFLZtwH-2w_O5iv4RpSHjGcdOVV9TjbdP3b7UJ4lEZQ.
blazor.server.js:1 [2021-09-28T09:55:51.268Z] Debug: The HttpConnection connected successfully.
blazor.server.js:1 [2021-09-28T09:55:51.269Z] Debug: Sending handshake request.
blazor.server.js:1 [2021-09-28T09:55:51.269Z] Information: Using HubProtocol 'blazorpack'.
blazor.server.js:1 [2021-09-28T09:55:51.297Z] Debug: Server handshake complete.
blazor.server.js:1 [2021-09-28T09:55:51.298Z] Debug: HubConnection connected successfully.
blazor.server.js:21 [2021-09-28T09:55:51.325Z] Information: Reconnection attempt to the circuit was rejected by the server. This may indicate that the associated state is no longer available on the server.

And the SignalR tells me:

{
    "properties": {
        "message": "Connection ended. Reason: ",
        "collection": "Connection",
        "type": "ConnectivityLogs",
        "connectionId": "N-JNGUfa5E-6DH25zn41Iwdf92f9231",
        "userId": null,
        "transportType": "WebSockets",
        "connectionType": "Client"
    },
    "operationName": "ConnectionEnded",
    "category": "AllLogs",
    "level": "Informational",
    "callerIpAddress": "91.11.173.57",
    "resourceId": "/SUBSCRIPTIONS/....",
    "time": "2021-09-28T09:36:49Z",
    "location": "germanywestcentral"
}

Has anyone an hint for me, how I get detailed logs about the connection end or if maybe something wrong in my configuration. My goal is the reduce the disconnection as much as possible.

Thanks for your help.

JialinXin commented 2 years ago

@monderino Quickly checked our service log with some information from your traces and it looks the client are closed by the app server during incident time. Suggest you can open server logs to have a check if why server did that.

Guides about enable server logs. https://docs.microsoft.com/en-us/azure/azure-signalr/signalr-howto-troubleshoot-method#enable-server-side-logging-for-aspnet-core-signalr

image

monderino commented 2 years ago

Hi @JialinXin,

thank you for the tipp. I actually don't see any significant log.

The Logs in the signalR service tells me: "Connection ended. Reason: Application server closed the connection." or "Connection ended. Reason: "

and the logs in my application only say: Request starting HTTP/1.1 GET http://xxx.hellohq.io/images/error/connection_lost.svg - - The file "/images/error/connection_lost.svg" was not modified Request finished HTTP/1.1 GET http://xxx.hellohq.io/images/error/connection_lost.svg - - - 304 - image/svg+xml 0.2199ms Request starting HTTP/1.1 POST http://xxx.hellohq.io/_blazor/negotiate?negotiateVersion=1 text/plain;charset=UTF-8 0

no further error or anything.

JialinXin commented 2 years ago

@monderino From your original logs, there're 2 errors worth taking a look. It seems your client code has something wrong and you can check the websocket.onerror(evt) logs. Check this with some further guides.

blazor.server.js:1 [2021-09-28T09:52:27.231Z] Error: Connection disconnected with error 'Error: WebSocket closed with status code: 1006 ().'.

And this one seems that the process is stuck in some place and leads to timeout. You may either improve the TimeoutInterval a little bit or find out the stuck place.

blazor.server.js:1 [2021-09-28T09:55:48.795Z] Error: Connection disconnected with error 'Error: Server timeout elapsed without receiving a message from the server.'.

monderino commented 2 years ago

hi @JialinXin,

you mean I should increase the ClientTimeoutInterval?

I'm not sure how I can use the websocket.onerror(evt), because I use the blazor.server.js which does all the magic for me.

<script src="_framework/blazor.server.js" autostart="false"></script>
document.addEventListener("DOMContentLoaded", function() {
    Blazor.start({
            logLevel: 1, // LogLevel.Debug
            configureSignalR: builder => builder.configureLogging("debug"), // LogLevel.Debug
            reconnectionOptions: { // Retry 5 min
                maxRetries: 300,
                retryIntervalMilliseconds: 1000,
            }
        });
  });
JialinXin commented 2 years ago

Increase ClientTimeoutInterval may help mitigate second issue but I'm wondering it may be related to issue1 and something not working properly in your client code. You issue is likely to be similar with https://github.com/dotnet/aspnetcore/issues/21881

Can you repro the issue under local SignalR? Remove the dependency with Azure SignalR service to check if it repros. And if so, suggest to open an issue to server-side blazor team who are expects on this.

Or would you provide a simplified version of repro code to let me investigate this? I'm not able to provide much help under current information. Let me know how you'd like to proceed.

monderino commented 2 years ago

Thanks for the help. Currently I didn't find a way to reproduce it locally. I let you know when I find a way.