dotnet / aspnetcore

ASP.NET Core is a cross-platform .NET framework for building modern cloud-based web applications on Windows, Mac, or Linux.
https://asp.net
MIT License

SignalR KeepAlive Interval not being honored - Blazor Server #48675

Closed garrettlondon1 closed 1 year ago

garrettlondon1 commented 1 year ago

Is there an existing issue for this?

Describe the bug

https://learn.microsoft.com/en-us/aspnet/core/blazor/fundamentals/signalr?view=aspnetcore-7.0#configure-signalr-timeouts-and-keep-alive-on-the-client

Documentation says:

> The server timeout should be at least double the value assigned to the Keep-Alive interval. The Keep-Alive interval should be less than or equal to half the value assigned to the server timeout.

When I set the Keep-Alive interval to 5000 ms and the server timeout to 30000 ms, the Keep-Alive interval still defaults to 15 seconds.
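
As a rough illustration of the documented rule quoted above, the sketch below checks a pair of client settings before they are applied. `checkTimeouts` is a hypothetical helper written for this note, not part of SignalR or Blazor; it only encodes the "timeout is at least double the keep-alive" guidance.

```javascript
// Hypothetical helper (not a SignalR API): checks the documented relationship
// between the two client settings. Both arguments are in milliseconds.
function checkTimeouts(serverTimeoutMs, keepAliveIntervalMs) {
  if (!(serverTimeoutMs > 0) || !(keepAliveIntervalMs > 0)) {
    return { ok: false, reason: "both values must be positive numbers" };
  }
  if (serverTimeoutMs < 2 * keepAliveIntervalMs) {
    return {
      ok: false,
      reason: "server timeout is less than double the keep-alive interval",
    };
  }
  return { ok: true, reason: "" };
}

console.log(checkTimeouts(30000, 5000));  // the values used in this report
console.log(checkTimeouts(30000, 20000)); // violates the guidance
```

By this check, the 30000 ms / 5000 ms pair in this report satisfies the guidance, so the values themselves are not what the documentation warns against.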

Expected Behavior

The Keep-Alive interval is honored. Currently, if a user disconnects from SignalR or has a connection issue, pressing a button in a server-side Blazor application makes the UI hang for 15 seconds before the reconnection modal is displayed.

One solution would be to use the Keep-Alive interval to send more frequent pings, so that it is quicker to tell when a client is not receiving DOM updates.

Steps To Reproduce

<script src="_framework/blazor.{HOSTING MODEL}.js" autostart="false"></script>
<script>
  Blazor.start({
    configureSignalR: function (builder) {
      let c = builder.build();
      c.serverTimeoutInMilliseconds = 30000;
      c.keepAliveIntervalInMilliseconds = 5000;
      builder.build = () => {
        return c;
      };
    }
  });
</script>

Using the script above, if you inspect the WS connection in the JS console, you will see pings every 15 seconds, regardless of the Keep-Alive setting.

Exceptions (if any)

No response

.NET Version

7.0.32

Anything else?

No response

BrennanConroy commented 1 year ago

That code snippet part of the docs shouldn't be showing for <=7.0, it's new in 8.0. @guardrex

guardrex commented 1 year ago

That was supposed to be the workaround for 7.0 or earlier. The new approach replacing it for 8.0+ is ...

<script src="_framework/blazor.{HOSTING MODEL}.js" autostart="false"></script>
<script>
  Blazor.start({
    configureSignalR: function (builder) {
      builder.withServerTimeout(30000).withKeepAliveInterval(15000);
    }
  });
</script>

I don't think @surayya-MS necessarily has any info on it at the time of the 8.0 content review, given that the workaround was placed years ago before her arrival. I'll need to dig back and see who sent it over/approved it for publication.

If it doesn't work at all (and perhaps it never did), then we'll need to pull all of our guidance on controlling both of those for versions earlier than 8.0.

I'll get back to you here by EOD with what I find out.

garrettlondon1 commented 1 year ago

Thanks all,

I think from a business perspective...

Once in a while, maybe 0.1% of the time, a user will click a button, and nothing will happen for 5-10 seconds, then the UI will finally figure itself out. I would imagine that, if the websocket is disconnected, the reconnection modal appears, but it doesn't! That's why I tried decreasing the keep-alive interval.

I would rather the reconnection modal appear more frequently if the client loses connection to the server, instead of the UI not responding when a button is clicked and there is an expectation for something to happen.

Ultimately, people end up clicking again, or even worse, refreshing. I don't care if SignalR gets disconnected, and my clients do not mind the reconnection modal.

Is there a way to make the reconnection modal appear within like 1 second of the button getting clicked and not receiving a DOM update from the server?

guardrex commented 1 year ago

Originally, there was an incorrect bit of code shown for the workaround, which was ~my fault~ [CORRECTION: The incorrect code came from the PU.] However, we fixed that. Here's Javier saying that that code :point_up: should work ...

https://github.com/dotnet/aspnetcore/issues/18840#issuecomment-1288972469

... and then @surayya-MS is commenting further on it a little further down that issue saying in more detail why it should work at ...

https://github.com/dotnet/aspnetcore/issues/18840#issuecomment-1289026037

What I'll ultimately need to know is if that really is 😈 NOT a Thing™ 😈. In our chats about it (including earlier when I put a bad workaround up in the first place), there was no way to set those values in 7.0 or earlier other than the workaround. If it really doesn't work, then I'll need to strip it out of our docs and say that changing those values is unsupported (prior to our new approach for 8.0+).

guardrex commented 1 year ago

... and just to flesh out why it was originally wrong, it was this at first ...

<script src="_framework/blazor.{HOSTING MODEL}.js" autostart="false"></script>
<script>
  Blazor.start({
    configureSignalR: function (builder) {
      builder.serverTimeoutInMilliseconds = 30000;
      builder.keepAliveIntervalInMilliseconds = 15000;
    }
  });
</script>

... which came from @javiercn at ...

https://github.com/dotnet/aspnetcore/issues/42778#issuecomment-1187943582

... but like I said ... it's fixed in the current version of the doc ... IF it's a thing at all for 7.0 or earlier. :smile:
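
To make the difference between the two versions of the workaround concrete, here is a minimal sketch. `FakeConnection` and `FakeBuilder` are hypothetical stand-ins written for this note, not the real `@microsoft/signalr` classes; they only model a builder whose `build()` returns a fresh connection with the defaults quoted in this thread (30000 ms and 15000 ms). Under that assumption, properties set on the builder never reach the connection, while properties set on a built connection that `build()` is then forced to return do.

```javascript
// Hypothetical stand-ins, NOT the real SignalR types. They only model
// "build() returns a new connection object carrying default settings".
class FakeConnection {
  constructor() {
    this.serverTimeoutInMilliseconds = 30000;     // default quoted in the thread
    this.keepAliveIntervalInMilliseconds = 15000; // default quoted in the thread
  }
}

class FakeBuilder {
  build() {
    return new FakeConnection();
  }
}

// First (incorrect) version: the values land on the builder object itself.
function configureOnBuilder(builder) {
  builder.serverTimeoutInMilliseconds = 30000;
  builder.keepAliveIntervalInMilliseconds = 5000;
}

// Corrected workaround: set the values on a built connection,
// then replace build() so that same connection is handed out.
function configureOnConnection(builder) {
  const c = builder.build();
  c.serverTimeoutInMilliseconds = 30000;
  c.keepAliveIntervalInMilliseconds = 5000;
  builder.build = () => c;
}

const b1 = new FakeBuilder();
configureOnBuilder(b1);
const wrong = b1.build();

const b2 = new FakeBuilder();
configureOnConnection(b2);
const right = b2.build();

console.log(wrong.keepAliveIntervalInMilliseconds); // 15000 (setting ignored)
console.log(right.keepAliveIntervalInMilliseconds); // 5000 (setting applied)
```

This says nothing about whether the real client honors the values afterwards; it only shows why the first version could not have had any effect.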

BrennanConroy commented 1 year ago

Oh nvm, serverTimeoutInMilliseconds and keepAliveIntervalInMilliseconds are the new APIs. I misread the docs for 7.0 😆

> Using the script above if you inspect the WS connection in JS console, you will see pings of 15 seconds, regardless of Keep-Alive

Pings from the client or the server?

garrettlondon1 commented 1 year ago

I see pings from the server every 15 seconds. I'm not seeing anything from the client except acks of the server pings.

Do I need to add:

services.AddSignalR(e =>
{
    e.MaximumReceiveMessageSize = 102400000;
    e.KeepAliveInterval = TimeSpan.FromSeconds(5);
});

also?

BrennanConroy commented 1 year ago

https://learn.microsoft.com/en-us/aspnet/core/signalr/configuration?view=aspnetcore-7.0&tabs=dotnet#configure-server-options-6

garrettlondon1 commented 1 year ago

The values should match for client & server, correct?

garrettlondon1 commented 1 year ago

Also, is there an issue with putting the keep-alive super low? Like 3 seconds? Why is the default 15? I'm guessing because of mobile situations for Blazor Server?

garrettlondon1 commented 1 year ago

I have configured the keep-alive on both the client and the server to 3 seconds, 5 seconds, etc. I'm unable to get it working; I still see 15 seconds.

Could you share a boilerplate project I can run to see? I am also using Azure SignalR Service, if that matters.

BrennanConroy commented 1 year ago

> Also, is there an issue with putting the keep-alive super low? Like 3 seconds?

More bandwidth, and higher-latency connections will be closed more often.

> The values should match for client & server correct?

They don't need to, although if you set the timeout to a value lower than the keep alive you'll have issues (and we generally recommend the timeout is twice the keep alive interval).
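
To put a rough number on the bandwidth side of that trade-off, the sketch below counts how many keep-alive pings one side would send per hour at various intervals. It assumes a completely idle connection where a ping goes out every interval; `pingsPerHour` is a hypothetical helper, and the real count is likely lower whenever other traffic is flowing.

```javascript
// Hypothetical helper: upper bound on keep-alive pings sent per hour by one
// side of an otherwise idle connection, for a given interval in milliseconds.
function pingsPerHour(keepAliveIntervalMs) {
  const msPerHour = 60 * 60 * 1000;
  return Math.floor(msPerHour / keepAliveIntervalMs);
}

for (const intervalMs of [15000, 5000, 3000, 1000]) {
  console.log(`${intervalMs} ms -> ${pingsPerHour(intervalMs)} pings/hour`);
}
// 15000 ms -> 240 pings/hour
// 5000 ms -> 720 pings/hour
// 3000 ms -> 1200 pings/hour
// 1000 ms -> 3600 pings/hour
```

Multiply by the number of concurrent circuits to estimate the extra message volume on the server.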

garrettlondon1 commented 1 year ago

> More bandwidth, and higher-latency connections will be closed more often.

Won't the keep alive just trigger the reconnection modal if latency is high? The server timeout will still be 30 seconds, I just want the client to be aware that it's disconnected from SignalR sooner than the 15 second default, so the UI does not hang. My users are only in the United States, latency is very low.

BrennanConroy commented 1 year ago

> I just want the client to be aware that it's disconnected from SignalR sooner than the 15 second default

The default timeout is 30 seconds. So an ungraceful disconnect will happen after 30 seconds, or if the OS tells us the connection is gone. The keep alive interval isn't what causes the connection to notice it's disconnected.
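
A small model of that explanation, under stated assumptions: the client restarts its timeout clock whenever any message arrives from the server, an idle server sends a ping once per server keep-alive interval, and the OS never reports the drop. `detectionWindowMs` is a hypothetical helper for this note, not a SignalR API. With those assumptions, the time between a silent disconnect and the client noticing it falls in a window bounded by the timeout, not by the client's keep-alive interval.

```javascript
// Hypothetical model, not SignalR code. All values are in milliseconds.
// If the link dies silently, the last server message arrived somewhere between
// 0 and serverPingIntervalMs earlier, so the client's timeout fires between
// (serverTimeoutMs - serverPingIntervalMs) and serverTimeoutMs after the drop.
function detectionWindowMs(serverTimeoutMs, serverPingIntervalMs) {
  return {
    earliestMs: Math.max(0, serverTimeoutMs - serverPingIntervalMs),
    latestMs: serverTimeoutMs,
  };
}

console.log(detectionWindowMs(30000, 15000)); // { earliestMs: 15000, latestMs: 30000 }
console.log(detectionWindowMs(30000, 5000));  // { earliestMs: 25000, latestMs: 30000 }
```

In this model, shortening only the keep-alive interval narrows the window without moving it earlier; lowering the timeout is what brings detection forward.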

garrettlondon1 commented 1 year ago

There are two parts for Blazor Server. My understanding is that the "Attempting to reconnect" modal appears based on the keep-alive (default 15 seconds) sending a ping and not receiving a response from the server.

The "Disconnected" modal appears when the server timeout is reached, after the client has not received a message from the server for 30 seconds.

https://learn.microsoft.com/en-us/aspnet/core/blazor/fundamentals/signalr?view=aspnetcore-7.0#reflect-the-connection-state-in-the-ui-blazor-server

garrettlondon1 commented 1 year ago

serverTimeoutInMilliseconds: The server timeout in milliseconds.
If this timeout elapses without receiving any messages from the server, the connection is terminated with an error. 
The default timeout value is **30 seconds**. The server timeout should be at least double the value assigned to the Keep-Alive interval (keepAliveIntervalInMilliseconds).

keepAliveIntervalInMilliseconds: Default interval at which to ping the server. This setting allows the server to detect hard disconnects, such as when a client unplugs their computer from the network. 
The ping occurs at most as often as the server pings. If the server pings every five seconds, assigning a value lower than 5000 (5 seconds) pings every five seconds. 
The default value is **15 seconds**. The Keep-Alive interval should be less than or equal to half the value assigned to the server timeout (serverTimeoutInMilliseconds).

The server timeout is not the issue. The only issue is that the client-side default for even knowing that the client is disconnected is 15 seconds. This is the cause of the UI hanging.
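
Taking the quoted documentation at face value ("The ping occurs at most as often as the server pings"), the client's effective ping interval can be sketched as the larger of the two settings. `effectiveClientPingMs` is a hypothetical helper that encodes only that sentence; it has not been checked against the client's actual implementation.

```javascript
// Hypothetical helper: a literal reading of the quoted documentation.
// The client pings no more often than the server does, so the effective
// interval is the larger of the two values (both in milliseconds).
function effectiveClientPingMs(clientKeepAliveMs, serverPingIntervalMs) {
  return Math.max(clientKeepAliveMs, serverPingIntervalMs);
}

console.log(effectiveClientPingMs(2000, 5000));  // 5000, the documentation's own example
console.log(effectiveClientPingMs(5000, 15000)); // 15000
```

Read this way, a client keep-alive of 5000 ms paired with a server that still pings every 15 seconds would keep showing a 15 second cadence.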

BrennanConroy commented 1 year ago

> There are two parts for blazor server.. my understanding is the reconnection modal appears "Attempting to reconnect" based on the keep-alive (default 15 seconds) sending a ping, and not receiving response from server.

That part I can't comment on, but I'd be surprised if blazor was detecting the keep alive interval for this because the connection is still alive until the 30 second timeout occurs. @dotnet/aspnet-blazor-eng

garrettlondon1 commented 1 year ago

To be honest I'd rather the keep alive be like 1-2 seconds and spend more money on server resources, this way the "Attempting to reconnect" modal appears instantaneously if there is a SignalR disconnection which causes UI hanging.

The server will not time out unless those pings continuously fail for 30 seconds, which is unlikely unless the user's internet went down or my server went down, and there is nothing I can do about either of those.

garrettlondon1 commented 1 year ago

@MackinnonBuck any input :) ? May need to add the blazor tag

garrettlondon1 commented 1 year ago


Ok, this is interesting...

Current configuration is:

services.AddServerSideBlazor(options =>
{
    options.DetailedErrors = env == "Development";
    options.DisconnectedCircuitMaxRetained = 100;
    options.DisconnectedCircuitRetentionPeriod = TimeSpan.FromMinutes(3);
    options.JSInteropDefaultCallTimeout = TimeSpan.FromMinutes(1);
    options.MaxBufferedUnacknowledgedRenderBatches = 10;
}).AddHubOptions(options =>
{
    options.ClientTimeoutInterval = TimeSpan.FromSeconds(30);
    options.EnableDetailedErrors = false;
    options.HandshakeTimeout = TimeSpan.FromSeconds(15);
    options.KeepAliveInterval = TimeSpan.FromSeconds(2);
    options.MaximumParallelInvocationsPerClient = 1;
    options.StreamBufferCapacity = 10;
});

services.AddSignalR(e =>
{
    e.MaximumReceiveMessageSize = 102400000;
    e.KeepAliveInterval = TimeSpan.FromSeconds(2);
}).AddAzureSignalR(options =>
{
    options.ServerStickyMode = ServerStickyMode.Required;
});

<script src="_framework/blazor.server.js" autostart="false"></script>
<script>
    Blazor.start({
        configureSignalR: function (builder) {
            let c = builder.build();
            c.serverTimeoutInMilliseconds = 30000;
            c.keepAliveIntervalInMilliseconds = 2000;
            builder.build = () => {
                return c;
            };
        }
    });
</script>

garrettlondon1 commented 1 year ago

I think I am wrong about the way the Blazor reconnection modal appears.

It looks like the server timeout in milliseconds is what controls the reconnection modal appearing, not the first missed keep-alive ping? I can't tell.

UPDATE

Setting the keep-alive to 1 second and the server timeout to 3 seconds works great: the reconnection modal appears within a few seconds of clicking a button when the internet is disconnected, much better than the previous 15 seconds.

From a business perspective, Blazor Server can be a production-ready framework if this reconnection modal just appears when the first keep-alive ping from client to server is missed. I am sure there are drawbacks, but one of the reasons everybody is scared of server-side is not the reconnection modal itself; it's the delay between when a client clicks a button and when the reconnection modal appears.
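
For readers on ASP.NET Core 8.0 or later, the same 3 second / 1 second pairing could presumably be expressed with the builder methods shown earlier in this thread (`withServerTimeout` and `withKeepAliveInterval`). This is a sketch only: the values are the ones reported in the update above, `stubBuilder` is a hypothetical recording stub so the snippet runs outside a browser, and nothing here has been verified against a running Blazor app.

```javascript
// Sketch for ASP.NET Core 8.0+ only. Values (in milliseconds) are the ones
// reported as working in this comment; they are not a general recommendation.
function configureSignalR(builder) {
  builder.withServerTimeout(3000).withKeepAliveInterval(1000);
}

// Hypothetical recording stub (NOT the SignalR builder), used only so the
// function above can be exercised outside a browser.
const calls = [];
const stubBuilder = {
  withServerTimeout(ms) {
    calls.push(["withServerTimeout", ms]);
    return this;
  },
  withKeepAliveInterval(ms) {
    calls.push(["withKeepAliveInterval", ms]);
    return this;
  },
};

configureSignalR(stubBuilder);
console.log(calls);

// In a page, the function would be passed to Blazor at startup:
//   Blazor.start({ configureSignalR });
```

The timeout stays at three times the keep-alive interval, which still satisfies the "at least double" guidance quoted at the top of the issue.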

garrettlondon1 commented 1 year ago

Moving to #30344

3hxx commented 1 year ago

@garrettlondon1 are you also using @SteveSandersonMS's implementation? And is the configuration above your latest configuration? Blazor Server user here, and I'm trying to optimise my site as I'm really disliking the current implementation of this reconnect modal.