better blazor scale-out documentation

schmitch commented 4 years ago

Currently our company wants to scale out Blazor a little bit; however, the documentation is really lacking, especially about what happens if Redis or Azure SignalR Service is used as a backend.

I mean, currently SignalR itself can be configured without sticky sessions and Redis. However, there is nearly no documentation on whether something like that can be done with Blazor.

Also, what happens in case of a server failure: can a second server take over the work when SignalR runs with a Redis backing store? Or does Blazor still keep everything in memory?

pranavkm commented 4 years ago

@schmitch there's a fair bit of documentation about deploying Blazor with Azure SignalR Service here: https://docs.microsoft.com/en-us/aspnet/core/host-and-deploy/blazor/server?view=aspnetcore-3.1#signalr-configuration which states that sticky sessions are required. https://docs.microsoft.com/en-us/aspnet/core/blazor/state-management?view=aspnetcore-3.1 also covers some details about how to effectively manage state with multiple servers.

Do these documents cover your questions?

nullpainter commented 4 years ago

I'm in the same boat as @schmitch. I have read that sticky sessions are not required when using Redis and if the only transport is WebSockets, but this appears to require client-side configuration. I can't see any documentation about whether this is possible to configure with Blazor.

I've implemented custom state management for ephemeral state, but this isn't really the issue.

What I'm looking for is a practical guide on how to redeploy or upgrade nodes in a multi-node cluster without the "Could not reconnect to the server. Reload the page to restore functionality." error being displayed to end users.

I was assuming - perhaps naively, having not played with SignalR before - that I could use a Redis backplane, somehow disable the need for sticky sessions, and all state is replicated across servers.

I modified ConfigureServices to add Redis support to SignalR:

services.AddServerSideBlazor();
services.AddSignalR().AddStackExchangeRedis(...);

I can see pub/sub channels created, but I can't see any messages being sent. Should there be?

tl;dr - how is one supposed to scale out Blazor applications, not use Azure SignalR Service, and maintain zero-downtime for end users? The state management page is only part of the puzzle, and that seems a fairly straightforward one to address.

nullpainter commented 4 years ago

This is a similar question to #9734, with the same state management page discussed, however this doesn't address that poster's concerns either.

It's straightforward to persist application state (view models etc.) via an ad-hoc state manager, but what is resulting in the 'could not reconnect' message? Is there some way to implement a non ad-hoc manager - i.e. some sort of state manager registered with Blazor - so Blazor automatically reconnects?

I assumed this message was relating to the SignalR connection being broken, so I'm confused why the state management page keeps being referenced.

nullpainter commented 4 years ago

Or is the message that SignalR terminations are unavoidable, and that the Blazor approach is for the developer to maintain state as per ASP.NET Core Blazor state management?

physix99 commented 4 years ago

I also have come across the same issue. It's impossible to update a blazor server side app without all connected clients receiving the connection disconnected message - which I understand why, but I'd really like a workaround.

The same thing also happens when you use Azure deployment slots together with Azure SignalR Service, as ultimately the SignalR circuit is lost.

Your idea of using Redis for the signalr connections would have been great if that worked. Is it possible for that to be implemented?

@danroth27 can the client circuit object be serialized?

danroth27 commented 4 years ago

@physix99 There are a number of different ways you can handle persisting the app state for a given client: https://docs.microsoft.com/aspnet/core/blazor/state-management?pivots=server In .NET 5, the support for using protected browser storage will be built in to the framework.
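
As a rough sketch of the protected browser storage approach mentioned above (not an official sample): the class below assumes .NET 5 or later, where `ProtectedSessionStorage` ships in `Microsoft.AspNetCore.Components.Server.ProtectedBrowserStorage`. The `CounterState` class and the `"count"` key are hypothetical names used only for illustration.

```csharp
using System.Threading.Tasks;
using Microsoft.AspNetCore.Components.Server.ProtectedBrowserStorage;

// Hypothetical scoped service: keeps a counter in the browser's sessionStorage
// (protected by the server) so that a new circuit can reload it after a reconnect.
public class CounterState
{
    private const string Key = "count"; // assumed storage key
    private readonly ProtectedSessionStorage _storage;

    public CounterState(ProtectedSessionStorage storage) => _storage = storage;

    public int Count { get; private set; }

    // Call from OnAfterRenderAsync: browser storage relies on JS interop,
    // which is not available while the component is prerendering.
    public async Task LoadAsync()
    {
        var result = await _storage.GetAsync<int>(Key);
        Count = result.Success ? result.Value : 0;
    }

    public async Task IncrementAsync()
    {
        Count++;
        await _storage.SetAsync(Key, Count);
    }
}
```

The service would be registered with `services.AddScoped<CounterState>()`. This only preserves app-level state; it does not keep the circuit itself alive. The stored values are protected with ASP.NET Core Data Protection, so in a multi-server deployment the key ring would presumably need to be shared for another instance to read them.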

physix99 commented 4 years ago

@danroth27 Thanks Dan for your reply.

What I was actually referring to was serializing the SignalR client object for Blazor Server, which is what I believe contains all the information about the client (e.g. information needed to render the DOM).

What I'm after is being able to have 2 back-end Blazor servers for load balancing, with Azure SignalR Service in front of them and sticky sessions turned off.

Then if we need to update the Blazor Server application, we can update 1 server at a time. All the SignalR connections to that server will be dropped, but Azure will automatically join them to the other server. Because the Blazor Server circuit has been serialised between the two (using whatever mechanism is needed), the client will join the remaining online server with a very minimal reconnecting screen.

Obviously, if there have been big changes the client will need to refresh the page to remake the connection, but for most changes it should be good enough?

Thanks

ghost commented 3 years ago

Thank you for contacting us. Due to a lack of activity on this discussion issue we're closing it in an effort to keep our backlog clean. If you believe there is a concern related to the ASP.NET Core framework, which hasn't been addressed yet, please file a new issue.

This issue will be locked after 30 more days of inactivity. If you still wish to discuss this subject after then, please create a new issue!

ghost commented 3 years ago

Thanks for contacting us. We're moving this issue to the Next sprint planning milestone for future evaluation / consideration. We will evaluate the request when we are planning the work for the next milestone. To learn more about what to expect next and how this issue will be handled you can read more about our triage process here.

mkArtakMSFT commented 3 years ago

Reopening to document this.

leonkosak commented 3 years ago

To summarize this: if there is a Redis server and clients connect with SignalR only via WebSockets, a Blazor Server application is properly scaled out (and therefore there is no need for sticky sessions), right?

runtimeware commented 2 years ago

Adding to this: It would be great to see examples of typical HA scenarios or documentation that helps describe what needs to be done to achieve HA. For example, clear practical guidance on how to get clients re-connected in situations where code is being deployed would be great (the pathways to leverage slots for deployments for example for Blazor Server w/ Azure SignalR, or SignalR + Redis)

ghost commented 1 year ago

Thanks for contacting us.

We're moving this issue to the .NET 8 Planning milestone for future evaluation / consideration. We would like to keep this around to collect more feedback, which can help us with prioritizing this work. We will re-evaluate this issue, during our next planning meeting(s). If we later determine, that the issue has no community involvement, or it's very rare and low-impact issue, we will close it - so that the team can focus on more important and high impact issues. To learn more about what to expect next and how this issue will be handled you can read more about our triage process here.

diegosasw commented 1 year ago

I'm trying to find out scalability limitations for blazor server in order to decide whether I'm better off with blazor wasm and explicit websocket communication, for an app that requires real time communication, using some external API Gateway (e.g: AWS Api Gateway with websocket support) to keep as many websocket connections open as needed.

Where could I find answers to this? Would vertical scalability in Blazor Server app's machine solve the need for an external API Gateway? Would horizontal scalability (a second instance of blazor server app) cause connectivity issues to clients?

It'd be great to read better documentation about it (happy to read your suggestions).

danroth27 commented 1 year ago

> I'm trying to find out scalability limitations for blazor server in order to decide whether I'm better off with blazor wasm and explicit websocket communication, for an app that requires real time communication

@diegosasw It depends on how much real time communication you need. If it's fairly limited, then you will likely get some scale out benefits from using Blazor WebAssembly. By comparison, Blazor Server requires an active real time connection to even function.

> using some external API Gateway (e.g: AWS Api Gateway with websocket support) to keep as many websocket connections open as needed.

The Azure SignalR Service is specifically designed to handle connection scaleout.

> Would vertical scalability in Blazor Server app's machine solve the need for an external API Gateway?

Possibly. It depends on how many concurrent users you need to be able to handle and how many concurrent connections the server can handle.

> Would horizontal scalability (a second instance of blazor server app) cause connectivity issues to clients?

It shouldn't as long as you set up sticky sessions so that clients reconnect to the same server instance.

schmitch commented 1 year ago

Btw, in this issue it was basically answered: https://github.com/dotnet/aspnetcore/issues/38611#issuecomment-982028457 Thus Blazor always needs sticky sessions, since it never uses the Redis backend.

ghost commented 1 year ago

Thanks for contacting us.

We're moving this issue to the .NET 9 Planning milestone for future evaluation / consideration. We would like to keep this around to collect more feedback, which can help us with prioritizing this work. We will re-evaluate this issue, during our next planning meeting(s). If we later determine, that the issue has no community involvement, or it's very rare and low-impact issue, we will close it - so that the team can focus on more important and high impact issues. To learn more about what to expect next and how this issue will be handled you can read more about our triage process here.

maxsargentdev commented 11 months ago

Just trying to work out some issues using blazor in my organization.

Is there any reason why blazor server doesn't support using redis as a backplane, when a vanilla signalr connection does support it?

Seems like it would be a nice feature.

ghost commented 10 months ago

Thanks for contacting us.

We're moving this issue to the .NET 9 Planning milestone for future evaluation / consideration. We would like to keep this around to collect more feedback, which can help us with prioritizing this work. We will re-evaluate this issue, during our next planning meeting(s). If we later determine, that the issue has no community involvement, or it's very rare and low-impact issue, we will close it - so that the team can focus on more important and high impact issues. To learn more about what to expect next and how this issue will be handled you can read more about our triage process here.

n2029-ndensan commented 6 months ago

Currently, Blazor Server does not support cloud native. To support autoscaling, the architecture must be able to scale out and scale in. We recognize that we cannot safely scale in. I would like you to respond as soon as possible.

garrettlondon1 commented 2 months ago

@danroth27 If I remember, you mentioned at Build during our conversations that Azure SignalR service is no longer being recommended for Blazor Server scale-out. Is that still true?

In addition, Azure Front Door doesn't even support WebSockets, which has also been on the roadmap for 4+ years.

A lot of the questions on this thread about using Redis as a backplane have also gone unanswered...

InteractiveServer developers are, I believe, genuinely confused with the odds stacked against them if they want to develop Enterprise grade applications :(.

garrettlondon1 commented 2 months ago

https://learn.microsoft.com/en-us/aspnet/core/signalr/redis-backplane?view=aspnetcore-8.0

Using Redis as a SignalR backplane, I have a few thoughts.

Currently, Blazor Server has the default ComponentHub, which handles the interactive WebSocket SignalR connections containing the application state. If you want to expose additional real-time functionality, you have to do that through an additional Hub.

When scaling beyond one instance, obviously that additional hub will just be sending messages to people on the same instance.

My understanding about using Redis as a backplane for Blazor Server is: it will not store application state at all. That still requires sticky sessions on the same instance, and the application state is still stored in memory in the aspnet process.

Redis will, though, correctly operate as a distributed backplane between all instances for the real-time functionality on the additional hubs.
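
To make the split described above concrete, here is a minimal sketch of an additional hub that uses the Redis backplane. It assumes a .NET 8 minimal-hosting `Program.cs` and the `Microsoft.AspNetCore.SignalR.StackExchangeRedis` package; `NotificationHub`, the `/hubs/notifications` route, the `ReceiveNotification` method name, and the connection string are all hypothetical.

```csharp
using System.Threading.Tasks;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.SignalR;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);

// Blazor registration and component mapping are omitted for brevity.

// Redis backplane: relays hub messages between server instances.
// It does not store or replicate Blazor circuit state.
builder.Services.AddSignalR()
    .AddStackExchangeRedis("localhost:6379"); // assumed connection string

var app = builder.Build();

// Additional hub for app-level real-time features, separate from Blazor's own hub.
app.MapHub<NotificationHub>("/hubs/notifications"); // hypothetical route

app.Run();

// Hypothetical hub: with the backplane in place, Clients.All is meant to reach
// clients connected to any instance, not only the one handling this call.
public class NotificationHub : Hub
{
    public Task Broadcast(string message) =>
        Clients.All.SendAsync("ReceiveNotification", message);
}
```

Under this reading, the interactive components themselves would still depend on session affinity, since their state stays in the memory of the instance that hosts the circuit.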

Using Redis as a backplane is not like using Azure SignalR Service: my understanding is that with Azure SignalR Service the WebSocket connection does not live on the web server; rather, the client makes the WebSocket handshake with the service itself.

While doing some testing to validate this, I noticed that if services.AddSignalR().AddStackExchangeRedis contains a bad connection string, it renders all InteractiveServer components non-functional.

The question is: why, if the Redis cluster does not contain any application state and the WebSocket lives on the server, would InteractiveServer components not function?

misiek08 commented 2 months ago

Reading this topic it looks like Blazor is still a toy. It’s not an Enterprise grade feature to have simple code deployments, simple scaling and HA (graceful failover of client between backends). 5 years ticking and not a single clear answer here…

Am I missing something, or could simple state serialization in storage like Redis solve the issue?

garrettlondon1 commented 2 months ago

> Reading this topic it looks like Blazor is still a toy. It’s not an Enterprise grade feature to have simple code deployments, simple scaling and HA (graceful failover of client between backends). 5 years ticking and not a single clear answer here…
>
> Am I missing something, or could simple state serialization in storage like Redis solve the issue?

Simple code deployments, scaling, and HA are all possible with Blazor.

Your user will lose WebSocket state if the handshake is broken or the circuit breaks, for sure, but I don't know if storing application state in Redis is a good idea for Blazor Server.

Blazor Server has realtime DOM updates fly over websocket so quickly that if application state is not stored on the server, I imagine the latency between the user <> server <> redis and back is too significant.

Redis as a normal SignalR backplane that does not communicate DOM updates, only realtime functionality, can survive this delay. DOM updates at the speed of Blazor Server, I don't think so. Azure SignalR doesn't have this problem because the users connect directly to the SignalR service and not the application, so there is less latency, although Azure SignalR is still a poor solution.

Also, when it comes to "sticky sessions" for WebSockets and HTTP requests being mixed in a Blazor Web App:

I am confused as to why the ARRAffinity token is needed for Blazor Server. There is no session state; there is just a WebSocket connection and a circuit. If that circuit is broken, what's the problem if the user opens a wss connection to another server?
What's the problem if the Blazor Web App makes an HTTP call to server A, but WebSocket traffic is going through server B?

I think the lack of documentation/answers here leaves a lot to be wondered about how Blazor and InteractiveServer should be used in production and enterprise grade workflows

garrettlondon1 commented 2 months ago

A Blazor app prerenders in response to the first client request, which creates UI state on the server. When the client attempts to create a SignalR connection, the client must reconnect to the same server.

https://learn.microsoft.com/en-us/aspnet/core/blazor/fundamentals/signalr?view=aspnetcore-8.0#use-session-affinity-sticky-sessions-for-server-side-webfarm-hosting

@guardrex, so if prerendering in Blazor is disabled, sticky sessions are not needed?

EDIT:

Sticky sessions are always required for Blazor Server due to disconnection/reconnection to the same circuit.

guardrex commented 2 months ago

I doubt it, but this is for the product unit to address. I'm 👂 for their further remarks that I can place into the session affinity guidance.

garrettlondon1 commented 1 month ago

I did manage to try this out on App Service with multiple instances with session affinity.

I used only InteractiveServer with pre-rendering disabled. It did not work; I received the errors below about 50% of the time. Enhanced navigation worked fine when the circuit was established, but not when hard refreshing.

@rendermode @(new InteractiveServerRenderMode(false)), with static pages and a static router

blazor.web.js:1  WebSocket connection to 'wss://app.com/_blazor?id=8upHQjrBq7LzcNkAmI4r3Q' failed: 

blazor.web.js:1 [2024-09-10T04:35:02.483Z] Information: (WebSockets transport) There was an error with the transport.
blazor.web.js:1  [2024-09-10T04:35:02.483Z] Error: Failed to start the transport 'WebSockets': Error: WebSocket failed to connect. The connection could not be found on the server, either the endpoint may not be a SignalR endpoint, the connection ID is not present on the server, or there is a proxy blocking WebSockets. If you have multiple servers check that sticky sessions are enabled.
blazor.web.js:1  [2024-09-10T04:35:02.514Z] Error: Failed to start the connection: Error: Unable to connect to the server with any of the available transports. Error: WebSockets failed: Error: WebSocket failed to connect. The connection could not be found on the server, either the endpoint may not be a SignalR endpoint, the connection ID is not present on the server, or there is a proxy blocking WebSockets. If you have multiple servers check that sticky sessions are enabled. ServerSentEvents failed: Error: 'ServerSentEvents' does not support Binary. Error: LongPolling failed: Error: No Connection with that ID: Status code '404'
blazor.web.js:1  [2024-09-10T04:35:02.514Z] Error: Error: Unable to connect to the server with any of the available transports. Error: WebSockets failed: Error: WebSocket failed to connect. The connection could not be found on the server, either the endpoint may not be a SignalR endpoint, the connection ID is not present on the server, or there is a proxy blocking WebSockets. If you have multiple servers check that sticky sessions are enabled. ServerSentEvents failed: Error: 'ServerSentEvents' does not support Binary. Error: LongPolling failed: Error: No Connection with that ID: Status code '404'
(anonymous) @ blazor.web.js:1
setTimeout
rootComponentsMayRequireRefresh @ blazor.web.js:1
onDocumentUpdated @ blazor.web.js:1
Ki @ blazor.web.js:1
blazor.web.js:1  [2024-09-10T04:35:02.514Z] Error: Failed to start the circuit.

garrettlondon1 commented 1 month ago

Related #58078

and #58079

misiek08 commented 1 week ago

> Simple code deployments, scaling, and HA are all possible with Blazor.
>
> Your user will lose WebSocket state if the handshake is broken or the circuit breaks, for sure.

I'll try to write a simple page and get back with results. It sounds like the whole logic will break here.

> but I don't know if storing application state in Redis is a good idea for Blazor Server

I'm not sure about all the state of the DOM. Maybe some of it can be rebuilt from "the" state. Keeping the whole DOM in Redis is stupid, I agree.

> Blazor Server has realtime DOM updates fly over websocket so quickly that if application state is not stored on the server, I imagine the latency between the user <> server <> redis and back is too significant.

I'm not sure it would be so bad, because I could implement "sharded circuit storage" and just scale Redis horizontally. All I need here is a way to redeploy or just scale Blazor without hard session affinity. So the user is connected to the same server for some longer period, but if I need to redeploy or scale up, they can easily reconnect elsewhere; the server will rebuild the state from the storage and just continue serving the user.

> Redis as a normal SignalR backplane that does not communicate DOM updates, only realtime functionality, can survive this delay. DOM updates at the speed of Blazor Server, I don't think so. Azure SignalR doesn't have this problem because the users connect directly to the SignalR service and not the application, so there is less latency, although Azure SignalR is still a poor solution.

Can SignalR be used as a complete solution, or are we talking about a message bus only?

> I think the lack of documentation/answers here leaves a lot to be wondered about how Blazor and InteractiveServer should be used in production and enterprise grade workflows

That's why I wrote that it looks like a toy. With all those sealed classes around circuits, it looks like (after just reading the code on GitHub, without even cloning) it's impossible to really use it in enterprise, long-running UIs. I have a case with 500k people online on the website, and Blazor performance is okayish for the case, but all the HA and resiliency topics look bad. Currently I use a Go backend and a React frontend, and I'm looking into C# for migration.

> The connection could not be found on the server

That's what I want to serialize and save in Redis (we can do this on server shutdown and/or connection lost/close) and read when client connects to restore state.
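
One way to approximate the "save on connection lost/close" idea with existing extension points is a `CircuitHandler`. The sketch below is only an illustration under assumptions: it persists app-level state that the app tracks itself (the hypothetical `UserDraft` class), not the circuit or render tree, which Blazor does not expose for serialization. It assumes an `IDistributedCache` implementation is registered, for example a Redis-backed one.

```csharp
using System.Threading;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Components.Server.Circuits;
using Microsoft.Extensions.Caching.Distributed;

// Hypothetical per-circuit state holder, registered as a scoped service.
public class UserDraft
{
    public string SessionKey { get; set; } = ""; // assumed stable per-user key
    public string Json { get; set; } = "{}";     // app state, already serialized
}

public class PersistingCircuitHandler : CircuitHandler
{
    private readonly IDistributedCache _cache;
    private readonly UserDraft _draft;

    public PersistingCircuitHandler(IDistributedCache cache, UserDraft draft)
    {
        _cache = cache;
        _draft = draft;
    }

    // Invoked when the SignalR connection for this circuit drops.
    public override Task OnConnectionDownAsync(Circuit circuit, CancellationToken cancellationToken)
        => SaveAsync(cancellationToken);

    // Invoked when the circuit is being discarded.
    public override Task OnCircuitClosedAsync(Circuit circuit, CancellationToken cancellationToken)
        => SaveAsync(cancellationToken);

    // Writes the draft to the distributed cache, skipping circuits with no key yet.
    private Task SaveAsync(CancellationToken cancellationToken)
        => string.IsNullOrEmpty(_draft.SessionKey)
            ? Task.CompletedTask
            : _cache.SetStringAsync(_draft.SessionKey, _draft.Json, cancellationToken);
}
```

The handler would be registered with `services.AddScoped<CircuitHandler, PersistingCircuitHandler>()`, and a component on the new server would read the same key back when the user lands there. These callbacks should not be relied on if the process is killed abruptly, so writing state as it changes is likely the safer variant.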

garrettlondon1 commented 1 week ago

Simple code deployments, scaling, and HA are all possible with Blazor.. All I need here is a way to redeploy or just scale Blazor without hard session affinity.

See https://github.com/dotnet/aspnetcore/issues/58079

Session affinity is always required for Blazor Server because the reconnection process has to connect to the same server, since that's where the circuit and the pre-rendered view are stored, according to Javier.

dotnet / aspnetcore

better blazor scale-out documentation #17986