OrleansContrib / SignalR.Orleans

SignalR backend based on Orleans.
MIT License

SignalR Hubs do not receive messages from server-side senders after silo restarts #117

Open kehigginbotham opened 3 years ago

kehigginbotham commented 3 years ago

Related to #99. Also related to this Orleans issue

Version 1.4.0, Orleans 2.4.1, ADO.NET Membership provider, ADO.NET PubSubStore, ADO.NET GrainStorage

Minimal reproduction: 2+ silos (our prototype environment has 3) and 1+ aspnetcore web API instances. Perform a rolling restart of the silos (only one down at a time). Messages sent via HubContext on the server side are then received inconsistently, or not at all, by the aspnetcore web API instances.

Synopsis: When we perform a rolling restart of our silos, we experience inconsistent behavior from the streams backing the OrleansHubLifetimeManager. Presumably this is because the clients are losing their stream handles (or said stream handles are being lost when the respective silo is restarted).

Contemplations: I've considered running a side process on each aspnetcore web API instance that expects a heartbeat from the server stream at a regular interval (every 30 seconds). If a heartbeat is not received, it would take action to re-establish connectivity. However, I'm unsure what action would be necessary, as the internal stream handles in OrleansHubLifetimeManager would be invalid/stale.
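The heartbeat idea above could be sketched roughly as follows. This is only an illustration of the detection half: `HeartbeatWatchdog`, `Beat`, and the `onMissed` callback are hypothetical names, not part of SignalR.Orleans or Orleans, and the sketch uses only the .NET base class library. What the recovery callback should actually do to refresh the stale stream handles is exactly the open question here, so it is left to the caller.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: fires a recovery callback when no heartbeat
// has been seen within the configured timeout.
public sealed class HeartbeatWatchdog : IDisposable
{
    private readonly TimeSpan _timeout;
    private readonly Func<Task> _onMissed; // recovery action supplied by the caller
    private readonly Timer _timer;
    private long _lastBeat;                // Stopwatch timestamp of the last heartbeat
    private int _recovering;               // 0 = idle, 1 = recovery in progress

    public HeartbeatWatchdog(TimeSpan timeout, TimeSpan checkInterval, Func<Task> onMissed)
    {
        _timeout = timeout;
        _onMissed = onMissed ?? throw new ArgumentNullException(nameof(onMissed));
        _lastBeat = Stopwatch.GetTimestamp();
        _timer = new Timer(_ => Check(), null, checkInterval, checkInterval);
    }

    // Call this from whatever handler receives the heartbeat message.
    public void Beat() => Interlocked.Exchange(ref _lastBeat, Stopwatch.GetTimestamp());

    private void Check()
    {
        long elapsedTicks = Stopwatch.GetTimestamp() - Interlocked.Read(ref _lastBeat);
        double elapsedSeconds = elapsedTicks / (double)Stopwatch.Frequency;
        if (elapsedSeconds < _timeout.TotalSeconds) return;

        // Skip this tick if a previous recovery attempt is still running.
        if (Interlocked.CompareExchange(ref _recovering, 1, 0) != 0) return;
        _ = RecoverAsync();
    }

    private async Task RecoverAsync()
    {
        try
        {
            await _onMissed().ConfigureAwait(false);
            Beat(); // start a fresh timeout window after a successful attempt
        }
        catch (Exception)
        {
            // Swallowed in this sketch; the next timer tick retries.
        }
        finally
        {
            Interlocked.Exchange(ref _recovering, 0);
        }
    }

    public void Dispose() => _timer.Dispose();
}
```

With a 30-second heartbeat, a timeout of two or three intervals (60 to 90 seconds) would tolerate a single dropped message before the callback fires. The watchdog only detects silence; it does not by itself make the handles held by OrleansHubLifetimeManager valid again.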

Questions: Is there an idiomatic way to re-establish connectivity with the streams when the silos are restarted, without restarting the aspnetcore web API instance?

kehigginbotham commented 3 years ago

@stephenlautier @galvesribeiro Any thoughts / workarounds?

nkosi23 commented 3 years ago

Are you using the PubSubStore with persistence as suggested in #99?

kehigginbotham commented 3 years ago

> Are you using the PubSubStore with persistence as suggested in #99?

We are. We use ADO.NET (SQL Server) for persistence, as suggested in #99.

digisimon commented 3 years ago

@kehigginbotham

Have you been able to solve this issue? I noticed the exact same behaviour: messages are no longer received after scaling down silo instances.

I use AzureTableGrainStorage for PubSub and ORLEANS_SIGNALR_STORAGE_PROVIDER persistence.