dotnet / orleans

Cloud Native application framework for .NET
https://docs.microsoft.com/dotnet/orleans
MIT License

Orleans.Runtime.SiloUnavailableException in Service Fabric #6280

Closed rzgames closed 10 months ago

rzgames commented 4 years ago

Hi Team,

We have hosted Orleans on an Azure Service Fabric cluster and ran some chaos tests to check the resiliency and reliability of the Orleans application on that cluster. While sending load to Orleans, we restarted one of the nodes in the Service Fabric cluster. We observed that the Orleans service gets borked and all requests to Orleans are rejected.

After some investigation, we found that the status of the Orleans service in the Azure Storage account moved from Active to Joining. When the client tries to connect to the Orleans silo, we see a 409 on the client side, and on the Orleans side we see the exceptions below:

Application: IDVision.Dv.Silos.exe
Framework Version: v4.0.30319
Description: The process was terminated due to an unhandled exception.
Exception Info: Orleans.Runtime.SiloUnavailableException
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task)
   at Orleans.Runtime.Scheduler.AsyncClosureWorkItem+d__8.MoveNext()

Exception Info: System.AggregateException
   at System.Threading.Tasks.TaskExceptionHolder.Finalize()

Our Orleans setup is as follows:

Orleans Version: 2.4.1
Hosting Env: Azure Service Fabric
Cluster Size: 3 nodes
VM Size: B4MS

ghost commented 2 years ago

We've moved this issue to the Backlog. This means that it is not going to be worked on for the coming release. We review items in the backlog at the end of each milestone/release and depending on the team's priority we may reconsider this issue for the following milestone.

ReubenBond commented 10 months ago

Closing due to inactivity on our part. If this is still an issue, please open a new issue and reference this