dotnet / orleans

Cloud Native application framework for .NET
https://docs.microsoft.com/dotnet/orleans
MIT License
10.06k stars 2.03k forks source link

[EPIC] Papercuts to fix for 4.0.0 #7099

Closed ReubenBond closed 1 year ago

ReubenBond commented 3 years ago

EDIT: The remaining items on this list have been migrated to #8257


This is an issue to log and track papercuts: small annoyances/inconveniences which developers often run into. Please feel free to add to the list and to pick off items on the list to fix.

krasin-ga commented 3 years ago

May adding support for stateless worker grains in heterogeneous silos be considered as papercut? :)

ReubenBond commented 3 years ago

@krasin-ga yes, it would. Which version are you running? I recall that we fixed an issue with StatelessWorker grains in heterogeneous clusters

krasin-ga commented 3 years ago

I am using version 3.4.2.

A little preface. I use the concept of "roles" to segregate the responsibilities of silos. So a silo with a certain role places only those grains that belong to it. Something like a set of micro clusters within a cluster. This allows for flexible resource management in silo deployment, faster localization of problems, and an additional level of isolation.

But this approach has a few drawbacks.

  1. This is the problem I was referring to. The documentation says that statless worker grains must be deployed on all silos. I got around this by using grains that handle one request and then get deactivated using the DeactivateOnIdle method. I didn't realize this could already be fixed :)

    1. The grain catalog is common to all silos and restarting or crashing a silo with one role affects silos with other roles. Some of the grains are deactivated. Part of the requests may be lost #6746. Both client to silo and silo to silo. It may cause some downtime when a large ResponseTimeout has been set.
  2. There were problems when starting clients and silos in a certain order. Clients kept not seeing certain grain implementations even though all the silos were already running by then. Maybe this was only within the time frame set by TypeMapRefreshInterval, I'll check that later.

krasin-ga commented 3 years ago

It would be nice to be able to register a cluster client as a service rather than building it as a host. This would be more in line with its responsibilities. Right now, to register a client inside another host (like asp .net core), I use UseServiceProviderFactory and remove IHostApplicationLifetime along with logging services so they don't overwrite my registrations. Then IHostedService is used to initiate the connection of the client to the cluster.

I also propose to make IClusterClient a generic type: IClusterClient<TCluster>. This would make it fairly easy to register and use multiple cluster clients within the application, including within the silo.

krasin-ga commented 3 years ago

@krasin-ga yes, it would. Which version are you running? I recall that we fixed an issue with StatelessWorker grains in heterogeneous clusters

I've checked, and everything actually works great. Thanks!

galvesribeiro commented 3 years ago

It would be nice to be able to register a cluster client as a service rather than building it as a host.

Maybe. The client is self contained. What I usually do, is to build an IClusterClient and register it on the generic Host DI container. After that, I register an IHostedService which injects this client and start/stop it. That way, when the application (i.e. ASP.Net) is starting, it will start the client and shutdown it gracefully with the shutdown.

Xanatus commented 3 years ago

May I suggest the good old timer reentrancy issue to be fixed? https://github.com/dotnet/orleans/issues/2574

ReubenBond commented 3 years ago

@Xanatus Thank you, good suggestion

ElanHasson commented 2 years ago

I register an IHostedService which injects this client and start/stop

@galvesribeiro

One of the things I've raised several times on the Discord is how many folks are looking for prescriptive guidance and best practices.

One of the great things about Orleans is how flexible it is and how it can be shaped to meet your exact use case and that probably has to do with not having lots of opinionated aspects to Orleans.

As Orleans gains additional momentum and adoption from the visibility of now being under the .NET Development Group, I think there may be a benefit to offering some level of guidance that meets the 80%.

A great example of this is the client connection story. Having to roll your own all the time can lead to subtle mistakes and gaps in the implementation.

It would be nice to have something out of the box that can just be configured and injected.

I haven't dug into 4.0 deeply yet so I'm not sure how much of this story has changed; please forgive me if this already exists.

galvesribeiro commented 2 years ago

I agree with you @ElanHasson.

There are some rework on the client itself of which it wont need its own DI container anymore and can be added directly to any IServiceCollection however, the initialization of the client still requires a an async call.

We have two options here:

  1. Pay the cost for the initialization on the first grain invocation transparently: This is "easier" but brings some complication to the table like queuing all the calls until the first one gets the client initialized and then starts dequeuing and invoking all others or any other synchronization mechanism;
  2. For applications that run using the Microsoft.Extensions.Hosting library we can just install an internal IHostedService which connects the client before all other services even start: This is what I manually do Today but this also wouldn't work if you are not using that library.

Do you have any suggestion what would be the best approach?

ElanHasson commented 2 years ago

btw here is an example of what I have to do: https://github.com/web-scheduler/web-scheduler-api/blob/main/Source/WebScheduler.Client.Core/HostedServices/ClusterClientHostedService.cs#L78-L115

I'm not even sure how correct that is, but it works. (i know I have primitive retry handling there.)

@galvesribeiro. The Orleans strength is in it's utility.

I like option #2.

A IHostedService that can take the client builder as configuration and handles everything for you needed for managing the IClusterClient. Something like this as a NuGet would be perfect.

ReubenBond commented 1 year ago

The remaining items on this list have been migrated to https://github.com/dotnet/orleans/issues/8257

Closing this - please bring up any relevant papercuts in the new thread.