dotnet / orleans

Orleans in App Services: An Update #7391

Closed SebastianStehle closed 9 months ago

SebastianStehle commented 2 years ago

The original issue is closed, therefore I cannot update it: https://github.com/dotnet/orleans/issues/7098

But I have some news, because a customer of mine found out how to run Orleans in Azure App Services.

What you have to do:

You then have to use the following environment variables:

using System;
using System.Net;
using System.Threading.Tasks;
using Microsoft.Extensions.Logging;
using Orleans;
using Orleans.Configuration;
using Orleans.Hosting;

public static async Task<ISiloHost> StartSilo()
{
    // Private IP and ports of this instance, read from the App Service environment variables.
    var privateIp = Environment.GetEnvironmentVariable("WEBSITE_PRIVATE_IP")
        ?? throw new InvalidOperationException("WEBSITE_PRIVATE_IP is not set.");
    var privatePorts = Environment.GetEnvironmentVariable("WEBSITE_PRIVATE_PORTS")
        ?? throw new InvalidOperationException("WEBSITE_PRIVATE_PORTS is not set.");

    IPAddress endpointAddress = IPAddress.Parse(privateIp);
    var strPorts = privatePorts.Split(',');
    if (strPorts.Length < 2)
    {
        throw new Exception("Insufficient private ports configured.");
    }

    // The first private port is used for silo-to-silo traffic, the second for the client gateway.
    int siloPort = int.Parse(strPorts[0]);
    int gatewayPort = int.Parse(strPorts[1]);

    // define the cluster configuration
    var builder = new SiloHostBuilder()
        //.UseLocalhostClustering()
        .Configure<ClusterOptions>(options =>
        {
            options.ClusterId = "dev";
            options.ServiceId = "OrleansBasics";
        })
        .UseAzureStorageClustering(options => options.ConnectionString = "...")
        .Configure<EndpointOptions>(options =>
        {
            // Advertise the private IP to the cluster, but listen on all interfaces.
            options.AdvertisedIPAddress = endpointAddress;
            options.SiloPort = siloPort;
            options.GatewayPort = gatewayPort;
            options.SiloListeningEndpoint = new IPEndPoint(IPAddress.Any, siloPort);
            options.GatewayListeningEndpoint = new IPEndPoint(IPAddress.Any, gatewayPort);
        })
        // Application parts: just reference one of the grain implementations that we use
        .ConfigureApplicationParts(parts => parts.AddApplicationPart(typeof(HelloGrain).Assembly).WithReferences())
        .ConfigureLogging(logging => logging.AddConsole());

    var host = builder.Build();
    await host.StartAsync();
    return host;
}
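
The port handling above could be pulled into a small, more defensive helper. The following is only a sketch: `PrivatePorts` and `Parse` are hypothetical names (not part of Orleans or App Service), and it assumes `WEBSITE_PRIVATE_PORTS` is a comma-separated list of port numbers, as the code above implies.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration; not an Orleans or App Service API.
public static class PrivatePorts
{
    // Parses a comma-separated port list such as "20144,20145" and returns
    // the first two entries as (silo port, gateway port).
    public static (int SiloPort, int GatewayPort) Parse(string? rawPorts)
    {
        if (string.IsNullOrWhiteSpace(rawPorts))
        {
            throw new InvalidOperationException("WEBSITE_PRIVATE_PORTS is not set.");
        }

        // Tolerate stray whitespace and empty segments around the commas.
        int[] ports = rawPorts
            .Split(',', StringSplitOptions.RemoveEmptyEntries)
            .Select(p => int.Parse(p.Trim()))
            .ToArray();

        if (ports.Length < 2)
        {
            throw new InvalidOperationException("Insufficient private ports configured.");
        }

        return (ports[0], ports[1]);
    }
}
```

It would be called as `var (siloPort, gatewayPort) = PrivatePorts.Parse(Environment.GetEnvironmentVariable("WEBSITE_PRIVATE_PORTS"));` before configuring `EndpointOptions`.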

I have not tested it myself yet, but it works with my CMS: https://github.com/squidex/squidex and I hope we can provide docs for Squidex soon, with some screenshots.

christiansparre commented 2 years ago

@SebastianStehle NICE! Will have to try it! 👍

Maybe @ReubenBond or someone else with access can post a link to this issue on #7098 so folks who find that one also find their way here!

bradygaster commented 2 years ago

Can validate this approach works, as I was working on a similar end-to-end test of it and got it working around 2 this morning. :) I shared it with @ReubenBond and he gave me this link - we're working with the App Service team to craft guidance and docs in this area - who wants to be a reviewer for it? :)

SebastianStehle commented 2 years ago

I will probably try it out as well for Squidex CMS, so I can review the docs (and copy paste some parts)

christiansparre commented 2 years ago

I will definitely be trying it. I have something coming up where App Service might be a slightly better fit than Azure Container Apps so this news comes just at the right time 😁 Seems like Orleans in Azure is just poppin' right now 🚀

christiansparre commented 2 years ago

I have been playing around with Orleans on App Service a bit over the last couple of days, and this seems like a game changer to us, to be honest. We are pretty comfortable running somewhat simple Web Apps and APIs on App Service. I have been looking for a way to introduce Orleans for some upcoming systems where we need to orchestrate certain stuff that Orleans could be a good fit for 😀

However, we need to beware of App Service scaling. I have managed to make Orleans extremely mad when scaling out, and perhaps in particular when scaling instance sizes up. App Service seems to keep the "old" instances up for a rather long time, on the order of tens of minutes it would seem, which might be normal App Service behaviour. I need to look more at how that works; I'm not that familiar with App Service scaling.

SebastianStehle commented 2 years ago

It is the same problem as with Kubernetes. Perhaps we need an App Service provider that scans for instances (not sure if this is possible).

bradygaster commented 2 years ago

I was inspired by this discussion and the repo @christiansparre worked on to show how to run Orleans in Container Apps (would love to sync up on this if you ever have an hour). In prep for the ASP.NET Community Standup show with @jongalloway today I'd started working up some samples, and worked back some of the commonalities of those samples into an experimental PoC. Wanted to share and get folks' thoughts on this approach/idea.

Imagine being able to wire up a silo and/or client hosted in Azure (Web Apps, Container Apps, etc.) with a single line of code that had an opinionated approach to wiring them both up based on conventional wisdom and practices, but that didn't get in your way if you needed to customize it further. The idea being, getting started with a Silo + Client scenario that runs in Azure would be as easy as a single NuGet package addition (at most) and wire-up like this:

Silo:

builder.Host.UseOrleans(siloBuilder =>
{
    siloBuilder.HostSiloInAzure(builder.Configuration);
});

Client:

builder.Services.AddOrleansClusterClient(builder.Configuration);

Curious what folks think - I essentially worked up a prototype/poc that shows a path through in which we set up a series of responsibilities that are performed in a similar order to what the docs have listed here, assigning each step of the build-up process to individual classes. It's nowhere near complete, totally experimental, being put out there for discussion.

Here's the extent of what is enlisted (and how):

public static ISiloBuilder HostSiloInAzure(this ISiloBuilder siloBuilder, IConfiguration configuration)
{
    // cluster/silo meta
    var clusterOptionsBuilder = new ClusterOptionsBuilder();
    var siloOptionsBuilder = new SiloOptionsBuilder();

    // storage
    var tableStorageBuilder = new TableStorageSiloBuilder();

    // endpoints
    var webAppSiloBuilder = new WebAppsVirtualNetworkEndpointsBuilder();
    var configuredEndpointsBuilder = new ConfiguredEndpointsBuilder();

    // monitoring
    var appInsightsBuilder = new AzureApplicationInsightsSiloBuilder();

    // nothing was configured, go with localhost
    var localhostBuilder = new LocalhostSiloBuilder();

    // set up the chain of responsibility
    clusterOptionsBuilder.SetNextBuilder(siloOptionsBuilder);
    siloOptionsBuilder.SetNextBuilder(tableStorageBuilder);
    tableStorageBuilder.SetNextBuilder(webAppSiloBuilder);
    webAppSiloBuilder.SetNextBuilder(configuredEndpointsBuilder);
    configuredEndpointsBuilder.SetNextBuilder(appInsightsBuilder);
    appInsightsBuilder.SetNextBuilder(localhostBuilder);

    // build the silo
    clusterOptionsBuilder.Build(siloBuilder, configuration);

    return siloBuilder;
}

The inspiration for this issue/discussion, wiring up the endpoints from the baked-in App Service settings on a vnet-enabled Web App that has >= 2 vnetPrivatePorts set up, is implemented by WebAppsVirtualNetworkEndpointsBuilder.

public class WebAppsVirtualNetworkEndpointsBuilder : AzureSiloBuilder
{
    public override void Build(ISiloBuilder siloBuilder, IConfiguration configuration)
    {
        if (configuration.GetValue<string>("WEBSITE_PRIVATE_IP") != null &&
            configuration.GetValue<string>("WEBSITE_PRIVATE_PORTS") != null)
        {
            // presume the app is running in Web Apps on App Service and start up
            IPAddress endpointAddress = IPAddress.Parse(configuration.GetValue<string>("WEBSITE_PRIVATE_IP"));

            var strPorts = configuration.GetValue<string>("WEBSITE_PRIVATE_PORTS").Split(',');

            if (strPorts.Length < 2) throw new Exception("Insufficient private ports configured.");

            int siloPort = int.Parse(strPorts[0]);
            int gatewayPort = int.Parse(strPorts[1]);

            siloBuilder.ConfigureEndpoints(endpointAddress, siloPort, gatewayPort);
        }

        base.Build(siloBuilder, configuration);
    }
}
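
The base class is not shown in this thread. As a rough sketch of how such a chain of responsibility might be wired, here is a stand-alone version that runs without any Orleans packages. Everything in it is an assumption for illustration: `ChainedBuilder`, `EndpointsStep` and `FallbackStep` are hypothetical names, `TTarget` stands in for `ISiloBuilder`, and `TConfig` stands in for `IConfiguration`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the PoC's AzureSiloBuilder base class.
public abstract class ChainedBuilder<TTarget, TConfig>
{
    private ChainedBuilder<TTarget, TConfig>? _next;

    // Registers the link that runs after this one.
    public void SetNextBuilder(ChainedBuilder<TTarget, TConfig> next) => _next = next;

    // Derived classes do their own work, then call base.Build to pass control on.
    public virtual void Build(TTarget target, TConfig configuration)
        => _next?.Build(target, configuration);
}

// Example link: only acts when both App Service settings are present.
public sealed class EndpointsStep : ChainedBuilder<List<string>, IDictionary<string, string>>
{
    public override void Build(List<string> target, IDictionary<string, string> configuration)
    {
        if (configuration.ContainsKey("WEBSITE_PRIVATE_IP") &&
            configuration.ContainsKey("WEBSITE_PRIVATE_PORTS"))
        {
            target.Add("endpoints");
        }

        base.Build(target, configuration);
    }
}

// Example link at the end of the chain: always records its step.
public sealed class FallbackStep : ChainedBuilder<List<string>, IDictionary<string, string>>
{
    public override void Build(List<string> target, IDictionary<string, string> configuration)
    {
        target.Add("fallback");
        base.Build(target, configuration);
    }
}
```

Calling `Build` on the first link walks the whole chain in the order set by `SetNextBuilder`, and a link whose settings are missing simply forwards to the next one.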

This enables someone new to using Orleans to just do something like this in their application code:

builder.Host.UseOrleans(siloBuilder =>
{
    siloBuilder.HostSiloInAzure(builder.Configuration);
});

Of course, if folks want to extend it further, it's "just a siloBuilder," so they could:

builder.Host.UseOrleans(siloBuilder =>
{
    siloBuilder.HostSiloInAzure(builder.Configuration);
    siloBuilder.AddAzureTableGrainStorage(name: "visitHistoryStore");
    siloBuilder.ConfigureLogging(builder => builder.SetMinimumLevel(LogLevel.Warning).AddConsole());
});

I've also emulated this on the client side, too, by enabling someone who wants to dial up a client to do something like this:

builder.Services.AddOrleansClusterClient(builder.Configuration);

That service wire-up would hook in a standard IHostedService designed to perform the same sort of logic flow to set up the Orleans client.

Service wire-up code:

public static class ClientBuilderExtensions
{
    public static IServiceCollection AddOrleansClusterClient(this IServiceCollection services, IConfiguration configuration)
    {
        services.AddSingleton<OrleansClusterClientHostedService>();
        services.AddSingleton<IHostedService>(svc => svc.GetRequiredService<OrleansClusterClientHostedService>());
        services.AddSingleton<IClusterClient>(svc => svc.GetRequiredService<OrleansClusterClientHostedService>().Client);

        return services;
    }
}

Client builder, based on the same sort of config-driven convention (not much there yet):

public class OrleansClusterClientHostedService : IHostedService
{
    private readonly ILogger<OrleansClusterClientHostedService> _logger;
    private readonly IConfiguration _configuration;
    private int _retries = 10;
    public IClusterClient Client { get; set; } = null;

    public OrleansClusterClientHostedService(ILogger<OrleansClusterClientHostedService> logger, IConfiguration configuration)
    {
        _logger = logger;
        _configuration = configuration;

        var clientBuilder = new ClientBuilder();
        var clusterOptionsClientBuilder = new ClusterOptionsClientBuilder();
        var azureStorageSiloClientBuillder = new AzureStorageSiloClientBuillder();
        var localhostSiloClientBuilder = new LocalhostSiloClientBuilder();

        clusterOptionsClientBuilder.SetNextBuilder(azureStorageSiloClientBuillder);
        azureStorageSiloClientBuillder.SetNextBuilder(localhostSiloClientBuilder);

        clusterOptionsClientBuilder.Build(clientBuilder, configuration);
        Client = clientBuilder.Build();
    }

// start/stop methods...
}

SebastianStehle commented 2 years ago

@bradygaster

I think 2 things would be helpful:

  1. An extension method to use WEBSITE_PRIVATE_IP and so on.
  2. An extension method that works similarly to Kubernetes clustering.

But I don't see why...

  1. Application Insights should be preconfigured. I think it is not that popular, and I also know people using New Relic with Azure and so on. And I hope that the Application Insights integration will be obsolete soon when Open Telemetry is ready.

  2. Storages are preconfigured. Even in Azure there are a lot of storage options like SQL Server, Cosmos DB and Azure Table Storage. It also makes the package management more complicated because you need a meta package that references all needed packages.

SebastianStehle commented 2 years ago

Btw: It could be possible to write an integration for App Service: https://docs.microsoft.com/en-us/rest/api/appservice/web-apps/list-instance-processes

There is an API and it also provides environment variables.

bradygaster commented 2 years ago

@SebastianStehle oh it isn't difficult at all - one can achieve it in bicep pretty easily. (wait - maybe you could explain your idea a bit more - i thought i knew what you were saying, but, want to make sure)

SebastianStehle commented 2 years ago

@bradygaster Orleans works great with static IP addresses. Honestly, I don't know why it doesn't support host names, but that is another story.

In an environment with dynamic IP addresses you have the following problem: Let's say a node is restarted. It might then get a new IP address, but in some cases it cannot mark the old entry as dead. The new node will then try to get the status of each entry in the membership table, and it takes a while until the old entry is marked as dead. With static IP addresses you do not have this problem because the new node will understand that it is the incarnation of an old node.

Therefore the kubernetes implementation queries the kubernetes service for all members with a given label and deletes all entries from the membership table that are not part of the deployment anymore: https://github.com/dotnet/orleans/blob/main/src/Orleans.Hosting.Kubernetes/KubernetesClusterAgent.cs#L251
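
The cleanup step described above is essentially a set difference between the membership table and the instances the platform reports as live. A minimal sketch of that idea, assuming both sides can be reduced to an `ip:port` string; `MembershipReconciler` and `FindStaleEntries` are hypothetical names, not Orleans APIs, and how the live list would be obtained is left open:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch; this is not the Kubernetes agent's actual code.
public static class MembershipReconciler
{
    // Returns the membership-table entries that no longer match any
    // instance reported as live by the hosting platform.
    public static IReadOnlyList<string> FindStaleEntries(
        IEnumerable<string> membershipEntries,
        IEnumerable<string> liveInstances)
    {
        var live = new HashSet<string>(liveInstances, StringComparer.OrdinalIgnoreCase);

        // Anything in the table that is not live is a candidate for deletion.
        return membershipEntries.Where(entry => !live.Contains(entry)).ToList();
    }
}
```

An agent would run this periodically and delete the returned entries, so that a restarted node does not have to wait for the old entries to be voted dead.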

For Azure App Service we need something similar, but I recommend providing an interface for this agent class and reusing the logic.

I guess it only needs two methods:

Task DeleteSiloAsync(Silo silo);

IEnumerable<SiloDeleted> SubscribeToDeletionsAsync(CancellationToken ct);

ReubenBond commented 2 years ago

@nickrandolph wrote a tutorial on how to configure Orleans in Azure App Service: https://nicksnettravels.builttoroam.com/tutorial-orleans-azure-app-service/

oising commented 2 years ago

"And I hope that ApplicationInsights integration will be obsolete soon when Open Telemetry is ready."

Application Insights is an ingestion endpoint and graphical query tool/UX. Open Telemetry is the wire protocol. AppInsights competes with Grafana, Prometheus etc. When OTEL is complete, Application Insights will be able to understand the protocol natively; it won't be going anywhere.

SebastianStehle commented 2 years ago

Yes, but as a library builder you do not have to support all tools anymore. Just support OTLP and you are fine.

oising commented 2 years ago

Ah, sorry - I misunderstood you. Yeah, the AI protocol will go away (eventually).

SebastianStehle commented 2 years ago

Not only the application insights integration but also the abstraction and custom telemetry implementation.

bradygaster commented 2 years ago

I completely spaced posting this link here, as we've created a sample for how to deploy Orleans to both App Service and ACA (though that one's currently getting some updates as the ACA RP changed, and I'm updating it to reflect those changes).

https://github.com/bradygaster/OrleansOnAzureAppService

Also - @IEvangelist has an upcoming doc SPECIFIC to deploying Orleans to App Service, and he's got a great sample of how to build a relatively canonical scenario using Orleans - a Shopping Cart app. Stay tuned for that, as it's going to drop extremely soon.

If folks would like, maybe David and I could stream the doc once we're close-to-finished and see if we've accommodated the basics folks need to know how to get started. Let us know if this is something folks would attend and provide some feedback so we could make it solid.

oising commented 2 years ago

https://github.com/bradygaster/OrleansOnAzureAppService

One of the primary benefits of Orleans is location transparency and elastic scaling. It would be really nice to showcase that with this example. I wonder how hard it would be to parameterize the bicep templates to add 1..n silos? Thoughts, @bradygaster ?

bradygaster commented 2 years ago

I'm definitely doing that in my upcoming Azure Container Apps example, as ACA supports KEDA (and even external KEDA scalers, to my delight). I love the idea of an auto-scaling App Service cluster. This is bigger, though. :) Exciting.

bradygaster commented 2 years ago

https://github.com/bradygaster/OrleansOnAzureAppService

One of the primary benefits of Orleans is location transparency and elastic scaling. It would be really nice to showcase that with this example. I wonder how hard it would be to parameterize the bicep templates to add 1..n silos? Thoughts, @bradygaster ?

Definitely like the idea. Let me see if I can up the ante, though, on another sample app I have running in App Service with an Orleans back-end. https://wanderland.cloud, the app we used for a few internal scenarios, was something we put together to show a never-ending Orleans silo running a lot of grains. The "wanderers" - the players running around the board - are individual WandererGrain instances that literally wander around randomly.

I'd like to extend this so that you could do something like (this is incomplete, just an idea, open for suggestions):

[NamedWanderer("bradyg")]
public class BradyWanderer : IWandererGrain
{
}

And then my Wander method - the one that runs to make a player move - could have my own implementation in it. With that, folks could submit a pull request, write their own logic for avoiding the monster and staying alive longer, and append the deployment to run on their own App Service instance. Then, when we do a deployment, the random wanderers would run as before, but when a user logs in to the system, their wanderer implementation would run. At that point, if folks wanted to, they could also augment the bicep template to create their "own" app service, and place their player grain on it. That way the game grains could run in the main silo and the player grains in their independent silos.

It's contrived as heck, I know, but, I wanted to propose it as something the community could build and play together.

janiskulj commented 1 year ago

Hi, I'm putting together a POC for Azure App Service silo clustering, but I'm getting a very strange error. It looks like the ports get shifted:

Azure App Service (Silo) Environment Variables WEBSITE_PRIVATE_PORTS = 20144,20145

Silo registration in orleansmembershiptable (Ado.Net Clustering provider): 20146, 20147

Consequently, the client (in the same VNET) gets the wrong ports and can't connect. Any suggestions?

esandoval1 commented 1 year ago

I have two apps using Orleans, each in a different region for redundancy, and I'm having issues getting them to communicate. They are both behind the same Application Gateway. Does any setting need to be made on the Application Gateway to allow those ports? Is any special configuration needed on the NSG when the two apps are in different regions? I can tcpping each app's private IP, but I get no response when I specify a port such as 11111 or 30000.

janiskulj commented 1 year ago

My current understanding is that Azure App Service clustering (running multiple silos or a separate client) does not work. There are a few blog posts out there (and this thread) that make it look like it should, but that is just not the case. I architected my app with this assumption, but now it looks like I'll have to scale it back to a single all-in-one app (like the shopping cart example).

oising commented 1 year ago

I may be misremembering this, but it might be the case that clustering works with Linux app service plans, but not Windows (or vice-versa?)

janiskulj commented 1 year ago

I tried both. With Linux I had problems setting up private ports (there is no WEBSITE_PRIVATE_PORTS env variable there). I'm currently focusing on the Windows version.

oising commented 1 year ago

Maybe this is just some underhanded way to move people to ACA 🗡️ It used to work, now it does not. But -- never attribute to malice what can be adequately explained by incompetence, right? I assume somebody broke something.

bradygaster commented 1 year ago

@janiskulj - the App Service Windows team recently implemented a change in their networking configuration that blocks N-Web Apps-per-1-Web App Plan, so the Orleans scenario was broken. It wasn't an underhanded push to ACA, but it does change the team's approach to recommending the offerings. Since this networking change, we've been more strongly encouraging folks to use ACA or AKS for Orleans cluster host scenarios.

ReubenBond commented 9 months ago

We have a doc page now: https://learn.microsoft.com/en-us/dotnet/orleans/deployment/deploy-to-azure-app-service. Please open a new issue if you hit a specific problem.