dotnet / orleans

Cloud Native application framework for .NET
https://docs.microsoft.com/dotnet/orleans
MIT License
10.08k stars 2.03k forks source link

Add SQL Server section to Typical Configurations #3329

Closed Cronan closed 7 years ago

Cronan commented 7 years ago

In the documentation section Typical Configurations, there is no section detailing SQL Server client and server configurations.

What would this look like?

sergeybykov commented 7 years ago

The whole Configuration section of the docs needs to be rewritten to emphasize programmatic config and move away from promoting XML config. In 2.0 it will take another step forward - to SiloBuilder.

Cronan commented 7 years ago

I'm doing it all programmatically, but am not sure what the SQL version would look like - I could rewrite the doc to do the same things with config - but SQL would still be a mystery :-)

On 21 Aug 2017, at 19:24, Sergey Bykov notifications@github.com wrote:

The whole Configuration section of the docs needs to be rewritten to emphasize programmatic config and move away from promoting XML config. In 2.0 it will take another step forward - to SiloBuilder.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or mute the thread.

Cronan commented 7 years ago

I managed to get it working with the following host config - am I doing anything wrong here, maybe missing some helpers?

var dataConnection =
    @"Data Source=MyDbServer;Initial Catalog=OrleansTest;Integrated Security=True;Pooling=False;Max Pool Size=200;Asynchronous Processing=True;MultipleActiveResultSets=True";

var hostName = Dns.GetHostName();
var hostEntry = Dns.GetHostEntry(hostName);
IPAddress address;
if (hostEntry.AddressList.Length > 0)
    address = hostEntry.AddressList[0];
else
    throw new Exception("Could not get local address");

var config = new ClusterConfiguration
{
    Globals =
    {
        DataConnectionString = dataConnection,
        DeploymentId = "OrleansTest",
        ResponseTimeout = TimeSpan.FromSeconds(30),

        LivenessType = GlobalConfiguration.LivenessProviderType.SqlServer,
        LivenessEnabled = true,
        ReminderServiceType = GlobalConfiguration.ReminderServiceProviderType.SqlServer,

    },
    Defaults =
    {
        HostNameOrIPAddress = address.ToString(),
        Port = 11111,
        ProxyGatewayEndpoint = new IPEndPoint(address, 30000),
        PropagateActivityId = true,

        DefaultTraceLevel = Severity.Info,
        TraceToConsole = false,
        TraceFilePattern = @"Silo_{0}-{1}.log"
    }
};

config.Globals.RegisterStorageProvider<AdoNetStorageProvider>("OrleansSqlStore", new Dictionary<string, string>()
{
    ["AdoInvariant"] = "System.Data.SqlClient",
    ["DataConnectionString"] = dataConnection,
    ["UseJsonFormat"] = "true"
});

_siloHost = new SiloHost(_siloName, config);

and the following client code:

var dataConnection =
    @"Data Source=MyDbServer;Initial Catalog=OrleansTest;Integrated Security=True;Pooling=False;Max Pool Size=200;Asynchronous Processing=True;MultipleActiveResultSets=True";

var config = new ClientConfiguration
{
    GatewayProvider = ClientConfiguration.GatewayProviderType.SqlServer,
    AdoInvariant = "System.Data.SqlClient",
    DataConnectionString = dataConnection,
    ResponseTimeout = TimeSpan.FromSeconds(30),

    DeploymentId = "OrleansTest",
    PropagateActivityId = true,

    DefaultTraceLevel = Severity.Info,
    TraceToConsole = false,
    TraceFilePattern = @"Client_{0}-{1}.log",

};
var client = new ClientBuilder().UseConfiguration(config).Build();
await client.Connect();
sergeybykov commented 7 years ago

This looks correct to me. It's not clear to me if you need to handle host name and IP address yourself instead of relying on the default behavior when address isn't set. Setting ResponseTimeout to the default value seems superfluous, too. But I don't see anything wrong.

If you are willing to contribute an update to the doc, that will be highly appreciated.

veikkoeeva commented 7 years ago

@Cronan A good example to be had sure. A few tips Pooling=False is problematic for performance in general and for Orleans in particular since Orleans uses connections in using(...) { ... } and naturally it's quite a bit faster to keep the connections alive on and rely on pooling (more here). I would advice removing it and maybe explicitly advice against using it.

Another point to make here is that since the pools are created and connections pooled to the respective pools by using the connection string as the key, it's sensible to potentially high-frequency operations, such as using ADO.NET storage, to a separate pool and set its Min Pool Size and Max Pool Size (or let maximum be undefined) to something than the minimum default of 0 so as to have it primed and ready.

Additional tip in this regard, and useful in general, is to set the Application Name to distinguish the various connection strings. in the database. Then maybe in the future in "advanced debugging tips and tricks" we could have some SQL queries that show how to get slow queries etc.

There are some DB vendor specific tips too, such as with PostgreSQL using pieces such as PgBouncer and setting server side pooling limits, but maybe best treated in the future in some advanced section too.

Ah, and yet more... Maybe worth nothing that though the script "just works", one might actually want to set up custom schemas etc. to which to deploy the database with advanced features such as security lockdowns -- and this happens by modifying the script a bit to suite one's needs. Maybe to advanced section also?

Cronan commented 7 years ago

@sergeybykov I handled the address myself because I couldn't work out how to set Defaults.ProxyGatewayEndpoint (IPEndPoint) with only a port - so I didn't want the null string logic for Defaults.HostNameOrIPAddress to (potentially) get a different address. Am I missing something?

Cronan commented 7 years ago

@veikkoeeva I found Pooling=False here and just copied it.

Your point about using different (potentially pre-primed) pools for things like storage is a good one, as is the Application Name note. Thank you.

I don't totally understand the note about custom schemas.

Generally, I'm curious about the move away from config files - my production pattern is to build the code in something like TeamCity, then use Octopus for deployment management, swapping out config files based on environment. Obviously I can do this still, but it means setting up my own custom config, and writing multiple code paths to support the options I need.

I'm sure there is something philosophical behind this, but I don't understand it right now.

sergeybykov commented 7 years ago

@Cronan I suggested in #3337 to take the address handling out of it to minimize the potential confusion.

Defaults.HostNameOrIPAddress to (potentially) get a different address. Am I missing something?

This setting allows for using different IP addresses for client gateway and silo-to-silo connection. Although I see your point that it's somewhat confusing for the most typical case of a single IP address. We should address this as part of the SiloBuilder effort I think. /cc @ReubenBond

veikkoeeva commented 7 years ago

@Cronan I wrote about schemas uncompletely and left out other equally important things, so perhaps that's why it's difficult to follow. A new try:

For production loads, there is at least two other important considerations

Filegroups

Schemas

One example of filegroups with Orleans is to store Storage table to a different filegroups so that if there is a hardware failure, corruption case or something else that requires disaster recovery, one can get the membership protocol backup online within seconds even if state storage is being restored. It may be also smart to notice one can use multiple filegroups for the Storage table, modify the INSERT, UPDATE and SELECT to work with multiple tables split to different filegroups, some of which might reside in fast SSD, be encrypted or somesuch division. But even without these kinds of considerations, it could very well make sense to put state data to a separate filegroup.

Schemas can be used to enforce security. For instance, disallow everyone expect the Orleans process (when running Orleans with a user rights of its own, derived e.g. from AD) and DB Owner cannot modify the membership or stream data and only Orleans and some specific batch jobs can modify Storage. Similarily to the previous point about filegroups, some data could be protected with extra isolation by separating it to schema defined in a different way.

These scenarios require a bit modifying the default script, but are a consequence of the fact that Orleans cares only about the parameter types and names when operating with the database (and in case state is streamed, also the order of parameters as it's mandated by the TDS protocol). Anyway, some of this is explained (although not throughly) in the current instructions too.

Cronan commented 7 years ago

@sergeybykov Thanks, I just responded in the pull request #3337

Cronan commented 7 years ago

@veikkoeeva Very interesting, I'll run this past our DBAs. I think we already do something similar to deal with massive tables and DR.