Azure / azure-cosmos-dotnet-v3

.NET SDK for Azure Cosmos DB for the core SQL API
MIT License
741 stars 494 forks

[Per Partition Automatic Failover] Utilize `CosmosClientOptions` to Capture Custom Domain Names to resolve Cx Specified Endpoints #4236

Closed kundadebdatta closed 9 months ago

kundadebdatta commented 10 months ago

Problem Statement:

In the real world, there are many reasons an endpoint could become non-responsive and stop fulfilling incoming requests, for example network packet drops, issues on the server node, or a larger outage. Today, during initialization, the .NET v3 SDK has to fetch the account metadata from the routing gateway using the global account endpoint. This information is needed to determine the read and write regions, the resource identifiers, the ETag, etc., which the SDK needs to perform read/write operations. The global account endpoint is passed through the CosmosClient constructor (see the example below).

CosmosClientOptions clientOptions = new CosmosClientOptions()
{
    ApplicationPreferredRegions = new List<string>()
    {
        Regions.NorthCentralUS,
        Regions.EastAsia,
    },
    EnablePartitionLevelFailover = true,
    ConnectionMode = ConnectionMode.Direct,
};

CosmosClient client = new CosmosClient(
    "https://testaccount.documents-test.windows-int.net:443/",
    "key==",
    clientOptions
);

However, if the global account endpoint becomes non-responsive for some unforeseen reason, there is currently no way to fetch the account metadata, so CosmosClient initialization fails.

Proposed Solution:

The above problem could be solved if the global account metadata information is hosted within a private domain name. During an outage, the custom domain names can be used to route the Get Account metadata requests to the custom endpoints if the primary global account endpoint becomes non-responsive.

Below is an example of how the SDK will capture the regional endpoints from the end user.

CosmosClientOptions clientOptions = new CosmosClientOptions()
{
    ApplicationPreferredRegions = new List<string>()
    {
        Regions.P1,
        Regions.P2,
        Regions.P3,
    },
    RegionalEndpoints = new List<string>()
    {
        "custom.p-1.documents.azure.com",
        "custom.p-2.documents.azure.com",
    },
    EnablePartitionLevelFailover = true,
};
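
The fallback described in the proposed solution could be sketched roughly as follows. This is only an illustration of the idea, not the SDK's implementation: `AccountMetadataFallback`, `FetchWithFallbackAsync`, and the `fetchMetadata` delegate are hypothetical names, and the delegate stands in for the real Get Account call. It assumes the global endpoint is tried first and the custom endpoints are then tried in the order supplied.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper, not part of the Cosmos DB SDK: illustrates endpoint fallback only.
public static class AccountMetadataFallback
{
    // Tries the global endpoint first, then each custom endpoint in the order given.
    // `fetchMetadata` stands in for the real Get Account call and is expected to throw on failure.
    public static async Task<T> FetchWithFallbackAsync<T>(
        Uri globalEndpoint,
        IEnumerable<Uri> customEndpoints,
        Func<Uri, Task<T>> fetchMetadata)
    {
        var candidates = new List<Uri> { globalEndpoint };
        candidates.AddRange(customEndpoints);

        var failures = new List<Exception>();
        foreach (Uri endpoint in candidates)
        {
            try
            {
                // The first endpoint that answers wins.
                return await fetchMetadata(endpoint);
            }
            catch (Exception ex)
            {
                // Simplification: every failure is treated as "try the next endpoint".
                failures.Add(ex);
            }
        }

        // Every candidate failed; surface all collected errors to the caller.
        throw new AggregateException("No endpoint returned account metadata.", failures);
    }
}
```

A real implementation would likely catch only transient failures (timeouts, connection errors) and let cancellation or authentication errors propagate immediately.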

Acceptance Criteria:

Notes:

kirankumarkolli commented 9 months ago

RegionalEndpoints: a few clarifications
- Order matching PreferredRegions?