Azure / azure-cosmos-dotnet-v3

.NET SDK for Azure Cosmos DB for the core SQL API
MIT License

Get endpoint that was used in a query/operation #2333

Status: Closed (sebader closed this issue 3 years ago)

sebader commented 3 years ago

Is your feature request related to a problem? Please describe.

I'm using DirectMode to connect to a multi-master Cosmos DB account and am using ApplicationRegion to connect to the closest instance. That works well. However, because DirectMode has very limited automatic tracing with Application Insights, I'm adding my own instrumentation like this:

        public async Task<Claim> GetClaimAsync(string claimId, string partitionkey)
        {
            var startTime = DateTime.UtcNow;
            ResponseMessage responseMessage = null;
            var success = false;
            try
            {
                // Read the item as a stream.
                using (responseMessage = await this._claimsContainer.ReadItemStreamAsync(
                    partitionKey: new PartitionKey(partitionkey),
                    id: claimId))
                {
                    if (responseMessage.IsSuccessStatusCode)
                    {
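                        Claim claim = FromStream<Claim>(responseMessage.Content);
                        // Mark the call as successful so the telemetry below does not always report a failure.
                        success = true;
                        return claim;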
                    }
                    else if (responseMessage.StatusCode == HttpStatusCode.NotFound)
                    {
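                        // No Claim found for the id/partitionkey.
                        // The request itself completed, so it is not counted as a failed dependency call.
                        success = true;
                        return null;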
                    }
                    else
                    {
                        throw new Exception($"Exception on GetClaimAsync query. Code={responseMessage.StatusCode}");
                    }
                }
            }
            finally
            {
                var telemetry = new DependencyTelemetry()
                {
                    Type = "Azure DocumentDB",
                    Data = $"ClaimId={claimId}, Partitionkey={partitionkey}",
                    Name = "Query GetClaim",
                    Timestamp = startTime,
                    Duration = responseMessage != null ? responseMessage.Diagnostics.GetClientElapsedTime() : DateTime.UtcNow - startTime,
                    Target = _dbClient.Endpoint.Host,
                    Success = success
                };
                if (responseMessage != null)
                    telemetry.Metrics.Add("CosmosDbRequestUnits", responseMessage.Headers.RequestCharge);
                _telemetryClient.TrackDependency(telemetry);
            }
        }

The problem is that _dbClient.Endpoint.Host always returns the main endpoint of the database, not the regional one that was actually used. So instead of e.g. https://mydb-eastus2.documents.azure.com:443/ or https://mydb-northeurope.documents.azure.com:443/ this just gives me https://mydb.documents.azure.com:443/. This of course skews my monitoring.

Debugging into this, I can see that, in the above example, _claimsContainer has an internal property ClientContext.DocumentClient, which seems to have the information I'm looking for in ReadEndpoint and WriteEndpoint.

Is there any way to get this at runtime?

Describe the solution you'd like

For proper tracing I need to know which endpoints a query actually hit.

Describe alternatives you've considered

Using reflection...?!

j82w commented 3 years ago

@sebader can you try the latest 3.18.0-preview, which contains a new API, GetContactedRegions, on CosmosDiagnostics (https://github.com/Azure/azure-cosmos-dotnet-v3/pull/2312)? Please let us know if it doesn't solve your issue.
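For readers who want to plug this into the telemetry code from the original question, here is a minimal sketch. It assumes GetContactedRegions returns a list of (regionName, uri) tuples, as it does in the SDK releases I am aware of. The class name ContactedRegionsFormatter and the method ToTarget are hypothetical helpers, not part of the SDK.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ContactedRegionsFormatter
{
    // Hypothetical helper: builds a telemetry Target string from the regional
    // endpoints reported by CosmosDiagnostics.GetContactedRegions().
    // Falls back to the account-level host when no region information exists,
    // for example when the request failed before a response was received.
    public static string ToTarget(
        IReadOnlyList<(string regionName, Uri uri)> contactedRegions,
        string fallbackHost)
    {
        if (contactedRegions == null || contactedRegions.Count == 0)
        {
            return fallbackHost;
        }

        // More than one entry means the SDK contacted several regions for this operation.
        return string.Join(", ", contactedRegions.Select(region => region.uri.Host));
    }
}
```

In the finally block of GetClaimAsync, the Target assignment could then become something like `Target = ContactedRegionsFormatter.ToTarget(responseMessage?.Diagnostics.GetContactedRegions(), _dbClient.Endpoint.Host)`.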

sebader commented 3 years ago

That looks exactly like what I’m looking for :) will give it a try tomorrow

sebader commented 3 years ago

thanks @j82w this works like a charm! :)

Can you explain a bit more about when there would be more than one contacted region in the list? I assume that would mean it tried one region (how many times?) before it tried the next one?

j82w commented 3 years ago

Please take a look at this documentation and make sure you have the client configured properly. By default it will only try the primary region.

https://docs.microsoft.com/en-us/azure/cosmos-db/troubleshoot-sdk-availability
https://docs.microsoft.com/en-us/azure/cosmos-db/troubleshoot-dot-net-sdk?tabs=diagnostics-v3#retry-logic-

There are many different scenarios that can cause it to retry on another region. The number of retries varies depending on the exception that is hit. For example, if the SDK gets a 503 when attempting to read an item, it will try the secondary region, assuming the SDK is properly configured. The ClientRetry policy handles most of the cross-region failover logic.
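As an illustration of what a regional preference might look like, here is a minimal configuration sketch. The region choices mirror the endpoints mentioned in the original question, and the endpoint and key strings are placeholders. ApplicationPreferredRegions is an alternative to ApplicationRegion, so set one or the other. The linked availability documentation remains the authoritative guide.

```csharp
using System.Collections.Generic;
using Microsoft.Azure.Cosmos;

// Assumption: East US 2 is the closest region and North Europe is the fallback.
CosmosClientOptions options = new CosmosClientOptions
{
    ConnectionMode = ConnectionMode.Direct,
    // Ordered list of regions the SDK should prefer for requests.
    ApplicationPreferredRegions = new List<string> { Regions.EastUS2, Regions.NorthEurope },
};

// Placeholder values: replace with the real account endpoint and key.
CosmosClient client = new CosmosClient(
    "https://<account-name>.documents.azure.com:443/",
    "<account-key>",
    options);
```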