Azure / azure-cosmos-dotnet-v3

.NET SDK for Azure Cosmos DB for the core SQL API
MIT License
723 stars 477 forks source link

DiagnosticSourceName seems to be "Azure.Cosmos" not "Azure.Cosmos.Operation" in stable version SDK #4541

Open blankor1 opened 2 weeks ago

blankor1 commented 2 weeks ago

We are continuously addressing and improving the SDK, if possible, make sure the problem persist in the latest SDK version.

Describe the bug When I use this sample code:https://github.com/Azure/azure-cosmos-dotnet-v3/blob/master/Microsoft.Azure.Cosmos.Samples/Usage/CustomDiagnosticAndEventListener/Program.cs, it works well with previous version.

But when I use stable version 3.40.0, the source name seems changed to "Azure.Cosmos". If I still use "Azure.Cosmos.Operation", I can not get the related activity in the listener.

To Reproduce Use stable version SDK not previou. Set CosmosClientOptions.DisableDistributedTracing = false. Listen to "Azure.Cosmos.Operation"

using CustomDiagnosticAndEventListener listener
    = new CustomDiagnosticAndEventListener(
        diagnosticSourceName: "Azure.Cosmos.Operation");

CosmosClientOptions options = new CosmosClientOptions()
{
    CosmosClientTelemetryOptions = new CosmosClientTelemetryOptions()
    {
        DisableDistributedTracing = false/*,
        CosmosThresholdOptions = new CosmosThresholdOptions()
        {
            PointOperationLatencyThreshold = TimeSpan.FromMilliseconds(100000),
        }*/
    }
};
using (CosmosClient client = new CosmosClient(endpoint, authKey, options))
{
    Console.WriteLine($"Getting container reference for {containerName}.");

    ContainerProperties properties = new ContainerProperties(containerName, partitionKeyPath: "/id");

    await client.CreateDatabaseIfNotExistsAsync(databaseName);
    Container container = await client.GetDatabase(databaseName).CreateContainerIfNotExistsAsync(properties);

    await Program.RunCrudDemo(container);
}
internal class CustomDiagnosticAndEventListener :
    EventListener, // Override Event Listener to capture Event source events
    IObserver<KeyValuePair<string, object>>, // Override IObserver to capture Activity events
    IObserver<DiagnosticListener>,
    IDisposable
{
    private readonly string diagnosticSourceName;
    //private readonly string eventSourceName;

    private ConcurrentBag<IDisposable>? Subscriptions = new();
    private ConcurrentBag<Activity> Activities { get; } = new();

    private readonly HashSet<string> TargetTags = new HashSet<string> { "db.operation", "db.name", "db.cosmosdb.container", "db.cosmosdb.operation_type", "db.cosmosdb.status_code", "db.cosmosdb.sub_status_code", "db.cosmosdb.request_charge" };

    public CustomDiagnosticAndEventListener(string diagnosticSourceName)
    {
        this.diagnosticSourceName = diagnosticSourceName;

        DiagnosticListener.AllListeners.Subscribe(this);
    }

    /// <summary>
    /// IObserver Override
    /// </summary>
    public void OnCompleted()
    {
        Console.WriteLine("OnCompleted");
    }

    /// <summary>
    /// IObserver Override
    /// </summary>
    public void OnError(Exception error)
    {
        Console.WriteLine($"OnError : {error}");
    }

    /// <summary>
    /// IObserver Override
    /// </summary>
    public void OnNext(KeyValuePair<string, object> value)
    {
        lock (this.Activities)
        {
            // Check for disposal
            if (this.Subscriptions == null) return;

            string startSuffix = ".Start";
            string stopSuffix = ".Stop";
            string exceptionSuffix = ".Exception";

            if (Activity.Current == null)
            {
                return;
            }

            if (value.Key.EndsWith(startSuffix))
            {
                this.Activities.Add(Activity.Current);
            }
            else if (value.Key.EndsWith(stopSuffix) || value.Key.EndsWith(exceptionSuffix))
            {
                foreach (Activity activity in this.Activities)
                {
                    if (activity.Id == Activity.Current.Id)
                    {
                        Console.WriteLine($"    Activity Name: {activity.DisplayName}");
                        Console.WriteLine($"    Activity Operation Name: {activity.OperationName}");
                        Console.WriteLine($"    Activity Time: {activity.Duration}");
                        foreach (KeyValuePair<string, string?> actualTag in activity.Tags)
                        {
                            if (this.TargetTags.Contains(actualTag.Key))
                            {
                                Console.WriteLine($"    {actualTag.Key} ==> {actualTag.Value}");
                            }
                        }
                        Console.WriteLine();
                        return;
                    }
                }
            }
        }
    }

    /// <summary>
    /// IObserver Override
    /// </summary>
    public void OnNext(DiagnosticListener value)
    {
        if (value.Name == this.diagnosticSourceName && this.Subscriptions != null)
        {
            Console.WriteLine($"CustomDiagnosticAndEventListener : OnNext : {value.Name}");
            lock (this.Activities)
            {
                this.Subscriptions?.Add(value.Subscribe(this));
            }
        }
    }

    public override void Dispose()
    {
        Console.WriteLine("CustomDiagnosticAndEventListener : Dispose");
        base.Dispose();

        if (this.Subscriptions == null)
        {
            return;
        }

        ConcurrentBag<IDisposable> subscriptions;
        lock (this.Activities)
        {
            subscriptions = this.Subscriptions;
            this.Subscriptions = null;
        }

        foreach (IDisposable subscription in subscriptions)
        {
            subscription.Dispose(); // Dispose of DiagnosticListener subscription
        }

        foreach (Activity activity in this.Activities)
        {
            activity.Dispose(); // Dispose of Activity
        }
    }
}

Expected behavior Can get the activity

Actual behavior Can't get the activity. Only when the source name be changed to "Azure.Cosmos", can I get the activity.

Environment summary SDK Version: 3.40.0 OS Version: windows

Additional context I have an additional two more questions about this feature:

  1. Is it possible that it not emit any diagnostic data in all cases? For now I see the doc says if exceed the threshold it will send diagnostic: https://github.com/Azure/azure-cosmos-dotnet-v3/blob/5994b1608bfb9aa47e925e958613e156eddd9b97/Microsoft.Azure.Cosmos/src/CosmosClientTelemetryOptions.cs#L30C9-L35C152

  2. Will there be any performance impact if we set "DisableDistributedTracing" to true?

blankor1 commented 2 weeks ago

Hi, any update for this?

Also I have two more questions for this code sample: https://github.com/Azure/azure-cosmos-dotnet-v3/blob/20d678b60d8a76ae9f551bf590b1a798c62dd72a/Microsoft.Azure.Cosmos.Samples/Usage/CustomDiagnosticAndEventListener/CustomDiagnosticAndEventListener.cs#L22C9-L23C69

  1. As it already use ConcurrentBag, why we still need to lock the Activites?
  2. The activity seems only be added not removed? https://github.com/Azure/azure-cosmos-dotnet-v3/blob/20d678b60d8a76ae9f551bf590b1a798c62dd72a/Microsoft.Azure.Cosmos.Samples/Usage/CustomDiagnosticAndEventListener/CustomDiagnosticAndEventListener.cs#L68
blankor1 commented 1 week ago

Hi, gentle ping on this. Do we have any update for this?

sourabh1007 commented 1 week ago

1. Is it possible that it not emit any diagnostic data in all cases? For now I see the doc says if exceed the threshold it will send diagnostic: https://github.com/Azure/azure-cosmos-dotnet-v3/blob/5994b1608bfb9aa47e925e958613e156eddd9b97/Microsoft.Azure.Cosmos/src/CosmosClientTelemetryOptions.cs#L30C9-L35C152

Ref. (this)[https://github.com/Azure/azure-cosmos-dotnet-v3/blob/179d9a4930dcec328c6a37745106e976bdc99a4f/Microsoft.Azure.Cosmos/src/Telemetry/OpenTelemetry/CosmosDbEventSource.cs#L38], there are 4 cases when diagnostics would be generated: a) Latency Threshold b) RequestCharge Threshold c) Payload Size Threshold d) Exceptions But you should have Event Level as "Warning" should be enabled. Default thresholds are: https://github.com/Azure/azure-cosmos-dotnet-v3/blob/179d9a4930dcec328c6a37745106e976bdc99a4f/Microsoft.Azure.Cosmos/src/CosmosThresholdOptions.cs#L13 These status codes are considered as successful status code: https://github.com/Azure/azure-cosmos-dotnet-v3/blob/179d9a4930dcec328c6a37745106e976bdc99a4f/Microsoft.Azure.Cosmos/src/Telemetry/OpenTelemetry/Filters/DiagnosticsFilterHelper.cs#L62

2. Will there be any performance impact if we set "DisableDistributedTracing" to true?

When you enable this feature, there would be minor performance impact but it should not affect you. By setting this flag as true, you are actually disabling this feature.

sourabh1007 commented 1 week ago

Hi, any update for this?

Also I have two more questions for this code sample: https://github.com/Azure/azure-cosmos-dotnet-v3/blob/20d678b60d8a76ae9f551bf590b1a798c62dd72a/Microsoft.Azure.Cosmos.Samples/Usage/CustomDiagnosticAndEventListener/CustomDiagnosticAndEventListener.cs#L22C9-L23C69

  1. As it already use ConcurrentBag, why we still need to lock the Activites?
  2. The activity seems only be added not removed? https://github.com/Azure/azure-cosmos-dotnet-v3/blob/20d678b60d8a76ae9f551bf590b1a798c62dd72a/Microsoft.Azure.Cosmos.Samples/Usage/CustomDiagnosticAndEventListener/CustomDiagnosticAndEventListener.cs#L68

This code is for demonstration purpose only, just for reference purpose, you can do whatever suits you in your application. a) Yes, you are right, you can either use ConcurrentBag or lock the Activities. b) It totally depends on your requirement, if you want to collect the activities and use it in certain way (or remove it)

sourabh1007 commented 1 week ago

But when I use stable version 3.40.0, the source name seems changed to "Azure.Cosmos". If I still use "Azure.Cosmos.Operation", I can not get the related activity in the listener. Yes, you are right, I am looking into this, I will update you shortly on this.