dotnet / SqlClient

Microsoft.Data.SqlClient provides database connectivity to SQL Server for .NET applications.
MIT License
859 stars 288 forks source link

Add new `SqlCommand` Tag property for arbitrary user-provided data #2931

Closed perlun closed 3 weeks ago

perlun commented 3 weeks ago

Extracted from https://github.com/dotnet/SqlClient/issues/2210#issuecomment-2436985362, per request:

Semi-related to this, but the SqlCommand class (both the deprecated System.Data.SqlClient.SqlCommand and the now-relevant Microsoft.Data.SqlClient.SqlCommand) are a bit hard to use with OpenTelemetry in the sense that it seems virtually impossible to add a 3rd party "query name" tag to the instantiated SqlCommand. 🤔 This kind of information would be incredibly useful to be able to include in the distributed tracing data, to distinguish different queries from each other.

The reasons why I claim this is impossible:

If either one of these options were available, you could then easily use the OpenTelemetry.Instrumentation.SqlClient-provided Enrich() method, to set arbitrary tags in the tracing data being submitted. We tried setting a dummy Parameter with a user-defined name of the query, but unfortunately this can cause undesired breakage when running certain SQL queries.


Sorry for hijacking the issue quite a bit (...) but I guess what I'm stretching for is:

  1. Does anyone (perhaps you, @cijothomas?) have any ideas on how to workaround this short-term? I thought about a Java:ish approach of using a thread-local variable for this, but unsure if we can guarantee that the Enrich() lambda runs on the same thread as the code creating the SqlCommand in the first place, because of the async nature of modern .NET code.
  2. When investigating this issue in the long run, please take this into consideration ("make it possible to provide a user-defined query name") since it'll make the tracing much more valuable and useful.
perlun commented 3 weeks ago

Suggestions were provided in https://github.com/dotnet/SqlClient/issues/2210#issuecomment-2437058894 and https://github.com/dotnet/SqlClient/issues/2210#issuecomment-2437204832:

@perlun, ConditionalWeakTable<SqlCommand, string> would work, but I don't know how much performance that would cost.

It would work but because SqlCommand are short lifetime objects (usually) it would cause a lot of traffic and be perf concern as you point out. Adding an identifier property (call it Tag?) to a command isn't a difficult thing to do and would likely be approved. The problem is getting it into the diagnostic output or waiting for the full implementation of the open telemetry

Here is my reply:

Thanks @KalleOlaviNiemitalo, appreciated! 🙏

It would work but because SqlCommand are short lifetime objects (usually) it would cause a lot of traffic and be perf concern as you point out.

It (ConditionalWeakTable) could actually be an option in our case, despite not a perfect solution. Might give this a try as an interim solution at least.

Adding an identifier property (call it Tag?) to a command isn't a difficult thing to do and would likely be approved.

Thank you for that @Wraith2 - let's give it a try: https://github.com/dotnet/SqlClient/pull/2925

The problem is getting it into the diagnostic output or waiting for the full implementation of the open telemetry

That's actually not a problem, at least with the current approach taken by the OpenTelemetry.Instrumentation.SqlClient-tracing-related code, because it's Enrich-property is defined like this. 👇

    /// <summary>
    /// Gets or sets an action to enrich an <see cref="Activity"/> with the
    /// raw <c>SqlCommand</c> object.
    /// </summary>
    /// <remarks>
    /// <para><b>Enrich is only executed on .NET and .NET Core
    /// runtimes.</b></para>
    /// The parameters passed to the enrich action are:
    /// <list type="number">
    /// <item>The <see cref="Activity"/> being enriched.</item>
    /// <item>The name of the event. Currently only <c>"OnCustom"</c> is
    /// used but more events may be added in the future.</item>
    /// <item>The raw <c>SqlCommand</c> object from which additional
    /// information can be extracted to enrich the <see
    /// cref="Activity"/>.</item>
    /// </list>
    /// </remarks>
    public Action<Activity, string, object>? Enrich { get; set; }

Since we have access to the raw SqlCommand there, it's trivial to then "pass it on" into the Activity object which gets sent to the OpenTelemetry collector:

var cmd = (SqlCommand)command;
activity.SetTag("query.name", cmd.Tag); // Depending on the new property not yet available in SqlCommand

Of course, for the full (SqlClient-"native") OpenTelemetry implementation, things might be a completely different beast, I'm just talking about "how could we make this work at all right now" scenario right now.

perlun commented 3 weeks ago

(Thanks for suggesting the AsyncLocal approach, @vonzshik. This worked quite nicely for our use case for now. 🌟 But getting a proper field in SqlCommand will be even nicer of course, if it turns out to be doable)

roji commented 3 weeks ago

I actually think @vonzshik's AsyncLocal suggestion is probably the right way forward here, rather than adding stuff on SqlCommand. A single field wouldn't be sufficient (what if I want to add two tags), and there should be a high bar to adding public surface to a type such as SqlCommand...

KalleOlaviNiemitalo commented 3 weeks ago

I had doubts about whether SqlClient reliably flows ExecutionContext to all methods in which it reports events to a DiagnosticSource; but apparently this is implemented by using Task.ContinueWith. I didn't find any specific tests for that, though.

roji commented 3 weeks ago

@KalleOlaviNiemitalo sure, though if there's an issue on that front, that should simply be fixed, rather than introducing another mechanism for flowing tags to tracing.

benrr101 commented 3 weeks ago

Thanks @perlun for raising this suggestion - I can see where there might be value in opening up tagging information for telemetry purposes. We are currently working on migration of telemetry to OpenTelemetry and want to make sure that this request gets properly integrated into that work. We are also curious what approach npgsql might be taking so we can follow suit (@roji).

Since we want to get a good strategy worked out before implementing it, we're going to move this to a discussion for now.