open-telemetry / opentelemetry-dotnet

The OpenTelemetry .NET Client
https://opentelemetry.io
Apache License 2.0

[bug] Sampler does not work - passes all traces #5849

Closed ep-workai closed 1 month ago

ep-workai commented 1 month ago

Package

OpenTelemetry

Package Version

    <PackageVersion Include="OpenTelemetry.Exporter.Console" Version="1.9.0" />
    <PackageVersion Include="OpenTelemetry.Exporter.OpenTelemetryProtocol" Version="1.9.0" />
    <PackageVersion Include="OpenTelemetry.Extensions.Hosting" Version="1.9.0" />
    <PackageVersion Include="OpenTelemetry.Instrumentation.AspNet" Version="1.8.0-beta.3" />
    <PackageVersion Include="OpenTelemetry.Instrumentation.SqlClient" Version="1.9.0-beta.1" />
    <PackageVersion Include="OpenTelemetry.Instrumentation.Http" Version="1.9.0" />
    <PackageVersion Include="OpenTelemetry.Instrumentation.AspNet.TelemetryHttpModule" Version="1.9.0-beta.1" />
    <PackageVersion Include="OpenTelemetry.Instrumentation.Hangfire" Version="1.6.0-beta.1" />
    <PackageVersion Include="OpenTelemetry.Instrumentation.EntityFrameworkCore" Version="1.0.0-beta.12" />
    <PackageVersion Include="Azure.Monitor.OpenTelemetry.AspNetCore" Version="1.3.0-beta.1" />
    <PackageVersion Include="Azure.Monitor.OpenTelemetry.Exporter" Version="1.4.0-beta.1" />

Runtime Version

net8.0

Description

When a sampler is included in the pipeline, it should pass only traces that are sampled (SamplingDecision.RecordAndSample).

In my case, even when the sampler returns an explicit drop decision, traces are sent to the exporter and then to storage. I've tried several exporters: Zipkin, console, and OTLP (to Grafana Tempo). With both a custom sampler and AlwaysOffSampler, traces are always sent at 100%.

Steps to Reproduce

  1. Create a minimal example:

        AddDebugTraceProcessor(myAppBuilder, out var debugTraceProcessor);

        TracerProviderBuilder tracing = Sdk.CreateTracerProviderBuilder();

        tracing.AddSource([serviceInfo.Name, .. myAppBuilder.ActivitySources])
            .SetSampler(new AlwaysOffSampler());

        tracing.AddProcessor<TailSamplingProcessor>();

        if (myAppBuilder.Services.Any(
                serviceDescriptor => serviceDescriptor.ServiceType == typeof(DebugTraceProcessor)))
        {
            tracing.AddProcessor(debugTraceProcessor);
        }

        tracing
            .AddAspNetCoreInstrumentation(
                opt =>
                {
                    opt.RecordException = true;
                    opt.Filter = AspNetCoreFilter;
                })
            .AddHttpClientInstrumentation(x => x.RecordException = true)
            .AddEntityFrameworkCoreInstrumentation()
            .AddSqlClientInstrumentation(
                x =>
                {
                    x.RecordException = true;
                    x.SetDbStatementForText = true;
                    x.SetDbStatementForStoredProcedure = true;
                })
            .AddHangfireInstrumentation(x => x.RecordException = true)
            .SetErrorStatusOnException()
            .AddConsoleExporter()
            .AddOtlpExporter(
                x =>
                {
                    x.Protocol = OtlpExportProtocol.HttpProtobuf;
                    x.Endpoint = new Uri("http://localhost:4318/v1/traces");
                    x.ExportProcessorType = ExportProcessorType.Simple;
                });

        return tracing.Build();

Expected Result

Not all traces should be sent when the sampling ratio is below 1.0, and none when sampling is turned off (0.0).

Actual Result

All traces are sent, ignoring the sampler.

Additional Context

A breakpoint in the sampler is hit and the decision is set to Drop, but the traces are still sent.
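One way to confirm what the SDK actually asks the sampler, without a breakpoint, is to wrap the sampler and log every decision. This is a hypothetical debugging aid, not part of the repro; `LoggingSampler` is a name introduced here for illustration:

```csharp
using System.Diagnostics;
using OpenTelemetry.Trace;

// Wraps any sampler and logs each decision, to verify that the sampler
// configured in the pipeline is the one actually being consulted.
public sealed class LoggingSampler : Sampler
{
    private readonly Sampler inner;

    public LoggingSampler(Sampler inner) => this.inner = inner;

    public override SamplingResult ShouldSample(in SamplingParameters samplingParameters)
    {
        SamplingResult result = this.inner.ShouldSample(in samplingParameters);
        Debug.WriteLine($"Sampler decision for '{samplingParameters.Name}': {result.Decision}");
        return result;
    }
}
```

It would be used as `.SetSampler(new LoggingSampler(new AlwaysOffSampler()))`. If `Drop` is logged but spans still reach the exporter, something later in the pipeline is replacing or bypassing the sampler.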

cijothomas commented 1 month ago

Can you share a minimal reproducible example? You can take the hello-world example https://github.com/open-telemetry/opentelemetry-dotnet/blob/main/docs/trace/getting-started-console/Program.cs and change its sampler to AlwaysOff. Without that, it is hard to investigate, as you have a lot of things enabled/configured.
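The suggested check could look roughly like this (a sketch based on the linked getting-started example, with only the sampler changed; the source name `MyCompany.MyProduct.MyLibrary` is the one used there):

```csharp
using System.Diagnostics;
using OpenTelemetry;
using OpenTelemetry.Trace;

internal static class Program
{
    private static readonly ActivitySource MyActivitySource = new("MyCompany.MyProduct.MyLibrary");

    public static void Main()
    {
        // Minimal setup: one source, AlwaysOffSampler, console exporter.
        using var tracerProvider = Sdk.CreateTracerProviderBuilder()
            .AddSource("MyCompany.MyProduct.MyLibrary")
            .SetSampler(new AlwaysOffSampler())
            .AddConsoleExporter()
            .Build();

        // With AlwaysOffSampler the activity may still be created for context
        // propagation, but it is not recorded, so nothing should reach the
        // console exporter. If a span is printed, the sampler is being ignored.
        using var activity = MyActivitySource.StartActivity("SayHello");
        activity?.SetTag("foo", 1);
    }
}
```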

ep-workai commented 1 month ago

The code in that example works as expected. In my setup I've now removed all instrumentations and processors, yet activities still appear in my console with AlwaysOffSampler.

The same happens with both Sdk-style and Microsoft DI extension-style configuration.

cijothomas commented 1 month ago

Can you share a minimal repro, either using SDK style or DI style? Specifically, the Program.cs file and the csproj (or a list of the name and version of each package required), showing the absolute minimum needed to reproduce the issue.

It'd be a P0 bug if sampled-out activities appeared in the console exporter (or any exporter!). Pretty sure there is something else going on, and we need a repro to spot what that is!

ep-workai commented 1 month ago

Here it is. As I've mentioned, removing all processors, instrumentations, etc. didn't help.

The extension is called in the main program class. I've already removed the Microsoft exporter, because with it the breakpoint didn't even hit the sampler method.

using Azure.Core;
using Azure.Core.Pipeline;
using Azure.Monitor.OpenTelemetry.AspNetCore;
using Azure.Monitor.OpenTelemetry.Exporter;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;
using OpenTelemetry;
using OpenTelemetry.Exporter;
using OpenTelemetry.Logs;
using OpenTelemetry.Metrics;
using OpenTelemetry.Resources;
using OpenTelemetry.Trace;
using my.Logging;
using my.Logging.Filters;
using my.Logging.Processors;
using my.Shared.Configuration;

namespace my.Shared.Extensions;

public static partial class ImyAppBuilderExtensions
{
    [KeepReference]
    private static TracerProvider _tracerProvider;

    private static void AddOpenTelemetry(ImyAppBuilder @this,
        (string Name, Version Version, string InstanceName, string Namespace) serviceInfo)
    {
        var sampling = @this.Configuration.GetValue<double?>(LoggingConfiguration.SamplingKey);

        _tracerProvider = CreateTraceProvider(@this, serviceInfo, sampling);

        OpenTelemetryBuilder openTelemetryBuilder = @this.Services.AddOpenTelemetry()
            .ConfigureResource(
                resource => resource.Clear()
                    .AddService(
                        serviceInfo.Name,
                        serviceInfo.Namespace,
                        serviceInfo.Version.ToString(),
                        false,
                        serviceInfo.InstanceName));

        if (@this.Configuration.GetValue<bool>(LoggingConfiguration.MetricsEnabled))
        {
            openTelemetryBuilder.WithMetrics(
                metrics => metrics
                    .AddMeter(serviceInfo.Name)
                    .AddHttpClientInstrumentation()
                    .AddAspNetCoreInstrumentation()
                    .AddConsoleExporter()
                    .AddAzureMonitorMetricExporter());
        }

        openTelemetryBuilder.UseAzureMonitor(
            x =>
                x.ConnectionString =
                    @this.Configuration.GetValue<string>(LoggingConfiguration.AzureMonitorConnectionStringKey));

        AddDebugLogProcessor(@this, out var debugLogProcessor);

        @this.LoggingBuilder.AddOpenTelemetry(
                options =>
                {
                    options.IncludeScopes = true;
                    options.IncludeFormattedMessage = true;
                    options.ParseStateValues = true;

                    options
                        .SetResourceBuilder(
                            ResourceBuilder.CreateEmpty()
                                .AddService(
                                    serviceInfo.Name,
                                    serviceInfo.Namespace,
                                    serviceInfo.Version.ToString(),
                                    false,
                                    serviceInfo.InstanceName))
                        .AddProcessor(new SensitiveDataEraserProcessor());

                    if (@this.Services.Any(
                            serviceDescriptor => serviceDescriptor.ServiceType == typeof(DebugLogProcessor)))
                    {
                        options.AddProcessor(debugLogProcessor);
                    }

                    options
                        .AddConsoleExporter()
                        .AddAzureMonitorLogExporter(
                            y => y.ConnectionString =
                                @this.Configuration.GetValue<string>(
                                    LoggingConfiguration.AzureMonitorConnectionStringKey));
                })
            .AddFilter(new GlobalLogFilter().Filter);
    }

    private static TracerProvider CreateTraceProvider(ImyAppBuilder myAppBuilder,
        (string Name, Version Version, string InstanceName, string Namespace) serviceInfo,
        double? sampling)
    {
        AddDebugTraceProcessor(myAppBuilder, out var debugTraceProcessor);

        TracerProviderBuilder tracing = Sdk.CreateTracerProviderBuilder();

        tracing.AddSource([serviceInfo.Name, .. myAppBuilder.ActivitySources])
            .SetSampler(new AlwaysOffSampler());

        tracing.AddProcessor<TailSamplingProcessor>(); 

        if (myAppBuilder.Services.Any(
                serviceDescriptor => serviceDescriptor.ServiceType == typeof(DebugTraceProcessor)))
        {
            tracing.AddProcessor(debugTraceProcessor);
        }

        tracing
            .AddAspNetCoreInstrumentation(
                opt =>
                {
                    opt.RecordException = true;
                    opt.Filter = AspNetCoreFilter;
                })
            .AddHttpClientInstrumentation(x => x.RecordException = true)
            .AddEntityFrameworkCoreInstrumentation()
            .AddSqlClientInstrumentation(
                x =>
                {
                    x.RecordException = true;
                    x.SetDbStatementForText = true;
                    x.SetDbStatementForStoredProcedure = true;
                })
            .AddHangfireInstrumentation(x => x.RecordException = true)
            .SetErrorStatusOnException()
            .AddConsoleExporter()
            .AddOtlpExporter(
                x =>
                {
                    x.Protocol = OtlpExportProtocol.HttpProtobuf;
                    x.Endpoint = new Uri("http://localhost:4318/v1/traces");
                    x.ExportProcessorType = ExportProcessorType.Simple;
                });

        return tracing.Build();

        void ConfigureAzureMonitorExporter(AzureMonitorExporterOptions options)
        {
            options.Diagnostics.IsDistributedTracingEnabled = true;

            options.ConnectionString = 
                myAppBuilder.Configuration.GetValue<string>(
                    LoggingConfiguration.AzureMonitorConnectionStringKey);

            options.RetryPolicy = new RetryPolicy(
                6,
                DelayStrategy.CreateExponentialDelayStrategy(
                    TimeSpan.FromMilliseconds(50),
                    TimeSpan.FromSeconds(30)));
        }
    }

    private static void AddDebugLogProcessor(ImyAppBuilder @this,
        out DebugLogProcessor? debugLogProcessor)
    {
        if (@this.Environment.IsDevelopment())
        {
            debugLogProcessor = new DebugLogProcessor();
            @this.Services.AddSingleton(debugLogProcessor);
        }
        else
        {
            debugLogProcessor = null;
        }
    }

    private static void AddDebugTraceProcessor(ImyAppBuilder @this,
        out DebugTraceProcessor? debugTraceProcessor)
    {
        if (@this.Environment.IsDevelopment())
        {
            debugTraceProcessor = new DebugTraceProcessor();
            @this.Services.AddSingleton(debugTraceProcessor);
        }
        else
        {
            debugTraceProcessor = null;
        }
    }

    private static bool AspNetCoreFilter(HttpContext context)
    {
        if (!context.Request.Path.HasValue || context.Request.Path == PathString.Empty)
        {
            return true;
        }

        // ignore health-checks
        if (context.Request.Path.Value.Contains($"/{HealthCheckConfiguration.UrlPath}/"))
        {
            return false;
        }

        // ignore signalr hub requests
        if (((string[]) ["signalr", "notificationsHub", "socialHub"]).Any(
                x => context.Request.Path.Value.Contains(x)))
        {
            return false;
        }

        return true;
    }
}

Main

using System.Reflection;
using Microsoft.EntityFrameworkCore;
using my.Logging;
using my.Logging.Poc;
using my.Shared.Extensions;

var builder = WebApplication.CreateBuilder(args);
var myBuilder = builder.CreatemyBuilder();

myBuilder
    .AddAssemblies(
    [
        Assembly.GetAssembly(typeof(TestController))
    ]) // optional to provide external assemblies plugins for configuration, should be first
    .AddCore()
    .AddLogging()
    .AddDependencyInjection()
    .AddAuthentication()
    .AddMvc()
    .AddApiExplorer()
    .AddApiVersioning(
        x => x.ApiVersioningBuilder.AddApiExplorer(
            options =>
            {
                options.GroupNameFormat = "'v'VVV";
                options.SubstituteApiVersionInUrl = true;
            }));

myBuilder.Services.AddSingleton(ActivitySourceFactory.Create());
myBuilder.Services.AddDbContextFactory<MyDbContext>(x => x.UseInMemoryDatabase("in-memory"));
myBuilder.Services.AddHttpClient();

var app = builder.Build();

app.UseCore()
    .UseApiExplorer();

await app.RunAsync();

public partial class Program
{
}
cijothomas commented 1 month ago

This is far from a minimal example! See the guidance shared in an external link: https://stackoverflow.com/help/minimal-reproducible-example. Start with https://github.com/open-telemetry/opentelemetry-dotnet/blob/main/docs/trace/getting-started-console/Program.cs and confirm that AlwaysOffSampler works in that. Then incrementally add changes to it until it breaks!

There is a good chance AzureMonitor is messing up the sampler, but it's hard to confirm without a minimal example. (You can try removing it.)

ep-workai commented 1 month ago

Seems I solved this.

It turns out that invoking openTelemetryBuilder.UseAzureMonitor(...) overrides the whole tracing pipeline and removes/replaces the configured sampler :/
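For anyone else hitting this: since the Azure Monitor distro applies its own sampler, a sampler set earlier on the builder is replaced. A minimal sketch of controlling sampling through the distro's own options instead, assuming the `SamplingRatio` property on the options passed to `UseAzureMonitor` (available in recent `Azure.Monitor.OpenTelemetry.AspNetCore` versions; the connection string below is a placeholder):

```csharp
using Azure.Monitor.OpenTelemetry.AspNetCore;
using Microsoft.AspNetCore.Builder;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);

// UseAzureMonitor configures the whole pipeline, including sampling,
// so set the ratio through its options rather than via SetSampler.
builder.Services.AddOpenTelemetry()
    .UseAzureMonitor(options =>
    {
        options.ConnectionString = "<your-connection-string>";
        options.SamplingRatio = 0.1F; // keep roughly 10% of traces
    });

var app = builder.Build();
app.Run();
```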

Closing the issue. Thanks.