jaegertracing / jaeger-client-csharp

🛑 This library is DEPRECATED!
https://jaegertracing.io/
Apache License 2.0
302 stars 67 forks source link

RemoteReporter with HttpSender always ignores spans on flush #162

Closed bastijnv closed 4 years ago

bastijnv commented 4 years ago

Requirement - what kind of business use case are you trying to solve?

Send traces directly to the Jaeger Collector over HTTP using the RemoteReporter with the HttpSender.

Problem - what in Jaeger blocks you from solving the requirement?

As resolution of issue #154 . the "ot-ignore" property is added by default as custom property to the HttpSender. In the default setup a singleton tracer with the RemoteReporter and HttpSender is configured. The reporter is used to report custom spans as well as the internal spans. They all end up in a single list and a single batch which is flushed every interval.

The issue describes the same setup as resolved by the fix but I fail to understand how this works. From code and debugging the single HttpSender is batching and sending both the spans you do want to trace, and the ones you do not want to trace. Right now all spans are ignored due to the custom ignore pattern being added as resolution of issue #154 . This prevents me from using the HttpSender through "JAEGER_ENDPOINT" as all spans created will be ignored and never actually sent to the collector.

Code sample

dotnet new webapi

Update Startup.cs:

// This method gets called by the runtime. Use this method to add services to the container.
        public void ConfigureServices(IServiceCollection services)
        {
            ....

            // Adds the Jaeger Tracer.
            services.AddSingleton<ITracer>(serviceProvider =>
            {
                string serviceName = serviceProvider.GetRequiredService<IHostingEnvironment>().ApplicationName;

                // This will log to a default localhost installation of Jaeger.
                var tracer = GetOrCreateOpenTracer(serviceName, LoggerFactory);

                // Allows code that can't use DI to also access the tracer.
                GlobalTracer.Register(tracer);

                return tracer;
            });

            ....
        }

        /// <summary>
        /// Get the registered OpenTracing Tracer, or create a new one and register this.
        /// </summary>
        /// <param name="serviceName">The service name under which to trace items</param>
        /// <param name="loggerFactory">Optinally, a logger factory to use</param>
        /// <returns></returns>
        public static OpenTracing.ITracer GetOrCreateOpenTracer(
            string serviceName,
            ILoggerFactory loggerFactory = null
        ) {
            if (GlobalTracer.IsRegistered()) {
                return GlobalTracer.Instance;
            } else {
                .....
               string endpoint = Environment.GetEnvironmentVariable("JAEGER_ENDPOINT")
               var configurationBased = Jaeger.Configuration.FromEnv(loggerFactory);
               // below is also the default when JAEGER_ENDPOINT env var is set but just to make explicit
                var remoteReporter = new RemoteReporter.Builder()
                   .WithLoggerFactory(loggerFactory) // optional, defaults to no logging
                   .WithMaxQueueSize(100)            // optional, defaults to 100
                   .WithFlushInterval(TimeSpan.FromSeconds(1)) // optional, defaults to TimeSpan.FromSeconds(1)
                   .WithSender(new HttpSender(endpoint))   // optional, defaults to UdpSender("localhost", 6831, 0)
                   .Build();
                var loggingReporter = new LoggingReporter(loggerFactory);
                var compositeReporter = new CompositeReporter(remoteReporter, loggingReporter);
                var sampler = new ConstSampler(true);

                ITracer tracer = 
                  configurationBased.GetTracerBuilder().WithSampler(sampler).WithReporter(compositeReporter).Build();

                return tracer;
            }
        }

Update ValueController.cs

private readonly ITracer _tracer;

        public ValuesController(ITracer tracer)
        {
            _tracer = tracer;
        }

        // GET api/values
        [HttpGet]
        public async Task<ActionResult<IEnumerable<string>>> Get()
        {
            var builder = _tracer.BuildSpan("test");
            using (var scope = builder.StartActive(true))
            {
                var span = scope.Span;
                await Task.Delay(50);
                return new string[] { span.Context.TraceId, span.Context.SpanId };
            }

            return new string[] { "failed", "to trace" };

        }

Proposal - what do you suggest to solve the problem or improve the existing situation?

If my understanding is correct the internal spans cannot be sent by the same reporter or the reporter should keep two lists and add the ignore patterns to only those requests where the internal spans are reported.

Might also be I misunderstand or misconfigured the Jaeger client.

Any open questions to address

N/a.

Falco20019 commented 4 years ago

@bocharovf Can you have a look and tell what you do differently? I don’t test it with netcore since it’s two different projects. If it introduces a bug there for regular traces without possible resolution, I need to revert #156 and have to advise you to use the workaround described by them as this is a bug from netcore and not this lib.

Falco20019 commented 4 years ago

@bastijnv As far as I can tell, this should only ignore the calls for tracing that go to an netcore application (like internal ones) but not your request. The call to UDP endpoint should still come through. So I’m not sure that the problem arises from the location and change you mention.

What value have you used for JAEGER_ENDPOINT? Are you using it in the correct form including the http scheme? I will give it a look later tonight once I have access to a PC.

I‘m also missing the services.AddOpenTracing(); call. You only add the Singleton but do not register netcore to your pipeline. Or are you on purpose only want to use Jaeger directly in combination with WebAPI?

bastijnv commented 4 years ago

@Falco20019 the thing is I'm not sending to the UDP endpoint, I'm sending directly to the HTTP endpoint since UDP routing is not available in my environment.

In localhost setup the JAEGER_ENDPOINT is http://localhost:14268/api/traces (exactly as in the documentation in the Readme).

I verified it is that piece of code of the issue by creating a CustomHttpSender derived from ThriftSender with the exact same code as HttpSender but sending an empty custom ignore collection. That setup works as expected (but includes the ones the issue wants to exclude).

//edit AddOpenTracing call is there, or it should be. I think I just didn't copy here. Let me verify once I reach my laptop later tonight. Could be that I accidentally removed it when stripping down to a bare minimal solution to investigate why it wasn't sending. (I also resolved some issues on my end before I reached this point).

//edit2 Double checked. AddOpenTracing() call was present in the code I took away from the snippet. Sorry for the confusion.

// Use "OpenTracing.Contrib.NetCore" to automatically generate spans for ASP.NET Core, Entity Framework Core, ...
            // See https://github.com/opentracing-contrib/csharp-netcore for details.
            services.AddOpenTracing();

// Edit3 Confirming once more I notice it now functions as intended when using the regular HttpSender. However, this is on my MacBook. Different net core versions. Let me run some tests why it went wrong on my windows machine. I'll update here once I verify that windows machine is still failing. Shouldn't matter though as everything is compiled and run in docker containers.

//Edit4 In my containers everything works as expected. However, in my local windows environment I switched to .NET CORE 3.1 already which is where the error appears. Might have something todo with the switch/support for OpenTelemetry since that version?

So:

dbug: OpenTracing.Contrib.NetCore.CoreFx.HttpHandlerDiagnostics[0]
      Ignoring Request http://localhost:14268/api/traces?format=jaeger.thrift

Unfortunately do not have the time to debug the different paths in the Jaeger C# client right now.

Falco20019 commented 4 years ago

I just tried out your exact code with a fresh build as netcoreapp3.1 with 3.1.101 using IISExpress and Kestrel and it's not occurring on my windows machine.

info: Jaeger.Reporters.LoggingReporter[0]
      Span reported: 36874dc3c76b074e:36874dc3c76b074e:0:1 - test

It's correct that it should ignore calls to http://localhost:14268/api/traces?format=jaeger.thrift. This is exactly to avoid loops. But you should also see a log entry as the one I just copied.

bastijnv commented 4 years ago

Sorry, the logging reporter was for me to verify at least spans were ended. For me it is not correct the call to http://localhost:14268/api/traces?format=jaeger.thrift is ignored. In the list of spans that goes to that endpoint is my custom "test" trace of the code above. That is now ignored and no longer send to my collector which stores it in elasticsearch for the jaeger-query to retrieve.

The code is now setup that no custom calls can be made anymore through the HttpSender over HTTP protocol as it will always ignore all spans if I'm reading the code correctly.

//additional information Some solutions we deploy in an environment where we cannot use the agent as it only supports GRPC/UDP/TChannel which is not supported in some of our environments. As a workaround I intended to send our spans directly over HTTP to the collector.

Falco20019 commented 4 years ago

You misread the output. It does not ignore calls to that endpoint. It just ignores the self reference.

You also produce a minor problem by mixing Jaeger.Configuration with regular Tracer.Builder usage. You should not mix those two and therefore use ReportingConfiguration and SamplerConfiguration instead of Sampler directly. Otherwise, you end up with an RemoteControllerSampler being created but not disposed. But this is only in your explicit test here. As you intend to use the regular FromEnv call, this is just informational and has also no influence but a nasty additional log warning.

This is the complete, updated Startup.cs that I used (with and without using OpenTracing.NetCore) and both work correctly for me. I also tried your exact code (with and without OpenTracing.NetCore) also with success.

using System;
using Jaeger;
using Jaeger.Reporters;
using Jaeger.Samplers;
using Jaeger.Senders;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Hosting;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;
using OpenTracing;
using OpenTracing.Util;

namespace test
{
    public class Startup
    {
        public Startup(IConfiguration configuration)
        {
            Configuration = configuration;
        }

        public IConfiguration Configuration { get; }

        // This method gets called by the runtime. Use this method to add services to the container.
        public void ConfigureServices(IServiceCollection services)
        {
            services.AddControllers();

            services.AddLogging(configure => configure.AddConsole());

            services.AddSingleton(serviceProvider =>
            {
                var serviceName = serviceProvider.GetRequiredService<IWebHostEnvironment>().ApplicationName;
                var loggerFactory = serviceProvider.GetRequiredService<ILoggerFactory>();

                // This will log to a default localhost installation of Jaeger.
                var tracer = GetOrCreateOpenTracer(serviceName, loggerFactory);

                // Allows code that can't use DI to also access the tracer.
                GlobalTracer.Register(tracer);

                return tracer;
            });

            //services.AddOpenTracing();
        }

        /// <summary>
        /// Get the registered OpenTracing Tracer, or create a new one and register this.
        /// </summary>
        /// <param name="serviceName">The service name under which to trace items</param>
        /// <param name="loggerFactory">Optinally, a logger factory to use</param>
        /// <returns></returns>
        public static OpenTracing.ITracer GetOrCreateOpenTracer(
            string serviceName,
            ILoggerFactory loggerFactory
        ) {
            if (GlobalTracer.IsRegistered()) {
                return GlobalTracer.Instance;
            }

            var endpoint = "http://localhost:14268/api/traces";
            var remoteReporter = new RemoteReporter.Builder()
                .WithLoggerFactory(loggerFactory) // optional, defaults to no logging
                .WithMaxQueueSize(100)            // optional, defaults to 100
                .WithFlushInterval(TimeSpan.FromSeconds(1)) // optional, defaults to TimeSpan.FromSeconds(1)
                .WithSender(new HttpSender(endpoint))   // optional, defaults to UdpSender("localhost", 6831, 0)
                .Build();
            var loggingReporter = new LoggingReporter(loggerFactory);
            var compositeReporter = new CompositeReporter(remoteReporter, loggingReporter);
            var sampler = new ConstSampler(true);

            ITracer tracer = new Tracer.Builder(serviceName)
                .WithSampler(sampler)
                .WithReporter(compositeReporter)
                .Build();

            return tracer;
        }

        // This method gets called by the runtime. Use this method to configure the HTTP request pipeline.
        public void Configure(IApplicationBuilder app, IWebHostEnvironment env)
        {
            if (env.IsDevelopment())
            {
                app.UseDeveloperExceptionPage();
            }

            app.UseHttpsRedirection();

            app.UseRouting();

            app.UseAuthorization();

            app.UseEndpoints(endpoints =>
            {
                endpoints.MapControllers();
            });
        }
    }
}
bastijnv commented 4 years ago

I checked with your code but the issue remains. I do not fully understand your following remark:

You misread the output. It does not ignore calls to that endpoint. It just ignores the self reference.

Those self reference spans are in the same batch as my custom span. As a batch they seem to be written into the binary format and clubbed into a single request. That entire request is ignore by the HttpSender because of the CustomIgnorePattern added by default/hardcoded in the constructor.

My Jaeger UI stays empty when I run the setup with my code or your code. Whenever the ignore message appears basically. The strange thing is the exact same code works under netcore 2.2.

//updated my mac to netcore 3.1.101. Running there indeed seems to work as you describe. What OS are you testing? Right now, not sure why yet, OSX 10.14.6 (Mojave) with 3.1.101 works. It prints ignore but my UI correctly shows the traces. Windows10, same code, 3.1.101 does not show. Will run some more tests, including the code you shared below.

Falco20019 commented 4 years ago

I am using Windows 10 v1803 (Build 17134.1184), ran with 3.1.101.

Can you try the following to make sure it's not a problem with your environment:

public void ConfigureServices(IServiceCollection services)
{
    services.AddControllers();

    services.AddLogging(configure => configure.AddConsole());

    services.AddSingleton(serviceProvider =>
    {
        var serviceName = serviceProvider.GetRequiredService<IWebHostEnvironment>().ApplicationName;
        var loggerFactory = serviceProvider.GetRequiredService<ILoggerFactory>();

        Environment.SetEnvironmentVariable(Jaeger.Configuration.JaegerServiceName, serviceName);
        Environment.SetEnvironmentVariable(Jaeger.Configuration.JaegerSamplerType, "const");
        Environment.SetEnvironmentVariable(Jaeger.Configuration.JaegerReporterLogSpans, "1");
        Environment.SetEnvironmentVariable(Jaeger.Configuration.JaegerSamplerParam, "1");
        Environment.SetEnvironmentVariable(Jaeger.Configuration.JaegerEndpoint, "http://localhost:14268/api/traces");

        // This will log to a default localhost installation of Jaeger.
        var tracer = Jaeger.Configuration.FromEnv(loggerFactory).GetTracer();

        // Allows code that can't use DI to also access the tracer.
        GlobalTracer.Register(tracer);

        return tracer;
    });

    services.AddOpenTracing();
}

This will also output the used configuration. I get the following output:

info: Jaeger.Configuration[0]
      Initialized Tracer(ServiceName=test, Version=CSharp-0.3.6.0, Reporter=CompositeReporter(Reporters=RemoteReporter(Sender=HttpSender), LoggingReporter(Logger=Microsoft.Extensions.Logging.Logger`1[Jaeger.Reporters.LoggingReporter])), Sampler=ConstSampler(True), IPv4=182721426, Tags=[jaeger.version, CSharp-0.3.6.0], [hostname, MOS1T02859], [ip, 10.228.27.146], ZipkinSharedRpcSpan=False, ExpandExceptionLogs=False, UseTraceId128Bit=False)
info: Microsoft.Hosting.Lifetime[0]
      Now listening on: https://localhost:5001
info: Microsoft.Hosting.Lifetime[0]
      Now listening on: http://localhost:5000
info: Microsoft.Hosting.Lifetime[0]
      Application started. Press Ctrl+C to shut down.
info: Microsoft.Hosting.Lifetime[0]
      Hosting environment: Development
info: Microsoft.Hosting.Lifetime[0]
      Content root path: C:\Users\dejhbk0l\Documents\GitHub\test
info: Jaeger.Reporters.LoggingReporter[0]
      Span reported: dfcd44dd13f261a2:35dcebfffc5835ee:233c7b25db3eb21a:1 - Result ObjectResult
info: Jaeger.Reporters.LoggingReporter[0]
      Span reported: dfcd44dd13f261a2:233c7b25db3eb21a:dfcd44dd13f261a2:1 - Action test.Controllers.WeatherForecastController/Get
info: Jaeger.Reporters.LoggingReporter[0]
      Span reported: dfcd44dd13f261a2:dfcd44dd13f261a2:0:1 - HTTP GET
info: Jaeger.Reporters.LoggingReporter[0]
      Span reported: 6c08d8dce9090e9:1263286cc0946471:23e5ca3847335be2:1 - test
info: Jaeger.Reporters.LoggingReporter[0]
      Span reported: 6c08d8dce9090e9:b639248c8c5fc148:23e5ca3847335be2:1 - Result ObjectResult
info: Jaeger.Reporters.LoggingReporter[0]
      Span reported: 6c08d8dce9090e9:23e5ca3847335be2:6c08d8dce9090e9:1 - Action test.Controllers.ValuesController/Get
info: Jaeger.Reporters.LoggingReporter[0]
      Span reported: 6c08d8dce9090e9:6c08d8dce9090e9:0:1 - HTTP GET

AFAIK, ot-ignore is processed by OpenTracing.NetCore and not the jaeger-collector. So it is always added, independently of the batch to the call made to jaeger-collector. But jaeger-collector only processes the content and does not care about that flag.

bastijnv commented 4 years ago

Ok, found the issue. Seems it was between the keyboard and chair. Apparently I was running two elasticsearch instances locally the entire day and I have been tying everything to the wrong storage endpoint (in the setup where I used HTTP sender, the customer sender I hardcoded to the correct endpoint).

Thanks for all the support and sorry for wasting time. As a plus side I now know the code well enough the understand what happens..

The issue can be rejected/closed.

Falco20019 commented 4 years ago

This explains it :) Glad you were able to fix it. BTW, you do not need the GetOrCreateOpenTracer method with all the checks on GlobalTracer. It's totally enough to do the following, as ASP.NET Core ensures this only get's called once.

services.AddSingleton(serviceProvider =>
{
    var loggerFactory = serviceProvider.GetRequiredService<ILoggerFactory>();

    // This will log to the endpoint configured in the environment.
    var tracer = Jaeger.Configuration.FromEnv(loggerFactory).GetTracer();

    // Allows code that can't use DI to also access the tracer.
    GlobalTracer.Register(tracer);

    return tracer;
});
bastijnv commented 4 years ago

I know, but our actual codebase is completely abstracting away the tracer implementation and the code is more complex there. We accept different tracers for different environments, some 15 years old some new and based on opentracing (for cloud deployments). The method comes from a dependency injection setup that allows us to inject the right type of tracer for the deployment environment. (it is not actually calling that method directly, it calls an abstraction layer to inject a tracer which ultimately adds an OpenTracing tracer, in this case mapped to Jaeger).

I started from that and stripped down :).

Thanks for the help!

Afshin-Ofx commented 4 years ago

@bastijnv you saved my day. Was looking at why in a simple .net core webapi and all-in-one Jaeger I was not able to see the traces in Jaeger UI, which turns out my issue was with UdpSender. Changed that to HttpSender based on your code and It woks :)

allanShady commented 3 years ago

@Falco20019 and @bastijnv I just want to thank you and let everyone know that I fixed a similar issue by adding the composite Reporter.

Before

       public void ConfigureServices(IServiceCollection services)
       {
            services.AddControllers();
            services.AddOpenTracing();
            services.AddSingleton<ITracer>(serviceProvider => 
            {
                var loggerFactory = serviceProvider.GetRequiredService<ILoggerFactory>();
                var serviceName = serviceProvider.GetRequiredService<IWebHostEnvironment>().ApplicationName;

                var reporter = new RemoteReporter.Builder()
                .WithLoggerFactory(loggerFactory)
                .WithSender(new UdpSender(string.Empty, 6831, 0))
                .Build();

                var tracer = new Tracer.Builder(serviceName)
                .WithLoggerFactory(loggerFactory)
                .WithSampler(new ConstSampler(true))
                .WithReporter(reporter)
                .Build();

                if(!GlobalTracer.IsRegistered()) GlobalTracer.Register(tracer);

                return tracer;

            });
            services.AddHealthChecks();
       }

After

        public void ConfigureServices(IServiceCollection services)
        {
            services.AddControllers();
            services.AddOpenTracing();
            services.AddSingleton<ITracer>(serviceProvider => 
            {
                var loggerFactory = serviceProvider.GetRequiredService<ILoggerFactory>();
                var serviceName = serviceProvider.GetRequiredService<IWebHostEnvironment>().ApplicationName;

                var remoteReporter = new RemoteReporter.Builder()
                .WithLoggerFactory(loggerFactory)
                .WithSender(new UdpSender(string.Empty, 6831, 0))
                .Build();

                var loggingReporter = new LoggingReporter(loggerFactory);
                var compositeReporter = new CompositeReporter(remoteReporter, loggingReporter);

                var tracer = new Tracer.Builder(serviceName)
                .WithSampler(new ConstSampler(true))
                .WithReporter(compositeReporter)
                .Build();

                if(!GlobalTracer.IsRegistered()) GlobalTracer.Register(tracer);

                return tracer;

            });
            services.AddHealthChecks();
        }