jaegertracing / jaeger

CNCF Jaeger, a Distributed Tracing Platform
https://www.jaegertracing.io/
Apache License 2.0
20.6k stars 2.45k forks source link

[Bug]: Jaeger Query Should Accept Configurable Jaeger Collector Address when enabled query.enable-tracing #6122

Closed masihkhatibzadeh99 closed 2 weeks ago

masihkhatibzadeh99 commented 1 month ago

What happened?

Currently, when tracing is enabled (enable-tracing), jaeger-query defaults to localhost:4317 as the OTLP endpoint, which assumes that the Jaeger Collector is available on the same local address. This setup causes issues where Jaeger-query and the Jaeger Collector typically run in separate pods or hosts.

When jaeger-query cannot reach the Jaeger Collector at localhost:4317, tracing data fails to be exported, and the following connection errors are logged: {"level":"info","ts":1729846746.785025,"caller":"grpc@v1.66.0/clientconn.go:1319","msg":"[core] Creating new client transport to \"{Addr: \\\"127.0.0.1:4317\\\", ServerName: \\\"localhost:4317\\\", }\": connection error: desc = \"transport: Error while dialing: dial tcp 127.0.0.1:4317: connect: connection refused\""}

Steps to reproduce

  1. Deploy Jaeger-query in a k8s environment with Jaeger collector in separate pods.
  2. Enable tracing on Jaeger-query. (--query.enable-tracing = true)

Expected behavior

jaeger-query should accept a configurable Jaeger Collector address when enable-tracing is enabled. Users should be able to specify this address through something like this: --query.collector-address

Jaeger backend version

v1.60.0

masihkhatibzadeh99 commented 1 month ago

If this is an issue, I’m willing to create a PR to fix it.

yurishkuro commented 1 month ago

OTEL SDK supports a number of env variables that allow you to override export endpoint. Did you try that?

masihkhatibzadeh99 commented 1 month ago

Yes @yurishkuro, I tried to use this env OTEL_EXPORTER_OTLP_ENDPOINT as described here. But didn't work.

yurishkuro commented 1 month ago

the env var should be respected, but the code adds otlptracegrpc.WithInsecure() option.

In contrast, in the hotrod example we only add the flag is this returns false:

// withSecure instructs the client to use HTTPS scheme, instead of hotrod's desired default HTTP
func withSecure() bool {
    return strings.HasPrefix(os.Getenv("OTEL_EXPORTER_OTLP_ENDPOINT"), "https://") ||
        strings.ToLower(os.Getenv("OTEL_EXPORTER_OTLP_INSECURE")) == "false"
}
yurishkuro commented 4 weeks ago

I will keep this open, I think it's worth fixing.

masihkhatibzadeh99 commented 4 weeks ago

Can I contribute to fixing this issue? @yurishkuro

yurishkuro commented 4 weeks ago

of course. I'd recommend posting here how you'd want to fix that (rather than spending time writing code in the direction on which we might later disagree).

masihkhatibzadeh99 commented 4 weeks ago

I think we should accept a parameter like--query.collector-address to get the address of the collector, and then consider this address as an OTLP_ENDPOINT instead of using os.getenv.

yurishkuro commented 4 weeks ago

no, I don't agree with that. The OTEL SDK is configurable via env vars today. We need to use these standard mechanisms, there is no need for some custom parameters.

masihkhatibzadeh99 commented 3 weeks ago

@yurishkuro Can we add some explanations to this section (jaeger-query) about using the OTEL_ENDPOINT variable? or do you have any suggestions for fixing it?

akstron commented 3 weeks ago

the env var should be respected, but the code adds otlptracegrpc.WithInsecure() option. In contrast, in the hotrod example we only add the flag is this returns false:

Hi @yurishkuro , does this option (otlptracegrpc.WithInsecure()) forces the exporter to not use OTEL_EXPORTER_OTLP_ENDPOINT? I tried checking the otel exporter code but couldn't find anything related to it?

yurishkuro commented 3 weeks ago

it doesn't prevent the use of env var, but it prevents (I assume) SDK from switching to TLS. In the HotROD example we check env var explicitly and don't call Insecure if the scheme is https

akstron commented 3 weeks ago

I see. For this issue, I think we can have a logic similar to HotROD. If we find the OTEL_EXPORTER_OTLP_ENDPOINT set, we don't explicitly use Insecure. Does this sound right?

yurishkuro commented 3 weeks ago

yes

akstron commented 3 weeks ago

Got it. Will work on it. Thanks