open-telemetry / opentelemetry-rust

The Rust OpenTelemetry implementation
https://opentelemetry.io
Apache License 2.0

Error using opentelemetry_otlp::new_exporter().tonic() #861

Open simonzy15 opened 2 years ago

simonzy15 commented 2 years ago

I'm regularly getting an error when sending traces: OpenTelemetry trace error occurred. Exporter otlp encountered the following error(s): the grpc server returns error (Unknown error): , detailed error message: transport error

Using this setup:

    opentelemetry::global::set_text_map_propagator(XrayPropagator::default());
    match tokio::runtime::Builder::new_current_thread()
        .enable_all()
        .build()
    {
        Ok(runtime) => {
            runtime.block_on(async {
                match opentelemetry_otlp::new_pipeline()
                    .tracing()
                    .with_exporter(opentelemetry_otlp::new_exporter().tonic())
                    .with_trace_config(
                        sdktrace::config()
                            .with_sampler(sdktrace::Sampler::AlwaysOn)
                            .with_id_generator(sdktrace::XrayIdGenerator::default()),
                    )
                    .install_simple()
                {
                    Ok(tracer) => {
                        // initialize global subscriber
                    }
                    Err(e) => {
                        // do something
                    }
                }
            });
        }
        Err(e) => {
            // do something
        }
    }
}

When I send a trace within the runtime it seems to be working but not from outside the runtime.

My understanding is that the exporter is set up and the subscriber is globally initialized on the current thread. It seems to be a problem with tonic: after the exporter is built and the subscriber is initialized, the gRPC transport layer closes as soon as the runtime finishes.

Initially, I had assumed that the exporter does not need to be created inside a runtime, since the documentation also seems to initialize it synchronously. Without a runtime, however, building it fails with this error: thread 'opentelemetry-exporter' panicked at 'dispatch dropped without returning error'. These are my cargo dependencies:

tokio = { version = "^1.16", features = ["full"] }
opentelemetry = "0.17.0"

twix14 commented 1 year ago

Seeing the same issue myself using the Tokio runtime. @djc @jtescher @TommyCpp (sorry to tag you all, but the issue seems to have been forgotten for a while)

djc commented 1 year ago

> When I send a trace within the runtime it seems to be working but not from outside the runtime.

Seems to me like this is user error and/or easy to work around. It otherwise doesn't feel actionable to me, as there is no clear issue (and apparently only a small number of users are affected). So please spend some more time digging into this issue to understand the root cause. PRs to improve the errors here would also be great!

Nereuxofficial commented 1 year ago

I have the same error. Is there anything I can do to make opentelemetry_otlp print more detailed logs?

Nereuxofficial commented 1 year ago

Here is the code I am currently using:

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    // this reports panics
    let _guard = sentry::init((
        SENTRY_DSN,
        sentry::ClientOptions {
            release: sentry::release_name!(),
            ..Default::default()
        },
    ));
    // OpenTelemetry tracing
    let mut metadata = MetadataMap::new();
    metadata.insert("x-honeycomb-team", env!("HONEYCOMB_API_KEY").parse()?);
    metadata.insert("x-honeycomb-dataset", "duckblog".parse()?);
    let tracer = opentelemetry_otlp::new_pipeline()
        .tracing()
        .with_exporter(
            opentelemetry_otlp::new_exporter()
                .tonic()
                .with_metadata(metadata)
                .with_timeout(std::time::Duration::from_secs(3))
                .with_endpoint("https://api.honeycomb.io"),
        )
        .install_batch(opentelemetry::runtime::Tokio)?;
    let telemetry = tracing_opentelemetry::layer().with_tracer(tracer);
    // filter printed-out log statements according to the RUST_LOG env var
    let rust_log_var = std::env::var("RUST_LOG").unwrap_or_else(|_| "info".to_string());
    let log_filter = Targets::from_str(&rust_log_var)?;
    // different filter for traces sent to honeycomb
    let trace_filter = Targets::from_str("futile=info")?;
    Registry::default()
        .with(
            tracing_subscriber::fmt::layer().with_ansi(true), //.with_filter(log_filter)
        )
        .with(
            telemetry, //.with_filter(trace_filter)
        )
        .init();

(Note that I use opentelemetry_otlp 0.11, since otherwise some types are not compatible with tracing-opentelemetry.) This also errors with install_simple. I will try HTTP next.

valkum commented 1 year ago

Do you build opentelemetry_otlp using tls or tls-roots?

Nereuxofficial commented 1 year ago

I think I was building with tls, but I am not sure. I will try gRPC with both tls and tls-roots and report back; tls-roots should hopefully work.

Nereuxofficial commented 1 year ago

Alright, with tls-roots it works! Thanks for your help!
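For readers landing here with the same Honeycomb setup, the fix reported above comes down to a Cargo feature flag. A minimal sketch of the dependency line, assuming the opentelemetry_otlp 0.11 version mentioned earlier in the thread (feature names may differ in other releases, so check the crate's documentation for the version you use):

```toml
[dependencies]
# Assumption: version 0.11, as mentioned earlier in this thread.
# "tls-roots" is the feature reported to work above for an https:// endpoint.
opentelemetry-otlp = { version = "0.11", features = ["tls-roots"] }
```

The thread does not say why tls alone failed; a plausible but unconfirmed explanation is that the client had no root certificates to validate the server with.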

mladedav commented 1 year ago

I got here by following the Quickstart in the documentation. When I tried to run it as stated there, the example panicked because it was not inside a tokio runtime. If I change main to a tokio async main, I get the error mentioned here.

There was no attempt to connect to the default endpoint (or the specified one if I tried to override it).

I would love to understand the issue, but it's hard if I cannot get even the official quickstart to run because of this.

pksunkara commented 1 year ago

I am encountering this with a simple server as described in #1143. But weirdly, it only happens after quite a bit of time, and whenever it happens, it uses up all of my RAM (128 GB). Does anyone think this might be related to #1048?

leons727 commented 1 year ago

Running into the same issue with quickstart example using tonic/tokio:

OpenTelemetry trace error occurred. Exporter otlp encountered the following error(s): the grpc server returns error (Unknown error): , detailed error message: Service was not ready: transport error

westonpace commented 11 months ago

I've run into this as well with the Quickstart example. The problem, as best I can tell from debugging, is an error-on-shutdown type problem:

Unfortunately, there is no way that I can figure out to create a tonic exporter without using the pipeline stuff (there is a new_tonic method on SpanExporter, but it requires a TonicConfig that cannot be constructed). I also have no access to opentelemetry_otlp's global tracer provider to clear it.

So for now I just put a big long async sleep at the end of my program and it all works.

djc commented 11 months ago

@westonpace thanks for digging into this!

iamfletch commented 10 months ago

@westonpace out of interest did the global::shutdown_tracer_provider() not work for you?
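The idea behind an explicit shutdown call, as opposed to sleeping at the end of the program, can be sketched with a std-only model. This is not the opentelemetry API: `ModelExporter`, `export`, and `shutdown` are hypothetical names, and the sketch only assumes that the real batch exporter also hands spans to a background worker that must be drained before the process exits.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for a batch span exporter: spans are queued on a
/// channel and "exported" (collected) by a background worker thread.
pub struct ModelExporter {
    tx: Option<Sender<String>>,
    worker: Option<JoinHandle<Vec<String>>>,
}

impl ModelExporter {
    pub fn new() -> Self {
        let (tx, rx) = mpsc::channel::<String>();
        // The worker drains the queue until every sender has been dropped.
        let worker = thread::spawn(move || rx.into_iter().collect::<Vec<String>>());
        ModelExporter { tx: Some(tx), worker: Some(worker) }
    }

    /// Queue a span; returns false if the exporter was already shut down.
    pub fn export(&self, span: &str) -> bool {
        match &self.tx {
            Some(tx) => tx.send(span.to_string()).is_ok(),
            None => false,
        }
    }

    /// Close the queue and wait for the worker, returning what it exported.
    pub fn shutdown(&mut self) -> Vec<String> {
        self.tx.take(); // dropping the sender ends the worker's loop
        match self.worker.take() {
            Some(handle) => handle.join().unwrap_or_default(),
            None => Vec::new(),
        }
    }
}
```

In this model, calling `shutdown()` before the program ends guarantees that every queued span has been handled, whereas a sleep only makes that likely. Whether `global::shutdown_tracer_provider()` gives the same guarantee in the setup discussed here is exactly what the question above asks.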

justinabrahms commented 7 months ago

I've now encountered this on my tokio server. It is happening not as part of shutdown, but during normal operation.

The root cause was that routing was not possible from within the Docker container. I verified this by checking whether the otel-cli test below produced spans. Once that worked, the above error went away.

# Install the otel cli on ubuntu
$ apt update
$ apt install -y curl
$ cd /tmp
$ curl -OL https://go.dev/dl/go1.21.6.linux-amd64.tar.gz
$ tar xzf go1.21.6.linux-amd64.tar.gz
$ go/bin/go install github.com/equinix-labs/otel-cli@latest

# copy it into a container and then get into that container
$ sudo docker cp ~/go/bin/otel-cli  563fc000e95b:/tmp/
$ sudo docker exec -it 563fc000e95b bash

# Run a simple test
$ cd /tmp
$ ./otel-cli exec --name "test" --protocol grpc --endpoint http://localhost:4317 --verbose --tp-print echo 1 
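If installing otel-cli in the container is not convenient, a rougher check is to see whether a TCP connection to the endpoint can be opened at all. The sketch below uses only the Rust standard library; `endpoint_reachable` is a hypothetical helper name, and a successful connection only shows that something is listening on the port, not that it speaks OTLP over gRPC.

```rust
use std::net::{TcpStream, ToSocketAddrs};
use std::time::Duration;

/// Returns true if a TCP connection to `addr` (for example "localhost:4317")
/// can be opened within `timeout`.
pub fn endpoint_reachable(addr: &str, timeout: Duration) -> bool {
    match addr.to_socket_addrs() {
        // Try every resolved address (IPv4 and IPv6) until one accepts.
        Ok(mut addrs) => addrs.any(|a| TcpStream::connect_timeout(&a, timeout).is_ok()),
        // The address could not be parsed or resolved, so nothing is reachable.
        Err(_) => false,
    }
}
```

Calling `endpoint_reachable("localhost:4317", Duration::from_secs(2))` from inside the container and getting `false` would point to the same kind of routing problem described above.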

0x0918 commented 1 month ago

Running into the same issue with quickstart example using tonic/tokio:

OpenTelemetry trace error occurred. Exporter otlp encountered the following error(s): the grpc server returns error (Unknown error): , detailed error message: Service was not ready: transport error

Hi, how did you solve it? I have the same problem.