open-telemetry / opentelemetry-cpp

The OpenTelemetry C++ Client
https://opentelemetry.io/
Apache License 2.0
817 stars 392 forks source link

[SDK] BatchLogRecordProcessor::ForceFlush hangs for 10 seconds #2583

Closed ggaudeau closed 1 month ago

ggaudeau commented 3 months ago

Platform : Ubuntu 23.10 Build: opentelemetry 1.11.0 with WITH_ASYNC_EXPORT_PREVIEW enabled Deps: curl 7.85.0

Here are the lines of code used to initialize the logger provider:

opentelemetry::exporter::otlp::OtlpHttpLogRecordExporterOptions opts;

auto exporter =
    opentelemetry::exporter::otlp::OtlpHttpLogRecordExporterFactory::Create(
        opts);

opentelemetry::sdk::logs::BatchLogRecordProcessorOptions batchOptions;

auto processor =
    opentelemetry::sdk::logs::BatchLogRecordProcessorFactory::Create(
        std::move(exporter),
        batchOptions);

auto resource = opentelemetry::sdk::resource::Resource::Create(attributes);

std::shared_ptr<opentelemetry::logs::LoggerProvider> provider =
    opentelemetry::sdk::logs::LoggerProviderFactory::Create(
        std::move(processor),
        resource);

opentelemetry::logs::Provider::SetLoggerProvider(provider);

Before quitting my application I do a ForceFlush with a timeout equal to 1 second.

auto loggerProvider =
            reinterpret_cast<opentelemetry::sdk::logs::LoggerProvider*>(
                provider);

loggerProvider->ForceFlush(std::chrono::seconds(1));

std::shared_ptr<opentelemetry::logs::LoggerProvider> none;
opentelemetry::logs::Provider::SetLoggerProvider(none);

When the network connection is not good, it can happen that my application hangs for 10 seconds, which corresponds to the timeout of the curl request which I don't want to change.

I'd like to know why the timeout is not taken into account.

owent commented 3 months ago

ForceFlush will wait for 1 second, but opentelemetry::logs::Provider::SetLoggerProvider(none) will make internal exporters shutdown can wait for all pending requests to finish. Could you please try to call loggerProvider->Shutdown(std::chrono::seconds(1)); instead of loggerProvider->ForceFlush(std::chrono::seconds(1));. This function will try to cancel all pending requests when got progress callback called by libcurl.

ggaudeau commented 3 months ago

Being in asynchronous mode, the Shutdown will cancel the sending of the last logs, which I'd like to avoid.

image

To be more accurate, when I set a one-second timeout and there are network problems, the app stays five seconds at line 119 and almost five seconds again in block 129-138

Basically, I switched to asynchronous mode because the processor was stuck in the Export method when I asked it to ForceFlush or Shutdown

ggaudeau commented 3 months ago

I am using Charles application to simulate throttling.

esigo commented 3 months ago

SIG meeting 11.03.2024: need a decision on which timeout should be used for shutdown, and if we need to flush the last buffer. would be great to check the implementation in the other repos and have a consistent solution as other repos.

github-actions[bot] commented 1 month ago

This issue was marked as stale due to lack of activity.

owent commented 1 month ago

2584 may also solve this problem.