open-telemetry / opentelemetry-specification

Specifications for OpenTelemetry
https://opentelemetry.io
Apache License 2.0

Add stdout/json OTLP protocol (instead of OTLP file exporter) #4056

Closed pellared closed 1 month ago

pellared commented 1 month ago

What are you trying to achieve?

I have seen and heard requests (forgive me for having no actual references, but I imagine many of you have heard the same) for a standardized way of emitting telemetry to stdout. See: https://github.com/open-telemetry/opentelemetry-specification/issues/3565.

The most frequent reason is that some prefer to have logs emitted to stdout rather than sent via OTLP, because:

  1. they do not miss logs when the application crashes
  2. logs are sent continuously as opposed to being batched
  3. it should perform better than sending via HTTP/gRPC
  4. they already have existing log pipelines/processors which provide them some value/features

I feel that this would be especially beneficial for logging, where usage of fluentbit is popular, and because there are good reasons behind https://12factor.net/logs.

I believe that there could be a similar desire for other signals. Therefore, I think we might want to have an OTLP exporter which sends telemetry to an output stream (stdout, stderr, file).

I think the easiest way would be to evolve the http/json protocol so that each record additionally carries information about its signal type. E.g.

{ "type": "logs/v1/ResourceLogs", "content": <JSON Protobuf encoded payload> }

We could name such a protocol stdout/json.
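To make the envelope idea concrete, here is a minimal Python sketch, not a definitive implementation. It assumes one envelope per line and a flush after every record; neither is stated in the proposal. The helper names `encode_line` and `emit`, and the trimmed `resource_logs` payload, are hypothetical and exist only for illustration.

```python
import json
import sys
from typing import Any, Mapping, TextIO


def encode_line(signal_type: str, content: Mapping[str, Any]) -> str:
    """Wrap an OTLP/JSON payload in the proposed envelope, as a single line."""
    envelope = {"type": signal_type, "content": content}
    # json.dumps escapes newlines inside strings, so the result is one line.
    return json.dumps(envelope, separators=(",", ":"))


def emit(signal_type: str, content: Mapping[str, Any],
         stream: TextIO = sys.stdout) -> None:
    """Write one envelope per line (an assumption, not part of the proposal)."""
    stream.write(encode_line(signal_type, content) + "\n")
    # Flushing per record is an assumption: it trades throughput for losing
    # as little as possible if the process crashes.
    stream.flush()


# Hypothetical, heavily trimmed ResourceLogs payload, for illustration only.
resource_logs = {
    "resource": {"attributes": []},
    "scopeLogs": [{"logRecords": [{"body": {"stringValue": "hello"}}]}],
}

if __name__ == "__main__":
    emit("logs/v1/ResourceLogs", resource_logs)
```

Keeping each record on its own line is what would let line-oriented tools (log shippers, `tail`, and so on) process the stream without a streaming JSON parser.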

We could additionally define an OTEL_EXPORTER_OTLP_STREAM env var which would default to stdout (acceptable values: stdout, stderr, none). We may also consider using OTEL_EXPORTER_OTLP_STREAM to send telemetry to a file or create a separate env var for this. Maybe it would be needed any
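As a rough sketch of how an SDK might resolve the proposed variable: OTEL_EXPORTER_OTLP_STREAM is only suggested in this issue and is not part of the specification, and `resolve_stream` is a hypothetical helper. Rejecting unknown values with an error, rather than falling back to the default, is also an assumption.

```python
import os
import sys
from typing import Mapping, Optional, TextIO


def resolve_stream(env: Mapping[str, str] = os.environ) -> Optional[TextIO]:
    """Map the proposed OTEL_EXPORTER_OTLP_STREAM value to an output stream.

    Returns None when the value is "none", meaning nothing should be written.
    """
    value = env.get("OTEL_EXPORTER_OTLP_STREAM", "stdout").strip().lower()
    if value == "stdout":
        return sys.stdout
    if value == "stderr":
        return sys.stderr
    if value == "none":
        return None
    # Assumption: fail loudly on values outside the proposed set.
    raise ValueError(f"unsupported OTEL_EXPORTER_OTLP_STREAM value: {value!r}")
```

Passing the environment as a parameter keeps the function easy to exercise without mutating the process environment, for example `resolve_stream({"OTEL_EXPORTER_OTLP_STREAM": "stderr"})`.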

Additional context.

Solves https://github.com/open-telemetry/opentelemetry-specification/issues/3565

This would require changing:

The collector could also have a capability to receive the telemetry (stdout operator?) sent via OTLP/stdout/json so that it can then export it further via OTLP/HTTP. This way, even if the application crashes, the collector would still be able to pass the data via OTLP to the backend.
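The receiving side could look roughly like the Python sketch below. It is not collector code: `read_envelopes` is a hypothetical helper, and it assumes the one-envelope-per-line framing and the `type`/`content` keys from the example above. Lines that do not parse (ordinary program output, or a final line truncated by a crash) are skipped rather than treated as fatal, which is also an assumption.

```python
import json
from collections import defaultdict
from typing import Any, Dict, Iterable, List


def read_envelopes(lines: Iterable[str]) -> Dict[str, List[Any]]:
    """Group envelope payloads by their "type" field, skipping anything else."""
    by_type: Dict[str, List[Any]] = defaultdict(list)
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            envelope = json.loads(raw)
        except json.JSONDecodeError:
            # Non-telemetry output or a record cut short by a crash.
            continue
        if (
            isinstance(envelope, dict)
            and isinstance(envelope.get("type"), str)
            and "content" in envelope
        ):
            by_type[envelope["type"]].append(envelope["content"])
    return dict(by_type)
```

A receiver could call `read_envelopes(sys.stdin)` and hand each group to the matching OTLP/HTTP export path; that forwarding step is left out here.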

What is more, some users may prefer to use a logging library appender/sink/formatter that emits stdout/json encoded output without going through the OTel SDK, to save some performance.

Side note: We already have console exporters, but they are meant for debugging. This is an OTLP exporter which is meant for production use.

EDIT: I was not aware of OTLP File Exporter. Here are some reasons why it may be better to have it as a protocol instead of a separate exporter.

Alternative

Add env var support for OTLP File Exporter

pellared commented 1 month ago

PTAL @open-telemetry/specs-logs-approvers (especially @djaglowski as I think you are involved in related stuff)

theletterf commented 1 month ago

I find this quite useful, for example in scenarios where export through a network is not always straightforward or possible.

marcalff commented 1 month ago

How is this different from the file exporter?

https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/protocol/file-exporter.md

pellared commented 1 month ago

@marcalff, thanks a lot. I was not aware of it ❤️

The main differences are:

Do you think it is good to define the OTLP file exporter as separate exporter?

How can a user select the OTLP file exporter using OTEL_LOGS_EXPORTER?

marcalff commented 1 month ago

Do you think it is good to define the OTLP file exporter as separate exporter?

Yes, but this is my own opinion.

Currently, the OTLP exporter supports the HTTP and GRPC protocols, with a lot of common configuration options.

Having the same OTLP exporter also support FILE will lead to confusion, because many configuration options do not apply to a file, for example all the SSL-related options.

This is clearer I think:

Besides, from a technical dependency point of view, one may not want to link with the gRPC library just to print JSON to a file; this is particularly sensitive for C++. Having separate exporters allows picking (or not) a given exporter without having to link with too much third-party code.

As for changing or expanding existing environment variables, this goes into the configuration territory, where major work is in progress.

See:

And for pointers to all relevant issues:

cc @jack-berg

pellared commented 1 month ago

@marcalff I think whether a protocol is implemented as a separate exporter is an implementation detail.

The question is whether someone selecting it via env var (or configuration) should set

pellared commented 1 month ago

I think the current structure makes more sense, as OTLP Exporters are capable of sending telemetry directly to the backend whereas the OTLP File Exporter only stores data in intermediate storage (a file).

Therefore, I find having OTEL_LOGS_EXPORTER=otlpfile OK and less confusing, as this exporter is not sending data to an OTLP endpoint.

I am closing this issue and I plan to create a new one for adding env var support for https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/protocol/file-exporter.md.

@marcalff, thanks for your input.

cijothomas commented 1 month ago

  1. they do not miss logs when application crashes
  2. logs are sent continuously as opposed to being batched
  3. should have better performance than sending via HTTP/gRPC

We use operating system tracing techniques to achieve similar goals (and many more!) by exporting telemetry to Windows ETW and Linux user_events. Such exporters are available today in C++, Rust, and .NET (Windows ETW only).

(One thing with stdout is the need for some synchronization mechanism/locking to avoid mixing of the output, which would impact performance/throughput; this is avoided in the above ETW/user_events exporters.)

Glad to see similar needs being discussed!
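The synchronization point about stdout can be illustrated with a small Python sketch. `LockedLineWriter` and `demo` are hypothetical names, and a plain mutex is just one possible mechanism. The lock guarantees that each record lands as a whole line; the contention on that lock is the throughput cost mentioned above.

```python
import io
import threading
from typing import List, TextIO


class LockedLineWriter:
    """Serialize whole-line writes to a shared stream with a mutex."""

    def __init__(self, stream: TextIO) -> None:
        self._stream = stream
        self._lock = threading.Lock()

    def write_line(self, line: str) -> None:
        # Only one thread at a time may write and flush, so records from
        # different threads cannot interleave within a line.
        with self._lock:
            self._stream.write(line + "\n")
            self._stream.flush()


def demo(threads: int = 8, per_thread: int = 200) -> List[str]:
    """Write from several threads into one buffer and return the lines."""
    buffer = io.StringIO()  # stands in for stdout in this sketch
    writer = LockedLineWriter(buffer)

    def worker(idx: int) -> None:
        for n in range(per_thread):
            writer.write_line(f"worker={idx} n={n}")

    pool = [threading.Thread(target=worker, args=(i,)) for i in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return buffer.getvalue().splitlines()
```

Every line returned by `demo()` is a complete record; the order across threads is not deterministic.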