open-telemetry / opentelemetry-specification

Specifications for OpenTelemetry
https://opentelemetry.io
Apache License 2.0
Clarification of naming and structure used in Logs Data Model #4175

Open arno-jueschke opened 3 months ago

arno-jueschke commented 3 months ago

In the Logs Data Model specification (https://opentelemetry.io/docs/specs/otel/logs/data-model/), a log record is defined as containing the following fields:

Timestamp, ObservedTimestamp, TraceId, SpanId, TraceFlags, SeverityText, SeverityNumber, Body, Resource, InstrumentationScope, Attributes

The protobuf definition (https://github.com/open-telemetry/opentelemetry-proto/blob/v1.3.2/opentelemetry/proto/logs/v1/logs.proto) for LogRecord uses these fields:

time_unix_nano, observed_time_unix_nano, severity_number, severity_text, body, attributes, flags, trace_id, span_id

The example log record in JSON (https://github.com/open-telemetry/opentelemetry-proto/blob/v1.3.2/examples/logs.json) uses:

timeUnixNano, observedTimeUnixNano, severityNumber, severityText, traceId, spanId, body, attributes
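The relationship between the last two lists can be sketched in a few lines of Python: the JSON names are the lowerCamelCase form of the snake_case proto field names. The helper `snake_to_lower_camel` is a hypothetical name introduced here for illustration, and the conversion is a simplification that is only checked against the field names quoted in this issue.

```python
def snake_to_lower_camel(name: str) -> str:
    """Convert a snake_case proto field name to lowerCamelCase.

    Simplified sketch: assumes lowercase ASCII words separated by single
    underscores, which holds for the LogRecord fields listed above.
    """
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


# Field names as quoted from logs.proto in this issue.
PROTO_FIELDS = [
    "time_unix_nano", "observed_time_unix_nano", "severity_number",
    "severity_text", "body", "attributes", "flags", "trace_id", "span_id",
]

# Corresponding JSON field names.
JSON_FIELDS = [snake_to_lower_camel(field) for field in PROTO_FIELDS]

for proto_name, json_name in zip(PROTO_FIELDS, JSON_FIELDS):
    print(f"{proto_name} -> {json_name}")
```

The data model names (Timestamp, TraceId, and so on) are not derived mechanically this way; they are the conceptual names, and the mapping from them to proto fields is by definition rather than by case conversion.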

Suppose someone wants to store log records as JSON documents in a log file, as compliant as possible with the Logs Data Model, and let's say the log records are emitted from several components.

Note:
1) For semantic conventions (e.g., https://github.com/open-telemetry/semantic-conventions/blob/main/docs/resource/README.md) the definition is more precise (strict).
2) The Logs Data Model explains for InstrumentationScope and Resource that "Multiple occurrences of events coming from the same scope can happen across time and they all have the same value".

What did you expect to see? Guidance on usage of consistent naming

svrnm commented 3 months ago

Hey @arno-jueschke,

thank you for raising this issue. The difference in notation you see comes from different requirements:

So, to answer your question, it depends on the guidelines and best practices of the solutions you are using. Making an assumption here based on the "JSON documents in a log file", I would suggest you follow the JSON mapping (camelCase) as suggested in the OTLP spec.

Same for your second question: it depends on what you use and on your use cases. Storing Instrumentation Scope and Resource with each record individually has different advantages and disadvantages compared to grouping them, or to storing them in a separate place and creating a relationship. You need to do the analysis yourself, depending on what you'd like to accomplish. Whether storage is more important to you, or quick access, or converting back and forth between formats, will lead to different answers.
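To make the trade-off concrete, here is a minimal Python sketch contrasting the grouped OTLP JSON layout (`resourceLogs` / `scopeLogs` / `logRecords`) with a flattened layout that repeats Resource and Instrumentation Scope on every record. The `flatten` helper, the flat `"resource"` and `"scope"` keys, and the sample values (`checkout`, `example.logger`) are assumptions made for illustration; only the grouped form follows OTLP.

```python
from typing import Any, Iterator


def flatten(export: dict[str, Any]) -> Iterator[dict[str, Any]]:
    """Yield one self-contained document per log record.

    The "resource" and "scope" keys on the flat documents are a hypothetical
    layout chosen for this sketch; OTLP itself defines the grouped form.
    """
    for resource_logs in export.get("resourceLogs", []):
        resource = resource_logs.get("resource", {})
        for scope_logs in resource_logs.get("scopeLogs", []):
            scope = scope_logs.get("scope", {})
            for record in scope_logs.get("logRecords", []):
                # Resource and scope are repeated on every record.
                yield {"resource": resource, "scope": scope, **record}


# Grouped form: resource and scope are stored once for both records.
grouped = {
    "resourceLogs": [{
        "resource": {"attributes": [
            {"key": "service.name", "value": {"stringValue": "checkout"}},
        ]},
        "scopeLogs": [{
            "scope": {"name": "example.logger"},
            "logRecords": [
                {"timeUnixNano": "1700000000000000000", "severityNumber": 9,
                 "severityText": "INFO", "body": {"stringValue": "started"}},
                {"timeUnixNano": "1700000001000000000", "severityNumber": 17,
                 "severityText": "ERROR", "body": {"stringValue": "failed"}},
            ],
        }],
    }],
}

flat = list(flatten(grouped))
```

The grouped form is more compact and converts directly to and from OTLP, while the flat form lets each line be read or indexed on its own at the cost of repeating the shared data.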

arno-jueschke commented 3 months ago

Hello @svrnm,

thank you for the answer. To summarize, the logs data model specifies the content from a conceptual point of view. The concrete field names depend on the technology used and its conventions.

Is the same true for the semantic conventions?

svrnm commented 3 months ago

> thank you for the answer. To summarize, the logs data model specifies the content from a conceptual point of view. The concrete field names depend on the technology used and its conventions.

That's my understanding, yes. But I am also only inferring that from reading the specification.

> Is the same true for the semantic conventions?

I don't know, that's a question worth asking in the sem conv repo.

mtwo commented 1 month ago

Hey @arno-jueschke! Given that you're writing log files, I'd just write them in the OTLP JSON format, as that's consistent with OTLP and is already the format that the Collector OTLP file exporter uses when writing data to disk.
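A rough Python sketch of that suggestion follows. It assumes one JSON object per line, each shaped like an OTLP logs export request. The helper names (`make_record`, `to_otlp_line`) and the sample service and scope names are hypothetical, and the exact on-disk layout of the file exporter should be checked against its documentation. In OTLP JSON, 64-bit integers such as `timeUnixNano` are written as decimal strings.

```python
import io
import json
import time


def make_record(body: str, severity_number: int, severity_text: str) -> dict:
    """Build one log record using the camelCase OTLP JSON field names."""
    now = str(time.time_ns())  # 64-bit integers are encoded as strings
    return {
        "timeUnixNano": now,
        "observedTimeUnixNano": now,
        "severityNumber": severity_number,
        "severityText": severity_text,
        "body": {"stringValue": body},
        "attributes": [],
    }


def to_otlp_line(records: list, service_name: str, scope_name: str) -> str:
    """Wrap records in the resourceLogs/scopeLogs envelope as one JSON line."""
    payload = {
        "resourceLogs": [{
            "resource": {"attributes": [
                {"key": "service.name", "value": {"stringValue": service_name}},
            ]},
            "scopeLogs": [{
                "scope": {"name": scope_name},
                "logRecords": records,
            }],
        }],
    }
    return json.dumps(payload, separators=(",", ":"))


# An in-memory buffer stands in for the log file in this sketch.
log_file = io.StringIO()
line = to_otlp_line(
    [make_record("disk full", 17, "ERROR")],  # 17 is SeverityNumber ERROR
    service_name="checkout",
    scope_name="example.logger",
)
log_file.write(line + "\n")
```

Replacing the `io.StringIO` buffer with a file opened in append mode gives a log file where every line can be parsed on its own with `json.loads`.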

If this answers your question, can you close this issue?