open-telemetry / opentelemetry-specification

Specifications for OpenTelemetry
https://opentelemetry.io
Apache License 2.0

Event is one of the major verticals as well as Metrics, Tracing and Logging in observability from my perspective #176

Closed qudongfang closed 4 years ago

qudongfang commented 5 years ago

As we stated:

In software, observability typically refers to telemetry produced by services and is divided into three major verticals:

I think we should consider Events as one of the major verticals in OpenTelemetry too. Such as machine reboot, process core dump, deployment, a system kernel error, and even Java Exceptions.

Products like Sentry and CloudTrail.

https://en.wikipedia.org/wiki/Event_monitoring

liubin commented 5 years ago

Agreed that Events are important too. SaaS products such as Datadog also provide event support.

Also, I'd prefer to add events to the span object as a property.

Here is a real-world example. I'm measuring container startup time in K8s. One component is the kubelet, the worker node agent in K8s. Starting a container on a worker node includes pulling the image, creating the container and running the container, but we need not make each of these steps a separate span; we can make it one span that includes some events:

{
    "service": "kubelet",
    "operation": "start_pod",
    "startTimestamp": "2019-07-05T19:17:55.961571091+08:00",
    "duration": "3000",
    "events": [
        {
            "level": "info",
            "summary": "image nginx pulled",
            "message": "Container image \"nginx:1.17.0-alpine\" already present on machine",
            "timestamp": "2019-07-05T19:17:56.961571091+08:00"
        },
        {
            "level": "info",
            "summary": "container created",
            "message": "Created container",
            "timestamp": "2019-07-05T19:17:57.961571091+08:00"
        },
        ... ...
    ],
    ... ...
}
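
The same shape can be sketched in Python. This is only an illustration of "one span carrying timestamped events": the `Span` and `SpanEvent` classes below are hypothetical stand-ins, not any real tracing API, and the timestamps are rounded from the example above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class SpanEvent:
    """A point-in-time annotation attached to a span (hypothetical type)."""
    summary: str
    message: str
    timestamp: datetime
    level: str = "info"


@dataclass
class Span:
    """One timed operation that carries its own events (hypothetical type)."""
    service: str
    operation: str
    start: datetime
    events: list[SpanEvent] = field(default_factory=list)

    def add_event(self, summary: str, message: str, timestamp: datetime) -> None:
        self.events.append(SpanEvent(summary, message, timestamp))


tz = timezone(timedelta(hours=8))
start = datetime(2019, 7, 5, 19, 17, 55, tzinfo=tz)

# One span for the whole pod startup instead of one span per step.
span = Span(service="kubelet", operation="start_pod", start=start)
span.add_event("image nginx pulled", "Container image already present on machine",
               start + timedelta(seconds=1))
span.add_event("container created", "Created container",
               start + timedelta(seconds=2))

# Offset of each event from the span start, in seconds.
offsets = [(e.timestamp - span.start).total_seconds() for e in span.events]
```

Each step stays visible as an event with its own timestamp, while the span still gives a single duration for the whole startup.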

lizthegrey commented 5 years ago

To me, an Event is just a special case of a Trace, with 1 span.

qudongfang commented 5 years ago

To me, an Event is just a special case of a Trace, with 1 span.

Yeah. It is one way to look at it.

In my opinion, there are a few differences between a Span (trace) and an Event.

  1. The end/finish time of an event is changeable (determined later).
  2. Events are related across the whole IaaS; it is hard to relate (instrument) them using a Span ID.
  3. The query scenarios are more diverse than for Spans. Statistical analysis is one case, such as how many deployments happened last week or how many network outages occurred last year.
  4. Events may be merged later on. For example, if lots of alert events occurred because the Cache service crashed, we might want to merge all these related alert events into one incident. It would be hard to do this using the Span model.

mtwo commented 5 years ago

Alternatively, could we treat events as special logs with a known structure?

qudongfang commented 5 years ago

Alternatively, could we treat events as special logs with a known structure?

Yes, we could.

At the same time, we can treat Logs as Spans (Traces) too. Why haven't we done that?

tigrannajaryan commented 5 years ago

Bumping this up since it was referenced from #67.

What are Events if not just Logs?

Alternatively, could we treat events as special logs with a known structure?

Structured logs have been a thing for a long while. There is an RFC that defines how structured logs should be represented in text files [1] and AFAIK many modern logging backends support this format, in addition to other ways to ingest structured logs (e.g. as JSON).

On Windows the system and application logs are even called just that: Event Logs [2] and are structured.

So, the question is how are Events different from Logs? In what way?

[1] https://tools.ietf.org/html/rfc5424#page-15
[2] https://docs.microsoft.com/en-us/windows/win32/wes/windows-event-log
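
To make the RFC 5424 point concrete, here is a minimal Python sketch that renders one structured-data element in the `[SD-ID name="value" ...]` form the RFC describes. The function names and the `deploy@32473` SD-ID are made up for illustration; the escaping of `"`, `\` and `]` inside parameter values follows the RFC.

```python
def escape_param_value(value: str) -> str:
    """Backslash-escape the three characters RFC 5424 reserves in a PARAM-VALUE."""
    # Escape the backslash first so later escapes are not escaped twice.
    for ch in ("\\", '"', "]"):
        value = value.replace(ch, "\\" + ch)
    return value


def format_sd_element(sd_id: str, params: dict[str, str]) -> str:
    """Render one SD-ELEMENT: [SD-ID name="value" name="value" ...]."""
    parts = [sd_id]
    parts += [f'{name}="{escape_param_value(value)}"' for name, value in params.items()]
    return "[" + " ".join(parts) + "]"


# Hypothetical SD-ID and parameters describing a deployment event.
element = format_sd_element(
    "deploy@32473",
    {"version": "1.2.3", "note": 'tag "stable" [prod]'},
)
```

The result is a single text fragment that a syslog-style line can carry alongside its free-form message, which is one way an "event" ends up as a structured log.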

pauldraper commented 4 years ago

Yes, Events are Logs.

lizthegrey commented 4 years ago

I feel that Logs are a special class of events, where the payload is just {time: "yyyy-mm-dd hh:mm:ss +Z", data: "entire log line goes here"}

pauldraper commented 4 years ago

I feel that Logs are a special class of events, where the payload is just {time: "yyyy-mm-dd hh:mm:ss +Z", data: "entire log line goes here"}

Call it "structured logging" then, where structured attributes can be associated.

Whether you want to call it events or logs, there isn't a reason to think of them separately.

Such as machine reboot, process core dump, deployment, a system kernel error, and even Java Exceptions.

All in logs.
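
As a rough illustration of "structured logging, where structured attributes can be associated", the sketch below uses Python's standard `logging` module with a small JSON formatter. The `JsonFormatter` class, the `attributes` key and the logger name are assumptions made for this example, not part of any OpenTelemetry API.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object, with an 'attributes' dict if present."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "time": self.formatTime(record),
            "level": record.levelname,
            "body": record.getMessage(),
            # 'attributes' is set through the `extra` argument of the log call.
            "attributes": getattr(record, "attributes", {}),
        })


stream = io.StringIO()  # in-memory sink so the output can be inspected
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("events-demo")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

# A "machine reboot" event expressed as a log record with structured attributes.
logger.info("machine reboot",
            extra={"attributes": {"host": "node-1", "reason": "kernel upgrade"}})

entry = json.loads(stream.getvalue())
```

The record keeps a human-readable body while the attributes stay queryable, which is the property the "events" side of this thread is asking for.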

avik-so commented 4 years ago

The way I see it, metrics track events (and other things), while traces and logs describe events. It's events all the way down. I think it would be beneficial to make that relationship clear in the documentation.

Oberon00 commented 4 years ago

I think this issue is not actionable as-is. Can it be closed?

tigrannajaryan commented 4 years ago

I think this issue is not actionable as-is. Can it be closed?

I vote for closing for reasons I outlined above.

carlosalberto commented 4 years ago

Closing this issue as it is non-actionable and because there are existing reasons to do so.

In any case, feel free to re-open (or open a new issue) if you think this still needs to be addressed in some form.

bigman73 commented 2 years ago

It's 2022. New Relic and Datadog (arguably two of the most dominant APM players) have well-defined and separate APIs and documentation for events. I think OTEL should catch up with the industry and add events as a first-class citizen. An event is not a tracing span because events can be emitted regardless of a user tracing session. For example, an upgrade event: "The system was upgraded to v1.2.3" with a bunch of supporting information, all in a well-defined schema.

To those that say events are logs, logs are unstructured and events are structured. APMs have different APIs for events. Treating events as logs is not helpful at all when integrating with Datadog, for example. Events are also not tracing spans with one span. That is one implementation method. The logical definition of events is different from traces.

https://docs.newrelic.com/docs/data-apis/understand-data/new-relic-data-types/#event-data
https://docs.datadoghq.com/events/

scheler commented 2 years ago

@bigman73 OpenTelemetry now recognizes Events as a first-class citizen. There's now an API to create Events; it uses LogRecord as its underlying data model and defines semantic conventions for structuring Events using LogRecords.
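
A simplified way to picture "an Event is a LogRecord that follows a convention" is sketched below in Python. `SimpleLogRecord`, `make_event` and `is_event` are hypothetical helpers, and marking events with an `event.name` attribute is an assumption made for this illustration; the actual API and semantic conventions are defined in the OpenTelemetry specification.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SimpleLogRecord:
    """Simplified stand-in for a log record; not the real OpenTelemetry type."""
    timestamp_unix_nano: int
    body: str
    attributes: dict[str, Any] = field(default_factory=dict)


def make_event(name: str, timestamp_unix_nano: int, body: str,
               **attributes: Any) -> SimpleLogRecord:
    # Assumption: an event is a log record tagged with an "event.name" attribute.
    return SimpleLogRecord(timestamp_unix_nano, body,
                           {"event.name": name, **attributes})


def is_event(record: SimpleLogRecord) -> bool:
    """Treat any record carrying the assumed marker attribute as an event."""
    return "event.name" in record.attributes


upgrade = make_event("system.upgrade", 1_660_000_000_000_000_000,
                     "The system was upgraded to v1.2.3", version="1.2.3")
plain = SimpleLogRecord(1_660_000_000_000_000_001, "entire log line goes here")
```

Both values travel through the same pipeline as log records; only the convention on the attributes distinguishes the event from the plain log line.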

bigman73 commented 2 years ago

@scheler Thanks. Is there an exporter of events (as events, not logs) into Datadog or New Relic APMs?

jack-berg commented 2 years ago

Speaking for New Relic, we accept OpenTelemetry data via OTLP. From OTLP's perspective, both logs and events use the log data model, and that's fine with us. We're currently working on a strategy that would treat OpenTelemetry events in a similar way as New Relic events are treated today. Can't provide specifics quite yet.

pauldraper commented 2 years ago

To those that say events are logs, logs are unstructured and events are structured.

OpenTelemetry logs (or the thing it calls logs) are in fact structured.