elastic / ecs-logging

ECS Logging - Common resources and issues for the language specific ECS loggers
https://www.elastic.co/guide/en/ecs-logging/overview/master/intro.html
Apache License 2.0

Clarifying the two kinds of ECS-related language libraries #19

Open webmat opened 4 years ago

webmat commented 4 years ago

The two kinds I see are the following (the naming here is up for discussion):

ECS Libraries

These libraries can ultimately be used to create or consume ECS events. Part of their goal is to represent the schema's key names and data types faithfully in the target language, using the language's idioms (OOP, the language type closest to the ECS type, etc.).

Delivering these requires a lot of work to map ECS & Elasticsearch semantics to the target language. Each ECS release will require a release of this type of library, since it's meant to map all of the ECS field definitions.

In this category:

ECS Logging Formatters

I see these as "drop-in" log formatters. By "drop-in" I mean that, as a baseline, developers need not care about ECS and all its fields per se. The formatters should be usable by developers who just want to provide a log level, a message and perhaps a few structured keys. The formatter should then populate the correct ECS fields with no changes to the application other than swapping the log formatter.

I like the Java library's list of fields that can be populated, based on this getting started experience.

These libraries should also eventually allow users that care about ECS and want to fill specific additional ECS fields to do so.

Once we reach a baseline functionality for these libraries, I think they will require fewer adjustments and changes over time. They will mostly map to a stable subset of ECS fields (see the link to the Java lib above), so an ECS release won't require a release of the library. When users want to fill more ECS fields, they can do so with strings and datatypes that fit well enough for the JSON representation to successfully populate an ECS index. They don't need the full type safety and support of the language; they just need the JSON output to match.
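As an illustration of the "drop-in" idea, here is a minimal Python sketch of such a formatter. It is not taken from any of the Elastic libraries: the class name `EcsSketchFormatter` is hypothetical, the set of fields is an assumed small subset (`@timestamp`, `log.level`, `message`, `log.logger`), and extra keys are simply passed through as given.

```python
import json
import logging
from datetime import datetime, timezone

# Attributes every LogRecord carries; anything else was supplied via `extra=`.
_STANDARD_ATTRS = set(vars(logging.LogRecord("", 0, "", 0, "", (), None))) | {
    "message",
    "asctime",
}


class EcsSketchFormatter(logging.Formatter):
    """Hypothetical drop-in formatter: one JSON object per log line."""

    def format(self, record: logging.LogRecord) -> str:
        doc = {
            "@timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "log.level": record.levelname.lower(),
            "message": record.getMessage(),
            "log.logger": record.name,
        }
        # Pass user-supplied structured keys through unchanged; no type
        # checking, the JSON output only has to match the ECS field names.
        for key, value in vars(record).items():
            if key not in _STANDARD_ATTRS:
                doc[key] = value
        return json.dumps(doc, default=str)


# The application only swaps the formatter; its logging calls stay the same.
logger = logging.getLogger("shop")
handler = logging.StreamHandler()
handler.setFormatter(EcsSketchFormatter())
logger.addHandler(handler)
logger.warning("payment failed", extra={"http.request.method": "POST"})
```

The `extra` key in the last line shows the "users who care about ECS" case: a plain string key that happens to be an ECS field name, with no typed ECS object involved.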

In this category:

I think both of these types of libraries can play together if we want: pass an "ECS object" to a log formatter, and it populates all the appropriate ECS fields. But I don't see this as necessary. I still see them as distinct types of libraries.

Next steps

Mpdreamz commented 4 years ago

FWIW, .NET takes the hybrid approach: it has a library for the types, and the types themselves offer a method to serialize them to streams.

This serialization will follow the new spec, and I think that's what you want in most use cases for ECS.

Our logging framework formatter integrations can lean on the ECS types to do the heavy lifting:

https://github.com/elastic/ecs-dotnet/blob/master/src/Elastic.CommonSchema.Serilog/EcsTextFormatter.cs#L26
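To make the hybrid shape concrete without reproducing the .NET code, here is a rough Python sketch of the same idea. Every name in it (`EcsDocument`, `EcsLog`, `serialize`, `TypedFormatter`) is hypothetical and not the Elastic.CommonSchema API; the assumption is only that the type owns its serialization and the formatter delegates to it.

```python
import io
import json
import logging
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Dict, TextIO


@dataclass
class EcsLog:
    level: str
    logger: str


@dataclass
class EcsDocument:
    """Hypothetical typed event covering a tiny slice of ECS."""

    timestamp: datetime
    message: str
    log: EcsLog

    def to_dict(self) -> Dict[str, Any]:
        return {
            "@timestamp": self.timestamp.isoformat(),
            "message": self.message,
            "log": {"level": self.log.level, "logger": self.log.logger},
        }

    def serialize(self, stream: TextIO) -> None:
        # The type owns the wire format, so every caller emits the same JSON.
        json.dump(self.to_dict(), stream)
        stream.write("\n")


class TypedFormatter(logging.Formatter):
    """Formatter that only maps a LogRecord onto the type, then delegates."""

    def format(self, record: logging.LogRecord) -> str:
        doc = EcsDocument(
            timestamp=datetime.fromtimestamp(record.created, tz=timezone.utc),
            message=record.getMessage(),
            log=EcsLog(level=record.levelname.lower(), logger=record.name),
        )
        buf = io.StringIO()
        doc.serialize(buf)
        return buf.getvalue().rstrip("\n")
```

With this split, a formatter for another logging framework would only need its own mapping onto `EcsDocument`; the serialization rules live in one place.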

This works well when you're emitting a strict subset of ECS, but sometimes you want to create a superset of ECS and tack on properties while adhering to ECS as much as possible. We are still discussing how to expose that on the types.

For the logging use case, we currently stash extra properties under _metadata, but in more applied use cases you want to create additional fields in more semantic places.
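A minimal Python sketch of the `_metadata` stashing described here, under stated assumptions: `KNOWN_ECS_FIELDS` is a deliberately tiny, hand-picked subset of ECS field names, and `build_event` is a hypothetical helper, not how the .NET library implements it.

```python
import json
from typing import Any, Dict

# Assumed, deliberately tiny subset of ECS field names for illustration.
KNOWN_ECS_FIELDS = {"@timestamp", "message", "log.level", "log.logger", "trace.id"}


def build_event(properties: Dict[str, Any]) -> Dict[str, Any]:
    """Keep recognized ECS fields at the top level, stash the rest."""
    event: Dict[str, Any] = {}
    extras: Dict[str, Any] = {}
    for key, value in properties.items():
        target = event if key in KNOWN_ECS_FIELDS else extras
        target[key] = value
    if extras:
        # Everything that is not an ECS field ends up in one catch-all object.
        event["_metadata"] = extras
    return event


event = build_event({"message": "cart updated", "log.level": "info", "cart_size": 3})
print(json.dumps(event))
```

The catch-all keeps the document a strict ECS subset plus one extra object, which is also its limitation: `cart_size` cannot land in a more semantic place without a richer mapping.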

An example of a more applied superset approach is the clients team's benchmarking data: https://github.com/elastic/clients-team/issues/1

Similarly, we've built a JSON exporter for .NET's micro-benchmarking tool that we are keen to move over to ECS (as much as possible).

In both cases, though, we'd want them to lean as much as possible on the types and on their (de)serialization routines, which adhere to our formatting best practices. The idea is that if they do, their output can be picked up by Filebeat.

I don't believe types are a prerequisite at all, though. We built that library because we immediately benefit from it in more than one use case. Also, the expectation is that we'll write log formatters for at least 4 logging frameworks; we can now rely on the types' serialization instead of each formatter rolling its own.