elastic / ecs-logging

ECS Logging - Common resources and issues for the language specific ECS loggers
https://www.elastic.co/guide/en/ecs-logging/overview/master/intro.html
Apache License 2.0
41 stars 15 forks source link

ECS logging libraries setting the `ecs.version` #44

Open simitt opened 3 years ago

simitt commented 3 years ago

When making the ECS logger and formatter libraries GA, most of them will be versioned independently of the stack or ECS, therefore library versions will be starting with v1.0.0.

The loggers set the ecs.version according to the ecs-logging spec. The current behavior for most loggers is to set the ecs.version to the minimal supported ECS version supported by this logger. This has the advantage that the logger libraries do not need to be updated and released on every ECS release. It also seems reasonable to use the minimal supported ECS version when querying log data.

@elastic/ecs does setting the ecs.version to the minimal supported version sound reasonable from an ECS perspective?

ebeahan commented 3 years ago

Ideally, ecs.version for each logger would be defined independently of each other, based on whichever schema version is adopted by that particular logger. Whenever work takes place for a logger to adopt a newer version of ECS, ecs.version would then be updated to reflect it.

Is that consistent with how the "minimal supported ECS version" is viewed here for each logger?

simitt commented 3 years ago

Ideally, ecs.version for each logger would be defined independently of each other

Agreed, they are independent. But since users might want to use multiple logger libraries for multiple languages it would be ideal if they behave in the same way.

Whenever work takes place for a logger to adopt a newer version of ECS, ecs.version would then be updated to reflect it.

Maybe thing important to know is that (most of) the loggers do not have a strict type support for all the fields defined in ECS. Eg the go-zap logger ensures that all the required fields from the logger spec are supported by auto-deriving them or providing dedicated methods for setting these fields, which ultimately ensures they have the right types associated with the ECS fields. All other fields can be set by the user without type verification.

Coming back to the ECS version - if I understand you correctly, then what you suggest is not aligned with how we are defining minimal supported version. Let's say ECS version 1.y introduces a new field a.b. A logger that supports the new field is still compatible with an older version 1.y-1, and therefore would still lists the minimal version with which it is compatible in the ecs.version. If I understand you correctly, you are suggesting that as soon as a.b is supported, the logger would put the version 1.y as the ecs.version. This would mean that from all the fields supported by the logger the one that was introduced last by ECS defines the ecs.version (at least if none of the other fields are in conflict with this version). I am not certain what advantage this has over setting the minimal compatible version, since the loggers are not supporting all defined ECS fields.

ebeahan commented 3 years ago

I am not certain what advantage this has over setting the minimal compatible version, since the loggers are not supporting all defined ECS fields.

The convention has been to have ecs.version reflect which version the data conforms to. When writing rules, searches, visualizations, etc., knowing what version of ECS the data conforms with help to know which fields are available and which ones are not. There's also a benefit of having ecs.version document the schema version a logger/integration/module was written against.

I don't think it's necessary to revisit each logger with every ECS release only to bump the versions. When a logger adopts new fields or features, I do think ecs.version should be updated appropriately.

felixbarny commented 3 years ago

One caveat is that users can add custom fields that may be ECS fields that have only been introduced at a later ECS version. But I guess that wouldn't be the end of the world.

Another thing I'm actually not really sure about is whether we could just not set the ecs.version in the logger and rely on Filebeat to set the version. This will also be the version that the index template corresponds to. Even if an old Filebeat version is used that doesn't have all of the logging fields in the index pattern it shouldn't be a big problem due to the default mapping of string fields to keyword.

felixbarny commented 3 years ago

@philkra Do you remember why we even set the ecs.version field in ECS loggers? It came in in the initial PR for the spec (https://github.com/elastic/ecs-logging/pull/18) in this commit: https://github.com/elastic/ecs-logging/pull/18/commits/bf687f0cc3123b51f6c6b08a15cbf16ce4dbb3d9.

philkra commented 3 years ago

I recall fully, but it was related to supportability of ECS versions. In other words, if something is odd, the ecs.version fields could be a good starting point to debug.

simitt commented 3 years ago

I am very inclined to keep the version as is (1.6.0) for the GA release of the ecs-zap logger, as it reflects the latest version with which the logger was matched with and no changes have been applied since then. It would not reflect the minimal supported version, nor would it mean that all ECS 1.6.0 fields are supported - but it is compatible with it, and would be bumped whenever a field from a newer version is added (aligned with what @ebeahan suggests in https://github.com/elastic/ecs-logging/issues/44#issuecomment-768545482 IIUC). In future versions I agree with @felixbarny we might make it optional or configurable.

trentm commented 3 years ago

Another thing I'm actually not really sure about is whether we could just not set the ecs.version in the logger and rely on Filebeat to set the version.

I was thinking about a CLI one could use for pretty-printing a stream of ecs-logging log records -- as mentioned at https://github.com/elastic/ecs-logging/issues/42#issuecomment-759537471 -- and that CLI is perhap a use case for wanting "ecs.version" in the log records emitted by applications. I would anticipate the CLI using the presence of "ecs.version" (along with the other 3 required fields) to pre-check that a log line is an ECS log record before trying to render it. The CLI could do without the "ecs.version", but it would make processing clearer, and perhaps safer if there are future incompatible changes in ECS fields that the CLI attempts to treat specially.

This is all a little bit a stretch, however. I don't think the CLI has a strong case for needing "ecs.version".

felixbarny commented 3 years ago

using the presence of "ecs.version" (along with the other 3 required fields) to pre-check that a log line is an ECS log record

Ah, right. That's not only useful for log renderers but could also be used by Filebeat or an ingest node to automatically parse ECS logs.