elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

Inconsistent value of event.dataset in ES deprecation logs #83251

Open pgomulka opened 2 years ago

pgomulka commented 2 years ago

Elasticsearch Version

8.0

Installed Plugins

No response

Java Version

bundled

OS Version

macos

Problem Description

The event.dataset values in ES logs are elasticsearch.server, elasticsearch.index_search_slowlog, and elasticsearch.index_indexing_slowlog, but for deprecation logs the value is deprecation.elasticsearch.

This probably originates from 7.x, where the deprecation log had deprecation.elasticsearch in its type field, while the other logs had simply server, index_search_slowlog, and index_indexing_slowlog.

Steps to Reproduce

Generate deprecation logs. A sample:

{
  "@timestamp": "2022-01-27T11:48:45.809Z",
  "log.level": "CRITICAL",
  "data_stream.dataset": "deprecation.elasticsearch",
  "data_stream.namespace": "default",
  "data_stream.type": "logs",
  "elasticsearch.elastic_product_origin": "",
  "elasticsearch.event.category": "compatible_api",
  "elasticsearch.http.request.x_opaque_id": "v7app",
  "event.code": "create_index_with_types",
  "message": "[types removal] Using include_type_name in create index requests is deprecated. The parameter will be removed in the next major version.",
  "ecs.version": "1.2.0",
  "service.name": "ES_ECS",
  "event.dataset": "deprecation.elasticsearch",
  "process.thread.name": "elasticsearch[runTask-0][transport_worker][T#8]",
  "log.logger": "org.elasticsearch.deprecation.rest.action.admin.indices.RestCreateIndexAction",
  "trace.id": "0af7651916cd43dd8448eb211c80319c",
  "elasticsearch.cluster.uuid": "5alW33KLT16Lp1SevDqDSQ",
  "elasticsearch.node.id": "tVLnAGLgQum5ca6z50aqbw",
  "elasticsearch.node.name": "runTask-0",
  "elasticsearch.cluster.name": "runTask"
}
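The inconsistency described above can be checked mechanically. The following is a minimal sketch, assuming the intended convention is that every event.dataset value starts with elasticsearch. (as the server and slowlog values do). The helper name dataset_issues and the constant EXPECTED_PREFIX are hypothetical and not part of Elasticsearch or Beats.

```python
from typing import List

# Assumption: datasets should start with the product name, as
# elasticsearch.server and the slowlog datasets already do.
EXPECTED_PREFIX = "elasticsearch."


def dataset_issues(doc: dict) -> List[str]:
    """Return a description for each dataset field that lacks the prefix."""
    issues = []
    for field in ("event.dataset", "data_stream.dataset"):
        value = doc.get(field)
        # Fields that are absent are skipped, not reported.
        if value is not None and not value.startswith(EXPECTED_PREFIX):
            issues.append(f"{field}={value!r} does not start with {EXPECTED_PREFIX!r}")
    return issues


# Only the two relevant fields from the sample deprecation log entry.
deprecation_doc = {
    "event.dataset": "deprecation.elasticsearch",
    "data_stream.dataset": "deprecation.elasticsearch",
}
# A server log entry, which carries no data_stream.dataset field.
server_doc = {"event.dataset": "elasticsearch.server"}

for issue in dataset_issues(deprecation_doc):
    print(issue)
```

Run against the sample, both dataset fields are reported, while the server log entry passes.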

There is already a Beats processor that overrides this: https://github.com/elastic/beats/pull/30018/files#r794578451

Original discussion on the PR that introduced the change: https://github.com/elastic/elasticsearch/pull/68737

We should also discuss the data_stream.dataset field, which is only set in deprecation logs.

elasticmachine commented 2 years ago

Pinging @elastic/es-core-infra (Team:Core/Infra)

pgomulka commented 2 years ago

I spoke with @qhoxie and we agreed that the >bug label has to be discussed before we continue with this change as it might be breaking. I was hoping to change both event.dataset and data_stream.dataset to elasticsearch.deprecation.

We index deprecation logs to a data stream by default, and the data stream name is .logs-deprecation.elasticsearch-default, which follows the pattern:

{type}-{dataset}-{namespace},
type = .logs
dataset = deprecation.{product} (e.g. `elasticsearch`)
namespace = default
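As an illustration of the naming pattern, here is a minimal sketch that composes a data stream name from its three parts. The function names are hypothetical, and the "proposed" value assumes that only the dataset changes to elasticsearch.deprecation while the type and namespace stay as they are.

```python
def data_stream_name(type_: str, dataset: str, namespace: str) -> str:
    """Compose a name following {type}-{dataset}-{namespace}."""
    return f"{type_}-{dataset}-{namespace}"


def deprecation_dataset(product: str) -> str:
    """Current convention for deprecation logs: deprecation.{product}."""
    return f"deprecation.{product}"


# Name produced by the current convention.
current = data_stream_name(".logs", deprecation_dataset("elasticsearch"), "default")

# Hypothetical name if the dataset became elasticsearch.deprecation.
proposed = data_stream_name(".logs", "elasticsearch.deprecation", "default")

print(current)
print(proposed)
```

Because the dataset is embedded in the data stream name, changing data_stream.dataset in the documents also implies a different target data stream.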

@ruflin you did mention here that

The data_stream.dataset value in the document must always match the {dataset} part of the data stream,

  1. Would changing data_stream.dataset break the convention you mentioned?
  2. Would changing event.dataset be a problem? We already have an override in Filebeat for this field: https://github.com/elastic/beats/pull/30018/files#r794578451

cc @pugnascotia

ruflin commented 2 years ago

  1. I assume you would also change the target data stream, which becomes logs-elasticsearch.deprecation-default? So you MUST change the data_stream.dataset field.
  2. You can look at this in two ways. For the Beats data collection, I don't think it is a problem, as the outcome is identical. But let's assume you use another tool for data collection: now you would suddenly have a different value, which is not great. At the same time, I'm not sure this is a common use case we need to cater for.

pgomulka commented 2 years ago

I also confirmed with the Kibana team that they do not use the dataset = deprecation.{product} pattern. In fact, they use event.dataset: kibana.log, which is similar to what this issue is trying to do. There is no separate deprecation log in Kibana. https://github.com/elastic/beats/blob/main/filebeat/module/kibana/log/test/log.830.log-expected.json#L4