elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana
Other
19.64k stars 8.22k forks source link

host.ip and source.ip have conflicting data types in the metrics-* index pattern #73797

Open hendry-lim opened 4 years ago

hendry-lim commented 4 years ago

Kibana version: 7.9.0-SNAPSHOT

Elasticsearch version: 7.9.0-SNAPSHOT

Original install method (e.g. download page, yum, from source, etc.): Docker

Describe the bug: There are 2 conflicting fields, namely host.ip and source.ip in the default metrics-* index pattern.

Steps to reproduce:

  1. Enroll and run elastic-agent-7.9.0-SNAPSHOT (with system integration enabled)
  2. In Kibana, go to Stack Management --> Index Patterns
  3. Open the metrics-* index pattern
  4. Refresh the field list

Expected behavior: host.ip and source.ip should be mapped to ip data type.

Screenshots (if relevant): image

image

elasticmachine commented 4 years ago

Pinging @elastic/logs-metrics-ui (Team:logs-metrics-ui)

hendry-lim commented 3 years ago

source.ip is still a conflict in 7.12.0. image

image

sgrodzicki commented 3 years ago

Pinging @elastic/fleet

skh commented 3 years ago

This might be an issue with the integrations, cc @elastic/integrations

ruflin commented 3 years ago

The right type should be ip. @fearful-symmetry it seems linux package does not use it?

@jen-huang @mattkime One other issue with index patterns. In an ideal world, the above conflicts would not happen but in reality we will see this more and more often. The problem above comes from the fact that we have metrics-* and all packages install their fields there. In querying Elasticsearch this does not matter as it is much more granular. And if queried globally, AFAIK Elasticsearch supports the above case.

fearful-symmetry commented 3 years ago

A while ago I tried to purge all the errant keyword ip mappings from things, looks like I missed one. Should be an easy fix, at least.

ruflin commented 3 years ago

@fearful-symmetry Can you follow up on this with a fix? @ycombinator @mtojek It would be nice if we could detect that a package uses an ECS mapping but has not defined it. Or going further, uses fields that are not defined and then if the field is defined and conflicts with ECS, the check fails.

mattkime commented 3 years ago

@ruflin Is there a something that you'd like to see changed in index patterns?

fearful-symmetry commented 3 years ago

Yep. Wanted to fix this last week, ended up getting sidetracked.

fearful-symmetry commented 3 years ago

Fixing this now, but the more that I look at this, the more I wonder what source.ip is doing in linux/users, since that's just...users. I assume it's a metadata processor or something somewhere?

fearful-symmetry commented 3 years ago

@hendry-lim ^

fearful-symmetry commented 3 years ago

Ah, never mind, it reports the IP if there's a remote user.

ruflin commented 3 years ago

@mattkime I think the problem lies here not necessarily directly with index pattern as it is correct that there is a conflict. It is the more fundamental issue that we require a metrics-* index pattern in the first place. The short term problem is that it shows a big error even for user ingested data but actually from a query perspective, all is fine. So maybe done done the error a bit or make it a config option on the index pattern if it should be shown or not?

ycombinator commented 3 years ago

@ycombinator @mtojek It would be nice if we could detect that a package uses an ECS mapping but has not defined it. Or going further, uses fields that are not defined and then if the field is defined and conflicts with ECS, the check fails.

This can be something handled by elastic-package lint. But for this the tool will first need to know which fields are ECS fields, i.e. a single source of truth for all ECS mappings / field definitions.

In fact, once we have such a single source of truth, we can take this a step further. Packages can then simply reference an ECS field name without needing to specify any more information about the field like its data type, description, etc. At build time (elastic-package build) the relevant mappings would get pulled from the single source of truth into the package's field definition files (e.g. automatically construct a fields/ecs.yml for that package's data streams). This would prevent conflicts such as the one in this issue from occurring in the first place.

WDYT @mtojek @ruflin? We can flesh this out further in a dedicated issue in the elastic-package repo if you agree with the general idea.

ebeahan commented 3 years ago

In fact, once we have such a single source of truth, we can take this a step further. Packages can then simply reference an ECS field name without needing to specify any more information about the field like its data type, description, etc. At build time (elastic-package build) the relevant mappings would get pulled from the single source of truth into the package's field definition files (e.g. automatically construct a fields/ecs.yml for that package's data streams). This would prevent conflicts such as the one in this issue from occurring in the first place.

@ycombinator This sounds fantastic! I think it would help alleviate some of the ECS field duplication burden mentioned in https://github.com/elastic/package-spec/issues/63

ycombinator commented 3 years ago

Ah, thanks for linking to https://github.com/elastic/package-spec/issues/63, @ebeahan. Sounds like that issue is proposing pretty much exactly what I proposed above as well!

ruflin commented 3 years ago

Lets continue the discussion in https://github.com/elastic/package-spec/issues/63 around the validation. Thanks @ebeahan for referencing in.