Open mtojek opened 4 years ago
one or more errors found in documents stored in metrics-apache.status-ep data stream: [0] field "cloud.availability_zone" is not defined
[1] field "cloud.instance.id" is not defined
[2] field "cloud.instance.name" is not defined
[3] field "cloud.machine.type" is not defined
[4] field "cloud.project.id" is not defined
[5] field "cloud.provider" is not defined
[6] field "event.duration" is not defined
[7] field "event.module" is not defined
[8] field "host.architecture" is not defined
[9] field "host.containerized" is not defined
[10] field "host.hostname" is not defined
[11] field "host.ip" is not defined
[12] field "host.mac" is not defined
[13] field "host.name" is not defined
[14] field "host.os.codename" is not defined
[15] field "host.os.family" is not defined
[16] field "host.os.kernel" is not defined
[17] field "host.os.name" is not defined
[18] field "host.os.platform" is not defined
[19] field "host.os.version" is not defined
[20] field "service.address" is not defined
one or more errors found in documents stored in logs-apache.access-ep data stream: [0] field "cloud.instance.id" is not defined
[1] field "cloud.instance.name" is not defined
[2] field "cloud.provider" is not defined
[3] field "event.dataset" is not defined
[4] field "event.outcome" is not defined
[5] field "host.architecture" is not defined
[6] field "host.containerized" is not defined
[7] field "host.hostname" is not defined
[8] field "host.ip" is not defined
[9] field "host.mac" is not defined
[10] field "host.name" is not defined
[11] field "host.os.codename" is not defined
[12] field "host.os.family" is not defined
[13] field "host.os.kernel" is not defined
[14] field "host.os.name" is not defined
[15] field "host.os.platform" is not defined
[16] field "host.os.version" is not defined
[17] field "input.type" is not defined
one or more errors found in documents stored in logs-apache.error-ep data stream: [0] field "cloud.availability_zone" is not defined
[1] field "cloud.instance.id" is not defined
[2] field "cloud.instance.name" is not defined
[3] field "cloud.machine.type" is not defined
[4] field "cloud.project.id" is not defined
[5] field "cloud.provider" is not defined
[6] field "event.category" is not defined
[7] field "event.kind" is not defined
[8] field "event.timezone" is not defined
[9] field "event.type" is not defined
[10] field "host.architecture" is not defined
[11] field "host.containerized" is not defined
[12] field "host.hostname" is not defined
[13] field "host.ip" is not defined
[14] field "host.mac" is not defined
[15] field "host.name" is not defined
[16] field "host.os.codename" is not defined
[17] field "host.os.family" is not defined
[18] field "host.os.kernel" is not defined
[19] field "host.os.name" is not defined
[20] field "host.os.platform" is not defined
[21] field "host.os.version" is not defined
[22] field "input.type" is not defined
Looking at this issue, it seems to be a good candidate to implement while we will introduce support for processors' fields in https://github.com/elastic/package-spec/issues/63 .
This issue is relatively old now, so I assume it isn't really important. I will close it for now.
I think we need to revisit the check for event.*
. There have been many reported issues due to a lack of mappings for various event
fields. I did not realize this wasn't being validated.
I think we need to revisit the check for event.*. There have been many reported issues due to a lack of mappings for various event fields. I did not realize this wasn't being validated.
I remember that we disabled them because it means that we have to add more exactly the same fields to every data stream. We tried to figure out an option to introduce common fields and the topic was deprioritized. I agree that's a high time to refresh the discussion :)
cc @jsoriano @jlind23
Reopening as the skip is still in the code, and there seems to be related issues that could be detected by the skipped validation.
I wanted to mention that there are two fields under event
that we probably do want to skip validating because Fleet automatically adds a mapping for them. Those are "event.agent_id_status" and "event.ingested".
Perhaps something like this:
diff --git a/internal/fields/validate.go b/internal/fields/validate.go
index afed354..f149e57 100644
--- a/internal/fields/validate.go
+++ b/internal/fields/validate.go
@@ -263,16 +263,29 @@ func skipValidationForField(key string) bool {
return isFieldFamilyMatching("agent", key) ||
isFieldFamilyMatching("elastic_agent", key) ||
isFieldFamilyMatching("cloud", key) || // too many common fields
- isFieldFamilyMatching("event", key) || // too many common fields
isFieldFamilyMatching("host", key) || // too many common fields
isFieldFamilyMatching("metricset", key) || // field is deprecated
- isFieldFamilyMatching("event.module", key) // field is deprecated
+ isFleetManagedMapping(key)
}
func isFieldFamilyMatching(family, key string) bool {
return key == family || strings.HasPrefix(key, family+".")
}
+// isFleetManagedMapping return true if the field is contained in a Fleet
+// managed component template that is added to all data streams.
+//
+// The template is controlled in elastic/kibana at:
+// https://github.com/elastic/kibana/blob/v8.1.0/x-pack/plugins/fleet/server/constants/fleet_es_assets.ts#L24-L39
+func isFleetManagedMapping(key string) bool {
+ switch key {
+ case "event.agent_id_status",
+ "event.ingested":
+ return true
+ }
+ return false
+}
+
func isFieldTypeFlattened(key string, fieldDefinitions []FieldDefinition) bool {
definition := FindElementDefinition(key, fieldDefinitions)
return definition != nil && definition.Type == "flattened"
Gave a try to this in the context of elastic/package-spec#399, but I think that we need some kind of centralized schema for data providers (https://github.com/elastic/package-spec/issues/199). So Fleet can install the mappings for these providers, and elastic-package can also query this data for validations of these fields.
Follow-up on: https://github.com/elastic/elastic-package/pull/143#discussion_r511698446 https://github.com/elastic/elastic-package/pull/143#issuecomment-712994535
Currently we skip validation for generic fields present in every (most?) document collected by Elastic Agents but undefined in
fields.yml
files (not part of integrations). Sample fields:agent.*
,elastic_agent.*
.The goal is to introduce validation for the special fields as well.