elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana
Other
19.6k stars 8.22k forks source link

[Fleet] dynamic_template mappings for wildcard field names are not installed #129344

Closed joshdover closed 2 years ago

joshdover commented 2 years ago

Integration packages have support for configuring wildcard field names in data streams. These fields are supposed to have an associated dynamic_template added to their mappings, however today, Fleet does not generate mappings for these. AFAICT, we have never had support for this in Fleet EPM code.

From @jsoriano:

When a field has a wildcard in the name, it must be installed in the template as a dynamic mapping. object_type would be the type that the dynamic mapping should produce (mapping.type). object_type_mapping_type is the type that the field in the document has to have to match the dynamic mapping (match_mapping_type), a wildcard there would mean any type.

Here's an example of how this works in Beats. A field defined like this:

    - name: prometheus.*.histogram
      type: object
      object_type: histogram
      object_type_mapping_type: "*"

Results in a dynamic template added to the mappings like this:

{
  ...
  "template": {
    "mappings": {
      ...
      "dynamic_templates": [
        ...
        {
          "prometheus.*.histogram": {
            "mapping": {
              "type": "histogram"
            },
            "match_mapping_type": "*",
            "path_match": "prometheus.*.histogram"
          }
        },
  ...
}

We should be able to follow the Beats implementation to make this match on Fleet: https://github.com/elastic/beats/blob/ea207346d651448b8917b0791b2b117b9f9b9212/libbeat/template/processor.go#L444

We have several integrations using this feature today, but it's likely broken for customers due to this problem. In my rudimentary search on the integrations repo, I found that this is affecting these integrations:

elasticmachine commented 2 years ago

Pinging @elastic/fleet (Feature:EPM)

elasticmachine commented 2 years ago

Pinging @elastic/fleet (Team:Fleet)

joshdover commented 2 years ago

We should consider backporting a fix for this to 7.17.x depending on impact

jsoriano commented 2 years ago

Actually, looking to beats mappings, the field doesn't need to have a wildcard. A field with object_type and without wildcards is assumed to have the wildcard at the end:

  - name: labels
    type: object
    object_type: keyword

Is installed as:

        {
          "labels": {
            "mapping": {
              "type": "keyword"
            },
            "match_mapping_type": "string",
            "path_match": "labels.*"
          }
        },

+1 to backport this to 7.17.

jsoriano commented 2 years ago

Another case that can produce dynamic mappings, the use of the dynamic_template flag can configure a normal mapping as dynamic instead: https://github.com/elastic/package-spec/issues/171

jsoriano commented 2 years ago

Another case that can produce dynamic mappings, the use of the dynamic_template flag can configure a normal mapping as dynamic instead: elastic/package-spec#171

Ah no sorry, this feature was reverted (https://github.com/elastic/package-spec/pull/212, https://github.com/elastic/kibana/issues/102289#issuecomment-904640242).

joshdover commented 2 years ago

@jsoriano What might be helpful is to have the Beats source code to look at. Do you know where this lives?

joshdover commented 2 years ago

Changed the wording to be more clear that this is really a bug

joshdover commented 2 years ago

@elastic/integrations It would be helpful to understand the impact of this issue on the integrations listed above so that we can prioritize appropriately.

jsoriano commented 2 years ago

@jsoriano What might be helpful is to have the Beats source code to look at. Do you know where this lives?

Here it is: https://github.com/elastic/beats/blob/ea207346d651448b8917b0791b2b117b9f9b9212/libbeat/template/processor.go#L444

jsoriano commented 2 years ago

Regarding impact, we have to take into account that these fields will be used as if they had no mapping, what uses to be a source of problems.

I think we are not seeing more issues related to this because the most visible issue of this is mappings conflict, and looking to the default rules, this will be only problematic if:

Apart of the mapping conflicts, there may be more subtle consequences of not having proper mappings, at least:

andrewkroh commented 2 years ago

I think we need to go through the process to update the package-spec schema for fields.yml files. This way we have a clear definition of the expected behavior of the options. This will help both the developers of integrations and implementors of the specification in Fleet.

(I've started updating the specification to match what the code actually does in Fleet EPM today by looking at the typescript code. https://github.com/andrewkroh/package-spec/commit/91d651386ab56398e5c7872e26e7ebf360c70b41)

joshdover commented 2 years ago

(I've started updating the specification to match what the code actually does in Fleet EPM today by looking at the typescript code. andrewkroh/package-spec@91d6513)

Thank you, @andrewkroh. Agreed, we need to update the spec to match. Happy to have Fleet UI team help review once you have a PR.

jsoriano commented 2 years ago

(I've started updating the specification to match what the code actually does in Fleet EPM today by looking at the typescript code. andrewkroh/package-spec@91d6513)

Thanks @andrewkroh! Could you open a PR with this change so we can continue the discussion there?

In any case we need to support dynamic templates, and this brings the need of wildcards in names and supporting the object_type options.

andrewkroh commented 2 years ago

Here's that change as a PR to package-spec https://github.com/elastic/package-spec/pull/314. The goal there will be to sync up the spec to the current implementation. Then we can do a separate PR propose how dynamic mappings should work.

ritalwar commented 2 years ago

Hi, I am working on moving Cockroachdb integration to GA and in process of that I am getting error “field cannot be changed from type [float] to [long], dropping event!”. I found related SDH https://github.com/elastic/sdh-beats/issues/1809 which is similar to the issue am getting, Could you please confirm if this issue (dynamic template mapping) will resolve all such conversion errors and when is the fix expected?

jsoriano commented 2 years ago

Could you please confirm if this issue (dynamic template mapping) will resolve all such conversion errors and when is the fix expected?

Probably yes, CockroachDB package provides fields definitions (see here), but most of them are dynamic mappings, that are not being properly installed.

andresrc commented 2 years ago

@joshdover @jlind23 do we have a timeline for this issue?

jsoriano commented 2 years ago

how dynamic mappings should work.

Regarding this, I think they should work the same way as in Beats. There may be limitations, but there are also strong advantages:

jen-huang commented 2 years ago

@andresrc This is a Fleet issue that hasn't been bubbled up in priority yet. Is this blocking some packages going GA? (like Cockroach DB?)

andresrc commented 2 years ago

@jen-huang yes, the team is working on CockroachDB in this iteration, thanks

jen-huang commented 2 years ago

Bumped up to P1 per discussion with @akshay-saraswat as it is blocking CockroachDB integration from going GA.

jsoriano commented 2 years ago

Bumped up to P1 per discussion with @akshay-saraswat as it is blocking CockroachDB integration from going GA.

Thanks for prioritizing this!

Take into account that this is not only about CockroachDB. Grepping around integrations, many packages have at least one field with type: object (119), object_type (110) or a name with a wildcard (20). All of them should result on dynamic mappings, and they are not now, this can lead to unexpected mapping conflicts. I guess we are not seeing more issues caused by this because we are being lucky on the use we do on these fields and default automatic mappings are working well for us (https://github.com/elastic/kibana/issues/129344#issuecomment-1087647490), but it may happen anytime that a customer faces issues because of this on corner cases or new integrations.

crespocarlos commented 2 years ago

I've seen the same error in the Prometheus package and I've added a dynamic template to the data_stream manually to fix it. If I understood correctly, by fixing what's been discussed here, we wouldn't need to create the dynamic mapping at the package level, right?

tetianakravchenko commented 2 years ago

related to the question above: would it be safe to add a dynamic_templates to the data_stream manually, go to GA (prometheus integration is planned to go GA in 8.5) and remove dynamic_templates later? any timelines on the fix?

nchaulet commented 2 years ago

Adding the dynamic_templates to the datastreams will work for now, but it will create a little more work when we generate mapping and implement dynamic_templates for wildcard field names as we will have to merge two source of dynamic_templates (but I guess we will have to do so as some integrations are already using this)

any timelines on the fix

I am currently looking at this and it's seems not it's not a 1 for 1 change from the beats code to Fleet as we generate mappings a little differently, but I should be able to have a POC/draft PR early this. This change seems a little risky to me as it will impact a lot of package and we probably need to decide if it's a bug fix for 8.4 or an improvement in 8.5 cc @jen-huang

jen-huang commented 2 years ago

@nchaulet Thanks for investigating. Let's continue working on the fix for this and then assess the risk from there. My feeling is that we may want to delay it to 8.5...

nchaulet commented 2 years ago

@jsoriano When you say we should create a dynamic template for field with a name with a wildcard it is a new feature? I am not able to the code path that does this in beats

jsoriano commented 2 years ago

@jsoriano When you say we should create a dynamic template for field with a name with a wildcard it is a new feature? I am not able to the code path that does this in beats

@nchaulet oh, you are right, I didn't notice. It only happens to be that almost all fields with a wildcard in their name also have type: object in beats. TLDR; Yes, this would be a new feature, but probably it is worth to do it.

I say almost because I have found this one in Metricbeat, that effectively doesn't generate a dynamic mapping:

      - name: pod.preemption.victims.bucket.*
        type: long
        description: Pod preemption victims

In integrations repo I see that there are more fileds with wildcards but without type: object, for example https://github.com/elastic/integrations/blob/main/packages/awsfargate/data_stream/task_stats/fields/fields.yml#L28-L32

        - name: core.*.pct
          type: scaled_float

Such field definitions both in beats and fleet generate static mappings with * as property names, that I don't think is what the developer expects.

I think we should do something about this, so developers have what they probably expect. I see these options:

From these options I would prefer the first one, so we would have a better syntax for many of these use cases, and it would support the integrations that are already using it.

Btw, while looking at this I have found that Beats generates both, the dynamic mapping and an static mapping with the wildcard, this is probably unexpected, I will open another issue in Beats about that.