Closed joshdover closed 2 years ago
Pinging @elastic/fleet (Feature:EPM)
Pinging @elastic/fleet (Team:Fleet)
We should consider backporting a fix for this to 7.17.x depending on impact
Actually, looking to beats mappings, the field doesn't need to have a wildcard. A field with object_type
and without wildcards is assumed to have the wildcard at the end:
- name: labels
type: object
object_type: keyword
Is installed as:
{
"labels": {
"mapping": {
"type": "keyword"
},
"match_mapping_type": "string",
"path_match": "labels.*"
}
},
+1 to backport this to 7.17.
Another case that can produce dynamic mappings, the use of the dynamic_template
flag can configure a normal mapping as dynamic instead: https://github.com/elastic/package-spec/issues/171
Another case that can produce dynamic mappings, the use of the
dynamic_template
flag can configure a normal mapping as dynamic instead: elastic/package-spec#171
Ah no sorry, this feature was reverted (https://github.com/elastic/package-spec/pull/212, https://github.com/elastic/kibana/issues/102289#issuecomment-904640242).
@jsoriano What might be helpful is to have the Beats source code to look at. Do you know where this lives?
Changed the wording to be more clear that this is really a bug
@elastic/integrations It would be helpful to understand the impact of this issue on the integrations listed above so that we can prioritize appropriately.
@jsoriano What might be helpful is to have the Beats source code to look at. Do you know where this lives?
Regarding impact, we have to take into account that these fields will be used as if they had no mapping, what uses to be a source of problems.
I think we are not seeing more issues related to this because the most visible issue of this is mappings conflict, and looking to the default rules, this will be only problematic if:
prometheus
at this point (and cockroachdb
, that is heavily based on prometheus
).Apart of the mapping conflicts, there may be more subtle consequences of not having proper mappings, at least:
text
and as keyword
multi-field, what can increase storage size).I think we need to go through the process to update the package-spec
schema for fields.yml files. This way we have a clear definition of the expected behavior of the options. This will help both the developers of integrations and implementors of the specification in Fleet.
(I've started updating the specification to match what the code actually does in Fleet EPM today by looking at the typescript code. https://github.com/andrewkroh/package-spec/commit/91d651386ab56398e5c7872e26e7ebf360c70b41)
(I've started updating the specification to match what the code actually does in Fleet EPM today by looking at the typescript code. andrewkroh/package-spec@91d6513)
Thank you, @andrewkroh. Agreed, we need to update the spec to match. Happy to have Fleet UI team help review once you have a PR.
(I've started updating the specification to match what the code actually does in Fleet EPM today by looking at the typescript code. andrewkroh/package-spec@91d6513)
Thanks @andrewkroh! Could you open a PR with this change so we can continue the discussion there?
In any case we need to support dynamic templates, and this brings the need of wildcards in names and supporting the object_type
options.
Here's that change as a PR to package-spec https://github.com/elastic/package-spec/pull/314. The goal there will be to sync up the spec to the current implementation. Then we can do a separate PR propose how dynamic mappings should work.
Hi, I am working on moving Cockroachdb integration to GA and in process of that I am getting error “field cannot be changed from type [float] to [long], dropping event!”. I found related SDH https://github.com/elastic/sdh-beats/issues/1809 which is similar to the issue am getting, Could you please confirm if this issue (dynamic template mapping) will resolve all such conversion errors and when is the fix expected?
Could you please confirm if this issue (dynamic template mapping) will resolve all such conversion errors and when is the fix expected?
Probably yes, CockroachDB package provides fields definitions (see here), but most of them are dynamic mappings, that are not being properly installed.
@joshdover @jlind23 do we have a timeline for this issue?
how dynamic mappings should work.
Regarding this, I think they should work the same way as in Beats. There may be limitations, but there are also strong advantages:
elastic-package
and many integrations already define dynamic mappings this way, as the mentioned case of CockroachDB.@andresrc This is a Fleet issue that hasn't been bubbled up in priority yet. Is this blocking some packages going GA? (like Cockroach DB?)
@jen-huang yes, the team is working on CockroachDB in this iteration, thanks
Bumped up to P1 per discussion with @akshay-saraswat as it is blocking CockroachDB integration from going GA.
Bumped up to P1 per discussion with @akshay-saraswat as it is blocking CockroachDB integration from going GA.
Thanks for prioritizing this!
Take into account that this is not only about CockroachDB. Grepping around integrations, many packages have at least one field with type: object
(119), object_type
(110) or a name with a wildcard (20). All of them should result on dynamic mappings, and they are not now, this can lead to unexpected mapping conflicts.
I guess we are not seeing more issues caused by this because we are being lucky on the use we do on these fields and default automatic mappings are working well for us (https://github.com/elastic/kibana/issues/129344#issuecomment-1087647490), but it may happen anytime that a customer faces issues because of this on corner cases or new integrations.
I've seen the same error in the Prometheus package and I've added a dynamic template to the data_stream manually to fix it. If I understood correctly, by fixing what's been discussed here, we wouldn't need to create the dynamic mapping at the package level, right?
related to the question above: would it be safe to add a dynamic_templates
to the data_stream manually, go to GA (prometheus integration is planned to go GA in 8.5) and remove dynamic_templates
later?
any timelines on the fix?
Adding the dynamic_templates
to the datastreams will work for now, but it will create a little more work when we generate mapping and implement dynamic_templates
for wildcard field names as we will have to merge two source of dynamic_templates
(but I guess we will have to do so as some integrations are already using this)
any timelines on the fix
I am currently looking at this and it's seems not it's not a 1 for 1 change from the beats code to Fleet as we generate mappings a little differently, but I should be able to have a POC/draft PR early this. This change seems a little risky to me as it will impact a lot of package and we probably need to decide if it's a bug fix for 8.4 or an improvement in 8.5 cc @jen-huang
@nchaulet Thanks for investigating. Let's continue working on the fix for this and then assess the risk from there. My feeling is that we may want to delay it to 8.5...
@jsoriano When you say we should create a dynamic template for field with a name with a wildcard it is a new feature? I am not able to the code path that does this in beats
@jsoriano When you say we should create a dynamic template for field with a name with a wildcard it is a new feature? I am not able to the code path that does this in beats
@nchaulet oh, you are right, I didn't notice. It only happens to be that almost all fields with a wildcard in their name also have type: object
in beats. TLDR; Yes, this would be a new feature, but probably it is worth to do it.
I say almost because I have found this one in Metricbeat, that effectively doesn't generate a dynamic mapping:
- name: pod.preemption.victims.bucket.*
type: long
description: Pod preemption victims
In integrations repo I see that there are more fileds with wildcards but without type: object
, for example https://github.com/elastic/integrations/blob/main/packages/awsfargate/data_stream/task_stats/fields/fields.yml#L28-L32
- name: core.*.pct
type: scaled_float
Such field definitions both in beats and fleet generate static mappings with *
as property names, that I don't think is what the developer expects.
I think we should do something about this, so developers have what they probably expect. I see these options:
core.*.pct
field definition would be interpreted as if it had type: object
and object_type: scaled_float
, and would generate a dynamic mapping. This would be actually a new feature. It would support a more compact and intuitive syntax, and would support fields currently defined this way.type: object
. This would be to ignore such fields in Fleet, and to add semantic validation in the package-spec to reject these fields. This would be disrupting for current integrations with such fields as validation would start failing for them and manual changes would be required.From these options I would prefer the first one, so we would have a better syntax for many of these use cases, and it would support the integrations that are already using it.
Btw, while looking at this I have found that Beats generates both, the dynamic mapping and an static mapping with the wildcard, this is probably unexpected, I will open another issue in Beats about that.
Integration packages have support for configuring wildcard field names in data streams. These fields are supposed to have an associated
dynamic_template
added to their mappings, however today, Fleet does not generate mappings for these. AFAICT, we have never had support for this in Fleet EPM code.From @jsoriano:
Here's an example of how this works in Beats. A field defined like this:
Results in a dynamic template added to the mappings like this:
We should be able to follow the Beats implementation to make this match on Fleet: https://github.com/elastic/beats/blob/ea207346d651448b8917b0791b2b117b9f9b9212/libbeat/template/processor.go#L444
We have several integrations using this feature today, but it's likely broken for customers due to this problem. In my rudimentary search on the integrations repo, I found that this is affecting these integrations: