Closed Bamieh closed 1 year ago
Pinging @elastic/kibana-stack-services
Pinging @elastic/kibana-gis
After discussing this with @aaronjcaldwell @pickypg we agreed to restructure emsVectorLayersCount
and layerTypesCount
into an array instead of dynamic fields.
"emsVectorLayersCount": {
"properties" {
"key": { "type": "keyword"},
"min": { type: "long" },
"max": { type: "long" },
"avg": { type: "long" },
}
},
"layerTypesCount": {
"properties" {
"key": { "type": "keyword"},
"min": { type: "long" },
"max": { type: "long" },
"avg": { type: "long" },
}
}
We avoid dynamic fields in the telemetry mapping. Additionally frequently adding / changing fieldnames in ES is an anti-pattern.
We'd be using these fields to check overall usage, which specific layers being adopted and which layers are getting used the most.
To achieve a more granular aggregation we'll have to reindex the data, but usually the basic use-cases are enough. We can also automate the reindexing if that is needed to capture some of those aggregations.
The largest benefit is that we can add new layers and types of layers and there’s absolutely no work that needs to get done by anyone to accept it.
An action item for this is to refactor to the above data structure properties: { key, max, min, avg}
for the two dynamic telemetry fields the next time we make updates.
I just ran into this when trying to get some telemetry on our EMS usage 😄 . Any idea when we might be able to fix this? 7.3 doesn't seem to be possible at this point (but correct me if I'm wrong), what about 7.4?
Restructuring this on the code side shouldn't be too difficult. I'm happy to coordinate my work with the telemetry team's whenever makes sense!
@aaronjcaldwell please coordinate with @Bamieh for whatever you need re: telemetry.
Pinging @elastic/pulse (Team:Pulse)
Meeting summary about solving the dynamic: true
fields issues for maps (@aaronjcaldwell @afharo @TinaHeiligers )
We will aim to find a root solution to this issue with pulse. For now we dont have a go to solution for fields that have variables provided by the users.
Since the custom layers can be added by the users we dont have a pre-defined set of all the fields to map them.
Allowing dynamic fields of any kind in telemetry is not possible.
The current proposal is to implement a 3-step strategy of discovery, exploration and mapping.
This will be used strictly for discovering the layers provided by the users.
layer_key
is a combination of ${name}__${type}
.
{
"mappings": {
"dynamic": false,
"properties": {
"maps": {
"properties": {
"layers": {
"properties": {
"layer_key": {
"type": "keyword"
},
"layer_name": {
"type": "keyword"
},
"layer_type": {
"type": "keyword"
}
}
}
}
}
}
}
}
layer_key
, layer_name
, or layer_type
to discover the layers are provided by all our users.{
"size": 0,
"aggs": {
"layers": {
"terms": {
"field": "maps.layers.layer_key"
}
}
}
}
The Maps team will need to use this data to decide which user-defined layer types and names to gather telemetry from and request an update to the mapping through the pulse team.
layerTypesCount
and emsVectorLayersCount
. This way we can control the fields that are mapped, while still keeping the dynamic: false
property of telemetry.{
"mappings": {
"dynamic": false,
"properties": {
"maps": {
"properties": {
"layers": {
"properties": {
"layer_key": {
"type": "keyword"
},
"layer_name": {
"type": "keyword"
},
"layer_type": {
"type": "keyword"
}
}
}
}
}
}
}
}
Mapping these new fields requires the requesting team (in this case, the Maps team) to open an issue in the telemetry repo for the pulse team to handle. Although this will probably require some back and forth to map new layers to analyze them, it is the only plausible solution we could come up with for now.
It is worth exploring adding an analyzer for the layer_key
field to be able to search on the layer_name, or type while still maintaining the relationship between the two since non-nested arrays cannot be queries in that manner.
We have played around with using dynamic templates. The approach is great as it allows us to map fields that can follow a strict pattern. This does not work in the maps situation since users can be sending arbitrary layer names, hence re-creating the main issue why dynamic mapping is set to false
in the first place.
{
"mappings": {
"dynamic": false,
"properties": {
"maps": {
"type": "object",
"dynamic": true
}
},
"dynamic_templates": [
{
"mapsFields": {
"path_match": "maps.count_*",
"mapping": {
"key": {
"type": "keyword"
},
"min": {
"type": "long"
},
"max": {
"type": "long"
},
"avg": {
"type": "long"
}
}
}
}
}
}
]
}
@Bamieh do you think we can close this issue now that we have the schema
definitions?
Maybe @mindbat wants to provide some input regarding this piece of telemetry?
Pinging @elastic/kibana-core (Team:Core)
@afharo Perhaps with the push for moving to Telemetry-Next, we can re-open the discussion of dynamic fields for the Maps team, via an issue on the Infra repo?
@alexfrancoeur What do you think?
@mindbat sure! My point is that the current schema regarding maps provides the DYNAMIC_KEY
keyword: https://github.com/elastic/kibana/blob/master/x-pack/plugins/maps/server/maps_telemetry/collectors/register.ts#L44-L50
If that's good enough for Telemetry-Next, I think we can close this issue.
Before closing, we probably want the Maps team to weigh in. While I'm fortunate to have a history with this amazing team, I'm no longer part of it. @kmartastic @thomasneirynck thoughts on the Telemetry thread here?
I'm not quite sure I understand what's being discussed. Actually, I'm quite sure I don't.
What is Telemetry-Next? Is that a project? schema? Is there a solution for working with dynamic fields within Telemetry-Next? We have a growing need for capturing this data as we are adding new layer types and want to track usage.
@alexfrancoeur @thomasneirynck @afharo
Hi @kmartastic,
Telemetry-Next is a new take to the way we store the data in the Remote Telemetry Service. Allowing dynamic keys, or even array structures, if preferred. For more details, reach out on the #telemetry-next Slack channel.
The schema
is how Kibana defines the "contract" of the usage reported by each collector so the Remote Telemetry Service knows what to expect and what's changed in each version.
Re this specific issue, Maps have 2 fields that are objects where the keys cannot be known beforehand. For now, we've worked around them when defining their schema by setting the key name DYNAMIC_KEY
. My question is whether we think that's enough for this use case or if we want to work further into a better structure/definition.
@alexfrancoeur can we close this issue?
Pinging @elastic/kibana-presentation (Team:Presentation)
I'll go ahead and close this issue. From the last few comments, it seems that both teams are asking each other if we could close it 😅
Feel free to reopen if you think that any additional work is needed.
Hello maps team :wave:
I was updating the telemetry index mapping yesterday for maps, the telemetry sent for maps has 2 dynamic fields. We do not accept dynamic fields so these fields will not be used by telemetry
I was hoping to understand what gets reported under those fields so we can map them differently to avoid dynamic fields.
Context: https://github.com/elastic/telemetry/pull/108