Open webmat opened 5 years ago
cc @MikePaquette @MarkSettleES @ruflin
@jasontedor @yaronp68 per the initial comment above, we are looking for a way to measure adoption of Elastic Common Schema (ECS) using our telemetry.
Mat's initial idea is to search all user index mappings for the presence of an ECS-defined mandatory field, called `ecs.version`, and report back on the number of indices whose mappings contain this field.
Additional aggregated statistics about the presence and values of this field in documents in user indices would also be interesting to keep, if possible and within our set of currently acceptable sources of telemetry information.
We would welcome your general thoughts about the feasibility and approach of this idea. Thanks!
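To make the mapping-based idea concrete, here is a minimal sketch of the counting step. It assumes the input has the shape of a typeless 7.x `GET /_mapping` response (index name, then `mappings`, then nested `properties`). The function names, the output keys, and the sample index names are hypothetical and not part of any existing telemetry code. Only counts are returned, never index names.

```python
from typing import Any, Dict


def has_field(mapping: Dict[str, Any], dotted_path: str) -> bool:
    """Return True if dotted_path (e.g. "ecs.version") is defined in a mapping.

    Walks the nested "properties" objects one path segment at a time.
    """
    node = mapping
    for part in dotted_path.split("."):
        props = node.get("properties")
        if not isinstance(props, dict) or part not in props:
            return False
        node = props[part]
    return True


def count_indices_with_field(
    mappings_response: Dict[str, Any], field: str = "ecs.version"
) -> Dict[str, int]:
    """Count indices whose mapping defines `field`, next to the overall count."""
    with_field = sum(
        1
        for body in mappings_response.values()
        if has_field(body.get("mappings", {}), field)
    )
    return {"indices_total": len(mappings_response), "indices_with_field": with_field}


# Made-up sample in the assumed response shape.
example = {
    "filebeat-7.0.0": {
        "mappings": {
            "properties": {"ecs": {"properties": {"version": {"type": "keyword"}}}}
        }
    },
    "custom-logs": {"mappings": {"properties": {"message": {"type": "text"}}}},
    "empty-index": {"mappings": {}},
}
print(count_indices_with_field(example))
# prints {'indices_total': 3, 'indices_with_field': 1}
```

This only inspects mappings, so it says nothing about whether documents actually populate the field.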
Is the field `_meta.ecs.version` or `_meta.version`? Am I misreading the template, which appears to show `_meta.version`?
I see, thanks for clarifying. We can indeed extract whether or not that field is defined in the mapping. There will be a small number of false positives but I think we can all accept this? I am concerned about core Elasticsearch extracting something it doesn’t know about or have control over, but I have some approaches to explore there I might be okay with. I am also quite concerned about running queries/aggregations (especially cardinality) in telemetry calls. What magnitude of events are we talking about here?
All 7.0 Beats ship with this data, so the base scale is 7.x Beats deployments plus indices with ECS data created by the users. We have ILM on by default in 7.0, but in some cases we will still be talking about large datasets with many indices. For the Beats indices we can read it out of the template, but that is not necessarily the case for user-provided data.
@jasontedor My current assumption is that all the logic / queries would be in Kibana so Elasticsearch would not require any specific knowledge about it.
If we're not already doing aggregations on big indices in Telemetry, perhaps we shouldn't start doing so now. I expect (hope) most of our customers' biggest indices in monitoring/security use cases to become ECS indices eventually.
I expect the results cardinality to be very low, however. There should be one or two aggregation results on ecs.version in most cases. Likely one specific version and possibly events without a version, if there are bugs. At least for Beats, which creates different indices for every version.
This could perhaps go to 3-5 results in the case of third party or custom solutions using ECS, if they approach things differently.
Not sure if this changes anything.
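For reference, the aggregation I have in mind would look roughly like the sketch below: a `terms` aggregation on `ecs.version` with the `missing` parameter, so documents without the field land in their own bucket. The helper names, the `__missing__` label, the bucket limit, and the sample response are all made up for illustration. Whether running this in a telemetry call is acceptable is exactly the open question in this thread.

```python
from typing import Any, Dict


def build_ecs_version_agg(
    field: str = "ecs.version",
    missing_label: str = "__missing__",  # hypothetical label for docs without the field
    max_versions: int = 10,  # hypothetical cap; a handful of buckets is expected
) -> Dict[str, Any]:
    """Build a search body that returns only aggregation buckets, no hits."""
    return {
        "size": 0,
        "aggs": {
            "ecs_versions": {
                "terms": {
                    "field": field,
                    "missing": missing_label,
                    "size": max_versions,
                }
            }
        },
    }


def summarize_buckets(
    response: Dict[str, Any], agg_name: str = "ecs_versions"
) -> Dict[str, int]:
    """Flatten the terms buckets of a search response into {version: doc_count}."""
    buckets = response.get("aggregations", {}).get(agg_name, {}).get("buckets", [])
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


# Made-up response fragment, only to show the expected shape.
fake_response = {
    "aggregations": {
        "ecs_versions": {
            "buckets": [
                {"key": "1.0.0", "doc_count": 1200},
                {"key": "__missing__", "doc_count": 3},
            ]
        }
    }
}
print(summarize_buckets(fake_response))
# prints {'1.0.0': 1200, '__missing__': 3}
```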
Thanks for clarifying @ruflin. I was brought to this thread because there was an ask to do this in Elasticsearch. I’m fine with it being in Kibana, and prefer it that way. I still have concerns about running queries/aggregations in Telemetry calls.
Describe the feature:
The ECS team would love to be able to have telemetry on ECS usage.
All ECS-compatible events must have the field `ecs.version`. So our first idea would be to base this telemetry on the presence of the field in mappings.
I'm assuming we don't collect index names for privacy reasons. Given that premise, the ideal implementation of this telemetry would be:
- the count of indices whose mappings contain `ecs.version` vs the overall count of indices
- an aggregation on `ecs.version`, including the documents where `ecs.version` is missing, of course.
Describe a specific use case for the feature:
This is the most direct KPI to track the adoption of ECS.