elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

Create a default mapping template for metrics #72536

Open exekias opened 3 years ago

exekias commented 3 years ago

With dynamic templates in bulk requests (https://github.com/elastic/elasticsearch/pull/69948) merged, we are considering the option of creating a default template to allow for dynamic metrics ingestion. This would be leveraged by many Elastic Agent integrations, and is especially interesting for the ones reporting dynamic metrics (where we only learn the metric / field names at runtime and hence cannot define a mapping before ingestion).

The overall idea is to map all possible combinations of <metric_type> and <unit>, so each combination gets a unique predictable name that can be referenced at ingest time. For example:

{
  "mappings": {
    "dynamic_templates": [{
      "gauge_double_s": {
        "mapping": {
          "type": "double",
          "meta": {
            "metric_type": "gauge",
            "unit": "s"
          }
        }
      }
    },
    {
      "gauge_double_byte": {
        "mapping": {
          "type": "double",
          "meta": {
            "metric_type": "gauge",
            "unit": "byte"
          }
        }
      }
    },
    ...]
  }
}
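
At ingest time, a bulk request would then reference one of these predefined names via the dynamic_templates parameter introduced in https://github.com/elastic/elasticsearch/pull/69948. A minimal sketch (the field name system.process.uptime is just an example):

POST /_bulk
{ "create" : { "_index" : "my_index", "dynamic_templates": { "system.process.uptime": "gauge_double_s" } } }
{ "system": { "process": { "uptime": 123.4 } } }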

The obvious concern is: we would be creating a dynamic mapping entry for every combination of <metric_type> and <unit>, which will explode into many entries.

So the main question is: would this be considered good practice? Can it cause any issues due to an overly large mapping template? Perhaps we should do this in a different way?

elasticmachine commented 3 years ago

Pinging @elastic/es-search (Team:Search)

elasticmachine commented 3 years ago

Pinging @elastic/es-core-features (Team:Core/Features)

jpountz commented 3 years ago

@exekias I can imagine the number of dynamic templates growing very quickly with this approach, so I'd like to take some time to look into whether Elasticsearch can make it easier.

One idea I'm considering is giving clients the ability to override metadata as part of bulk requests similarly to how we allow selecting a dynamic template, something like this:

POST /_bulk
{ "create" : { "_index" : "my_index", "_id" : "1", "dynamic_templates": {"system.cpu.user.pct": "scaled_float_100"}, "meta": {"system.cpu.user.pct": { "unit": "percent", "metric_type": "gauge" }}} }
{ "system": { "cpu": { "user": { "pct": "22.7" }}}}

With the dynamic_templates option, we avoided giving the ability to pass arbitrary mappings, as otherwise users who only have the create_doc and auto_configure privileges could effectively put arbitrary mappings even though they don't have the manage privilege. Maybe it's ok to make an exception for the meta of mappings, given that it never affects how fields get indexed. But then we'd need to make sure it remains that way.

cc @csoulios given the ongoing discussion about leveraging meta to configure rollups
cc @tvernum do you have an opinion about the security implications?

tvernum commented 3 years ago

Because other stack components rely on meta being meaningful, I think there are some implications if we allow all users to set the metadata too freely. Or more accurately, I'm not confident that we can be sure there are no implications. It might be fine, but the system is sufficiently complex that it's hard to be sure.

I'd feel better if the definition of those dynamic templates had some way to indicate that they expect parameterized metadata, either via a new flag (accept_custom_meta) or by making use of template variables in some way.
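
For illustration, a template opting in to parameterized metadata via such a flag might look roughly like the following (accept_custom_meta is hypothetical, not an existing mapping parameter), with callers supplying the meta values in the bulk request as in the example above:

"scaled_float_100": {
  "accept_custom_meta": true,
  "mapping": {
    "type": "scaled_float",
    "scaling_factor": 100
  }
}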

axw commented 3 years ago

or by making use of template variables in some way.

I really like this idea. I imagine a request would look something like:

POST /_bulk
{ "create" : { "_index" : "my_index", "_id" : "1", "dynamic_templates": {"system.cpu.user.pct": {"name": "scaled_float", "params": {"scaling_factor": 100, "metric_type": "gauge", "unit": "percent"}}} } }
{ "system": { "cpu": { "user": { "pct": "22.7" }}}}

and the dynamic_template (but probably with some conditional rendering of meta):

"scaled_float": {
  "mapping": {
    "type": "scaled_float",
    "scaling_factor": {{scaling_factor}},
    "meta": {
      "metric_type": {{metric_type}},
      "unit": {{unit}}
    }
  }
}
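
For the bulk request above, the rendered mapping under this (hypothetical) parameterized-template scheme would presumably come out as:

"system.cpu.user.pct": {
  "type": "scaled_float",
  "scaling_factor": 100,
  "meta": {
    "metric_type": "gauge",
    "unit": "percent"
  }
}
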
jpountz commented 3 years ago

I'm downgrading to team-discuss for the Search area to discuss whether to move forward with @tvernum + @axw 's proposal of templating.

joegallo commented 1 year ago

I'm removing the team-discuss label from some older Team:Data Management issues -- we've had plenty of time to discuss them, but we haven't, so the label isn't serving its purpose. Feel free to delete this comment and/or re-add the team-discuss label.

elasticsearchmachine commented 1 year ago

Pinging @elastic/es-data-management (Team:Data Management)

felixbarny commented 8 months ago

When working on https://github.com/elastic/elasticsearch/pull/104037, this issue came up again. It is particularly relevant for dynamically mapping the unit for OTel metrics to Elasticsearch mapping metadata.

In addition to @axw's proposal (https://github.com/elastic/elasticsearch/issues/72536#issuecomment-831697480), I'd like to propose another alternative that is based on encoding the unit in the field path and then reading the unit out of the field name into the meta.unit field of the metric itself.

We could store metrics in a metrics.<type>.<unit>.* namespace, for example metrics.counter.ms.my_counter.

With a dynamic template like this, we could then read out the ms part and attach it to meta.unit for the counter:

{
  "mappings": {
    "dynamic_templates": [
      {
        "counter": {
          "path_match": "metrics.counter.*.*",
          "mapping": {
            "type": "{dynamic_type}",
            "time_series_metric": "counter",
            "meta": {
              "unit": "{parent.name}"
            }
          }
        }
      }
    ]
  }
}

This introduces a new {parent.name} template variable for dynamic templates, in addition to the existing {name} and {dynamic_type} variables. Like {name}, it reads out the name of a mapper, but in this case it is the name of the parent mapper, which holds the information about the unit.

To make sure that you can still just do aggregations on my_counter, we could map the parent object that encodes information about the unit (for example metrics.counter.ms) to the passthrough field type (see https://github.com/elastic/elasticsearch/pull/103648). Maybe we'd also want to introduce a way to somehow hide the physical name metrics.counter.ms.my_counter and only expose my_counter in the field caps API.

The dynamic mapping for the unit field could look like the following:

{
  "mappings": {
    "dynamic_templates": [
      {
        "units": {
          "path_match": "metrics.*.*",
          "path_unmatch": "metrics.*.*.*",
          "match_mapping_type": "object",
          "mapping": {
            "subobjects": false,
            "type": "passthrough"
          }
        }
      }
    ]
  }
}
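
To make this concrete: a document for a (hypothetical) metric my_counter would be ingested as

{ "metrics": { "counter": { "ms": { "my_counter": 123 } } } }

and, assuming the proposed {parent.name} variable works as described, the counter template above would produce a leaf mapping along the lines of

"my_counter": {
  "type": "long",
  "time_series_metric": "counter",
  "meta": {
    "unit": "ms"
  }
}

while the passthrough parent metrics.counter.ms would keep the metric addressable as plain my_counter in aggregations.
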
felixbarny commented 8 months ago

I had a chat about the two proposals with the @elastic/es-storage-engine team.

@kkrik-es felt that we should maybe defer support for dynamically specifying the unit for now and rather build a dedicated metrics intake API in Elasticsearch first, possibly OTLP. That is not the only reason a dedicated metrics API would make sense, but it would probably also make supporting units easier compared to building something that relies on _bulk.

We've also discussed whether changing the internal structure of how we store metrics may be considered a breaking change. For example, we might store metrics in metrics.<metric_name> or metrics.<metric_type>.<metric_name> now but change that to metrics.<metric_type>.<unit>.<metric_name> later on. Ideally, such changes would not constitute a breaking change, but as we're exposing the object structure in _source and via ingest pipelines, users may depend on the structure of the documents.

Maybe we shouldn't allow any features for OTel metrics that rely on the physical structure of the ES documents (such as ingest processing and exposing _source) and instead rely on OTLP as the stable interface. But that would imply a big shift from what we've done so far.

If we consider structural changes to be a breaking change, there's some urgency to get the mapping right soon.

axw commented 8 months ago

@kkrik-es felt that we should maybe defer support for dynamically specifying the unit for now and rather build a dedicated metrics intake API in Elasticsearch first, possibly OTLP. That is not the only reason a dedicated metrics API would make sense, but it would probably also make supporting units easier compared to building something that relies on _bulk.

++ If I were to choose, I'd just implement OTLP support. Otherwise we'll still need to adapt OTel metrics to whatever we build -- so unless there are some additional metric features that don't exist in OTel, I don't see the point.

Maybe we shouldn't allow any features for OTel metrics that rely on the physical structure of the ES documents (such as ingest processing and exposing _source) and instead rely on OTLP as the stable interface. But that would imply a big shift from what we've done so far.

++ A big shift indeed, and potentially far-reaching. e.g. it could also cover removing the need for _id by not exposing metric documents at all, and removing the ability to expose individual data points -- so in essence all searches over metrics/TSDS would be aggregations over time. That could allow things like automatic transformation of delta to cumulative metrics, and rolling up metrics at arbitrary points in time (e.g. at ingest, segment merge, or query time).

felixbarny commented 8 months ago

it could also cover removing the need for _id by not exposing metric documents at all, and removing the ability to expose individual data points -- so in essence all searches over metrics/TSDS would be aggregations over time.

I like that, but it seems it would be hard to migrate to because it breaks a lot of assumptions.

That could allow things like automatic transformation of delta to cumulative metrics, and rolling up metrics at arbitrary points in time (e.g. at ingest, segment merge, or query time).

Why can't we do that now and why does not exposing individual data points enable us to do that?

axw commented 8 months ago

Why can't we do that now and why does not exposing individual data points enable us to do that?

It comes down to user expectations: if you expect to be able to get out of ES exactly what you put in, then it precludes the kind of automatic transformations I'm thinking of. If on the other hand you are happy to record a data point in a time series, and only ever get back aggregations with some bounded granularity, then it's fine. It wouldn't really matter if the storage is cumulative, delta, rolled up at ingest time, etc., as it would be hidden from the user.

felixbarny commented 8 months ago

I agree that we should be able to do that, but I don't think we're blocked from doing it now. I'd argue that with rollups, we've kind of broken the glass already. Configuring rollups is currently a conscious choice made by the user. However, I don't think we'd be blocked from enabling rollups by default in integrations, for example.

felixbarny commented 6 months ago

Another idea that came to mind is that we could allow accessing field values of the document via template variables. We could then store the unit in a top-level unit field and access it like this in a dynamic template:

{
  "counter": {
    "mapping": {
      "type": "{dynamic_type}",
      "time_series_metric": "counter",
      "meta": {
        "unit": "{doc[unit]}"
      }
    }
  }
}
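
A document using this (hypothetical) {doc[unit]} variable would then carry the unit as a regular top-level field, for example

{ "unit": "ms", "my_counter": 42 }

and, assuming the counter template matches my_counter, its meta.unit would resolve to "ms" at dynamic-mapping time.
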
elasticsearchmachine commented 2 months ago

Pinging @elastic/es-search-foundations (Team:Search Foundations)