elastic / apm

Elastic Application Performance Monitoring - resources and general issue tracking for Elastic APM.
https://www.elastic.co/apm
Apache License 2.0

Separate internal from custom application metricsets #647

Open axw opened 2 years ago

axw commented 2 years ago

In 8.3 APM Server will begin storing well-defined metrics, like system.memory.total and jvm.memory.heap.used in metrics-apm.internal-<namespace> (https://github.com/elastic/apm-server/issues/7520). Previously, these metrics would be stored in metrics-apm.app.<service.name>-<namespace>, which led to memory issues in Elasticsearch at scale.
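For example (hypothetical service name, default namespace):

```
jvm.memory.heap.used, pre-8.3:  metrics-apm.app.frontend-default   (one data stream per service)
jvm.memory.heap.used, 8.3+:     metrics-apm.internal-default       (shared across services)
```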

The internal metrics data stream uses strict mapping, whereas the service-specific metric data streams rely on dynamic mapping. Thus, we must ensure no unknown metrics are indexed into the internal metrics data stream.

To make all this happen, APM Server now has a hard-coded list of known metrics: https://github.com/elastic/apm-server/tree/main/apmpackage/apm/data_stream/internal_metrics/fields. When APM Server receives a metricset, it looks at all the metrics within it: if they are all well-defined, the metricset is sent to the internal data stream; otherwise it is sent to a service-specific data stream. Thus, if an agent begins sending a new metric, APM Server will begin sending the entire metricset to a service-specific data stream.
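To illustrate the routing decision, here is a minimal Go sketch; the names (knownMetrics, dataStreamFor) are hypothetical and the real apm-server code differs in detail:

```go
package main

import "fmt"

// knownMetrics stands in for the hard-coded field definitions shipped in
// apmpackage/apm/data_stream/internal_metrics/fields (illustrative subset).
var knownMetrics = map[string]bool{
	"system.memory.total":  true,
	"jvm.memory.heap.used": true,
	// ...
}

// dataStreamFor sketches the 8.3 behaviour: a metricset goes to the internal
// data stream only if every metric in it is well-defined.
func dataStreamFor(metricNames []string, serviceName, namespace string) string {
	for _, name := range metricNames {
		if !knownMetrics[name] {
			// A single unknown metric sends the whole metricset to the
			// service-specific data stream.
			return fmt.Sprintf("metrics-apm.app.%s-%s", serviceName, namespace)
		}
	}
	return fmt.Sprintf("metrics-apm.internal-%s", namespace)
}

func main() {
	fmt.Println(dataStreamFor([]string{"jvm.memory.heap.used"}, "frontend", "default"))
	// Adding one unknown metric reroutes everything:
	fmt.Println(dataStreamFor([]string{"jvm.memory.heap.used", "foo.bar.baz"}, "frontend", "default"))
}
```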

I propose that we instead give agents the ability to inform APM Server that a metricset is intended to contain only "internal" metrics, i.e. metrics defined by the agent and not by the application. APM Server would then discard any unknown metrics from internal metricsets, and send the remainder to the internal metrics data stream. This has been POC'd in https://github.com/elastic/apm-server/pull/8064. Without agent changes, the behaviour would be the same as in 8.3.
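Continuing the sketch above (and reusing its hypothetical knownMetrics), the proposed handling of a metricset an agent marks as internal might look roughly like this; the flag name and exact semantics are whatever the POC settles on:

```go
// keepKnown sketches the proposed handling of an agent-flagged internal
// metricset: unknown metrics are dropped instead of rerouting the whole set,
// and the remainder is indexed into metrics-apm.internal-<namespace>.
func keepKnown(metrics map[string]float64) (kept map[string]float64, dropped []string) {
	kept = make(map[string]float64, len(metrics))
	for name, value := range metrics {
		if knownMetrics[name] {
			kept[name] = value
		} else {
			dropped = append(dropped, name) // candidates for logging (see the discussion below)
		}
	}
	return kept, dropped
}
```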

8.3

Proposal

axw commented 2 years ago

@felixbarny @Mpdreamz when you can, please take a look and let me know what you think.

felixbarny commented 2 years ago

I think this makes sense but I want to highlight a few challenges with this:

Mpdreamz commented 2 years ago

The premise of this change is that having more granular data streams, such as service-specific ones, is bad. This is in contrast to the recommendations we got from the Elasticsearch team, who said that as long as the number of data streams is in the low thousands, it should be fine given the recent optimizations.

While this is true, the general guideline to put data of the same type in the same data stream still applies. service.name having a cardinality in the range of 1-2000 does not help, but we don't have an overarching grouping concept.

If an agent sends an internal: true metricset with system.memory.total and foo.bar.baz, then APM Server will discard foo.bar.baz and index the metricset with just system.memory.total into metrics-apm.internal-<namespace>.

I'm not a big fan of silently dropping data; can we include the dropped fields as keyword[] in the resulting metric document?

Rather than the agent controlling this through a special flag, can we assume intent and treat a metricset as internal if the agent only sends system.* and one of the (go|jvm|ruby|...).* fields?

E.g. in Java we only want to create a service-specific metric data stream if the application exposes custom metrics through e.g. Micrometer; these do not live under any of the internal metric field prefixes.
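A rough Go sketch of that prefix heuristic (hypothetical; the prefix list is illustrative, and the objection below explains why it is not reliable):

```go
package main

import (
	"fmt"
	"regexp"
)

// internalPrefix approximates "only system.* plus runtime-specific metrics";
// the prefix list here is illustrative, not exhaustive.
var internalPrefix = regexp.MustCompile(`^(system|go|jvm|ruby)\.`)

// looksInternal returns true if every metric in the set matches one of the
// agent-owned prefixes, i.e. no application metrics (e.g. Micrometer) are present.
func looksInternal(metricNames []string) bool {
	for _, name := range metricNames {
		if !internalPrefix.MatchString(name) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(looksInternal([]string{"system.memory.total", "jvm.memory.heap.used"})) // true
	fmt.Println(looksInternal([]string{"jvm.memory.heap.used", "orders.processed"}))    // false
}
```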

axw commented 2 years ago

We only stop creating service-specific metric data streams if folks update their agents.

@felixbarny This shouldn't be the case. In 8.3, APM Server will have hard-coded the known agent-produced metrics. If an agent doesn't specify internal (e.g. because it's an older agent), then APM Server will look through all the metrics for a metricset and consider it internal if all the metrics are known.

The premise of this change is that having more granular data streams, such as service-specific ones, is bad. This is in contrast to the recommendations we got from the Elasticsearch team, who said that as long as the number of data streams is in the low thousands, it should be fine given the recent optimizations.

I really would have liked to stick with service-specific data streams, but we've had several reports of high heap usage in Elasticsearch for clusters with hundreds of services, with Elasticsearch as recent as 8.2. I don't think we should give up on this. We can come back to it in a future release, and do our own scale testing.

I'm not a big fan of silently dropping data; can we include the dropped fields as keyword[] in the resulting metric document?

@Mpdreamz why should we treat unknown metrics any differently to other unknown fields sent by agents? Is this just for observability? Would logging the dropped metrics satisfy the same requirement?

Rather than the agent controlling this through a special flag, can we assume intent and treat a metricset as internal if the agent only sends system.* and one of the (go|jvm|ruby|...).* fields?

E.g. in Java we only want to create a service-specific metric data stream if the application exposes custom metrics through e.g. Micrometer; these do not live under any of the internal metric field prefixes.

The issue I see with this is that system, go, jvm, etc. are all rather generic prefixes, and likely to show up in other metrics frameworks. For example, OpenTelemetry has metrics prefixed with both system. and jvm., but they shouldn't (currently) be considered "internal".

The other problem is that some agents group metrics by labels (dimensions), such that both custom and builtin metrics with the same labels (no labels) end up in the same metricset. The Go agent does this. We could stop doing that, but forcing the agent to also group by internal/custom makes it more explicit.
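As a sketch of what that extra grouping dimension could look like in an agent (hypothetical names, not the Go agent's actual types):

```go
// groupKey is a hypothetical key for accumulating metrics into metricsets.
// Today the Go agent effectively groups by labels only, so builtin and custom
// metrics with the same (often empty) label set share one metricset; adding an
// internal/custom dimension keeps them in separate metricsets.
type groupKey struct {
	labels   string // canonicalised label key/value pairs
	internal bool   // agent-defined ("internal") vs. application-defined metric
}
```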

Mpdreamz commented 2 years ago

@Mpdreamz why should we treat unknown metrics any differently to other unknown fields sent by agents? Is this just for observability? Would logging the dropped metrics satisfy the same requirement?

Our leniency in the intake API layer is mostly there to ensure that a newer agent sending data to an older apm-server does not break.

Here the APM Server decides to ignore a custom application metric, even when agent and server are on the same version, if an agent misclassifies a metric document as internal.

A log is always good; however, if we store the dropped metric names we can actively query for misconfigurations and surface this.

The issue I see with this is that system, go, jvm, etc. are all rather generic prefixes, and likely to show up in other metrics frameworks. For example, OpenTelemetry has metrics prefixed with both system. and jvm., but they shouldn't (currently) be considered "internal".

Thanks for highlighting that. I am uneasy with the producer doing the classification of the metric as internal or not. Seems like a server concern bleeding into the agents. It sounds like we simply can not make this distinction on the server today?

@felixbarny This shouldn't be the case. In 8.3, APM Server will have hard-coded the known agent-produced metrics. If an agent doesn't specify internal (e.g. because it's an older agent), then APM Server will look through all the metrics for a metricset and consider it internal if all the metrics are known.

While this helps, it does require the agent and server to be in sync: if an agent is updated ahead of the server, we'll create short-lived service-specific data streams until apm-server is updated too.

So I think I just argued myself into the agents having to do this classification as initially suggested :smile_cat:

axw commented 2 years ago

Our leniency in the intake API layer is mostly there to ensure that a newer agent sending data to an older apm-server does not break.

Here the APM Server decides to ignore a custom application metric, even when agent and server are on the same version, if an agent misclassifies a metric document as internal.

Agents mislabelling custom application metrics as internal seems like quite an edge case, so I'm still not convinced it would be worth storing them in Elasticsearch.

A log is always good; however, if we store the dropped metric names we can actively query for misconfigurations and surface this.

That's a fair point. Can we defer storing them until we actually plan to query and display them in the UI, and just log in the meantime?

simitt commented 2 years ago

The apm agents are collecting the metrics, so IMO it fits very well that the apm agents, rather than the server, make the distinction of whether a metricset contains custom metrics or not. With that in mind, the server should not have to maintain a static list of internal metrics. The reason for maintaining this list in the server is to be able to adhere to the strict mapping. I'd like to raise the question again of whether it is worth having a strict mapping on internal metrics.

I propose a slightly changed solution: APM agents send a flag per metricset and the apm-server sends the metricset to the internal or app-specific data stream. The metrics-apm.internal-<namespace> data stream defines a mapping for the metrics known at the time of the release, but uses dynamic mapping for forwards compatibility. To avoid the chance of introducing mapping conflicts on the internal data stream, apm agents must implement the same definition of internal metrics and should fetch this from a central place. With this change, customers would not be required to upgrade their APM Server (and therefore also ES & Kibana) when they want to consume newly introduced internal metrics. For backwards compatibility with older agents, the apm-server would of course need to keep the current logic, but the internal list would not be maintained and updated, and would eventually be phased out.

Bonus: The proposal still doesn't solve the issue that service-specific data streams are introduced as soon as any non-internal metrics are collected, so I'll extend it a bit. Instead of the internal flag, agents could send a service.group that would define the data stream to which the apm-server sends the metrics, metrics-apm.app.<service.group>-<namespace>. The service.group would be set to service.name by default, but eventually it would be configurable, allowing users to define which metricsets fit together in a data stream. Using something generic like service.group could allow extending this idea to traces-* and logs-* in the future. It also holds the risk of introducing something before having a full concept of how events should be grouped together in the future, so it might also be a bit premature.
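For illustration (hypothetical service and group names, default namespace):

```
service.name=billing-api,    service.group=billing  ->  metrics-apm.app.billing-default
service.name=billing-worker, service.group=billing  ->  metrics-apm.app.billing-default
service.name=frontend,       service.group unset    ->  metrics-apm.app.frontend-default  (defaults to service.name)
```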