Closed by cmacknz 5 months ago
Pinging @elastic/elastic-agent (Team:Elastic-Agent)
@pchila thanks for your diligence on this issue. Would it be possible to have a benchmark on what the savings we could expect from this change?
cc: @pierrehilbert
Reopening this issue as the second part of the acceptance criteria isn't actually done yet AFAICT:
The data storage savings after removing this metricset are calculated and included in the release notes
Also related to @nimarezainia's question in the previous comment.
@cmacknz did a quick check on the data savings here on the PR https://github.com/elastic/elastic-agent/pull/4579#issuecomment-2060208711
I will re-run 2 versions of agent (with and without the change) and check the index size and document count
Thanks. Could you make a small PR to update https://github.com/elastic/elastic-agent/blob/fd7984b1d70dc968ba67fb8f4221905e508d6a06/changelog/fragments/1713257367-Remove-beat-state-metricset-from-elastic-agent-monitoring.yaml#L19 with these savings numbers?
@nimarezainia @ycombinator Re-measured index size difference between commit 1e88a9448f93499fea0e59672de9d6c80edc53c4 (commit just before the change) and commit 0d31445bfd5bdb108a5abf0b1cec4fe9fd3c3a1b (merge commit of the related PR) for a 10 min period after startup.
In both cases I used a policy that included the System Integration and agent logs and metrics collection.
Here are the sizes of the reindexed documents:
The document count for metrics-elastic_agent.filebeat-* and metrics-elastic_agent.metricbeat-* is down by 50% (as expected, since half of the metricsets were removed), with an on-disk size reduction of ~13% for both indices.
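For anyone re-evaluating this later, the comparison above can be sketched roughly as follows. This assumes the two runs were reindexed separately and that a response body from the Elasticsearch index stats API (`GET <index>/_stats`) is available for each; the helper names are hypothetical and the numbers are placeholders, not the measurements from this test.

```python
def primaries_summary(stats_response):
    """Return (doc_count, size_in_bytes) for primary shards from an index stats body."""
    primaries = stats_response["_all"]["primaries"]
    return primaries["docs"]["count"], primaries["store"]["size_in_bytes"]


def percent_reduction(before, after):
    """Percentage by which `after` is smaller than `before`."""
    if before <= 0:
        raise ValueError("baseline value must be positive")
    return (before - after) * 100.0 / before


def compare_runs(baseline_stats, candidate_stats):
    """Compare document count and on-disk size between two runs."""
    base_docs, base_size = primaries_summary(baseline_stats)
    cand_docs, cand_size = primaries_summary(candidate_stats)
    return {
        "docs_pct": percent_reduction(base_docs, cand_docs),
        "size_pct": percent_reduction(base_size, cand_size),
    }


# Placeholder stats bodies (made-up values, NOT the real test results).
baseline = {"_all": {"primaries": {"docs": {"count": 1200},
                                   "store": {"size_in_bytes": 4_000_000}}}}
candidate = {"_all": {"primaries": {"docs": {"count": 600},
                                    "store": {"size_in_bytes": 3_000_000}}}}

result = compare_runs(baseline, candidate)
print(result)
```

Only primary shards are counted here so that replica settings do not skew the size comparison.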
I am gonna put up a small PR with the changelog patching and link it to this issue
In that same PR, can you add something under the doc directory describing how to reproduce these test results?
@cmacknz I used a script that is part of PR #4633 for extracting and reindexing logs and metrics but it's not merged yet
Sure, doesn't matter when or how it gets documented then, as long as we have a way to remember what we did if we want to re-evaluate this again later.
Isn't the number of metrics produced dependent on the number of components running under agent, i.e. something like x documents per beat per interval? So the % savings depends on the number of deployed integrations/managed beats?
That is correct yes, more complex configurations will see greater savings. I assume @pchila likely tested this with the default system integration installed, I will comment on the changelog entry.
@strawgate @cmacknz I edited my comment to clarify which policy I used for the test. This is why I expressed the savings in %, as the absolute numbers will scale with the number of impacted indices
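A rough model of the scaling discussed above, as a sketch only: it assumes each metricset emits one document per Beat per collection interval, and the 60 s interval is an assumed value for illustration (the 10 minute window matches the test). The function name is hypothetical.

```python
def docs_emitted(beats, metricsets, duration_s, period_s):
    """Documents produced if every metricset emits one document per Beat per interval."""
    intervals = duration_s // period_s
    return beats * metricsets * intervals


DURATION_S = 600  # 10 minute window, as in the test
PERIOD_S = 60     # assumed collection interval, for illustration only

rows = []
for beats in (2, 5, 10):
    before = docs_emitted(beats, 2, DURATION_S, PERIOD_S)  # stats + state
    after = docs_emitted(beats, 1, DURATION_S, PERIOD_S)   # stats only
    saved = before - after
    rows.append((beats, before, after, saved, saved * 100.0 / before))

for beats, before, after, saved, pct in rows:
    print(f"beats={beats} before={before} after={after} saved={saved} ({pct:.0f}%)")
```

Under these assumptions the absolute number of documents saved grows linearly with the number of Beats, while the relative reduction in document count stays the same.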
Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)
Our agent monitoring implementation currently uses the beat Metricbeat module to monitor Beat subprocesses. We collect both the stats and state metricsets.
https://github.com/elastic/elastic-agent/blob/b39b9af521fcbf1fcae6bab14762b0a120febdb7/internal/pkg/agent/application/monitoring/v1_monitor.go#L617-L625
It seems to me that nothing actually uses the data from the state metricset. We don't map the fields in the Elastic Agent integration. I believe we can remove this metricset and stop pointlessly storing this data for every Beat process we start.
We currently store both the state and stats metricsets in the same datastream, and as such include the metricset name as a TSDB dimension, which could probably be removed after this change.
https://github.com/elastic/integrations/blob/a2c55c4cbf752e0490f9fe2d3e68698517c7b74d/packages/elastic_agent/data_stream/elastic_agent_metrics/fields/ecs.yml#L21-L23
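To illustrate why the dimension might become redundant, here is a simplified sketch that treats a time series as the set of its dimension values. The dimension names and component IDs are hypothetical stand-ins, not the actual mapping of the data stream.

```python
def series_count(docs, dimensions):
    """Count distinct time series, identifying each by its dimension values."""
    return len({tuple(doc[d] for d in dimensions) for doc in docs})


# Hypothetical component IDs and dimension names, for illustration only.
components = ["filebeat-default", "metricbeat-default"]

docs_before = [
    {"component.id": c, "metricset.name": m}
    for c in components
    for m in ("stats", "state")
]
docs_after = [d for d in docs_before if d["metricset.name"] == "stats"]

with_metricset = ("component.id", "metricset.name")
without_metricset = ("component.id",)

# With both metricsets, dropping the dimension would merge distinct series.
print(series_count(docs_before, with_metricset), series_count(docs_before, without_metricset))
# With only stats left, the dimension no longer distinguishes anything.
print(series_count(docs_after, with_metricset), series_count(docs_after, without_metricset))
```

In this model the metricset name only matters while two metricsets share the data stream; once only stats remains, the series count is identical with or without it.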
Acceptance Criteria: