elastic / apm-server

https://www.elastic.co/guide/en/apm/guide/current/index.html

monitoring: apm-server only ships a subset of monitoring metrics #13475

Closed carsonip closed 1 month ago

carsonip commented 2 months ago

Discovered by #13244

EA managed apm-server is only shipping a subset of apm-server monitoring metrics. This limits observability of EA managed apm-server. The problem is spread over the following use cases:

Unknowns:

carsonip commented 1 month ago

Testing

Completed local testing of all three PRs: https://github.com/elastic/beats/pull/40127, https://github.com/elastic/elasticsearch/pull/110568, and https://github.com/elastic/integrations/pull/10414. Note that both https://github.com/elastic/elasticsearch/pull/110568 and https://github.com/elastic/integrations/pull/10414 require https://github.com/elastic/beats/pull/40127 for the output.elasticsearch.* metrics to be parsed.

apm-server self monitoring (indices .monitoring-beats-7-*)

before

image

after

image

metricbeat beat-xpack (DS .monitoring-beats-8-mb)

before

image

after

image

metricbeat beat standalone (DS metricbeat-*)

before

image

after

image

EA agent monitoring (DS metrics-elastic_agent-apm_server-*)

before

image

after

image

Long term solution

While the Python script in https://github.com/elastic/apm-server/pull/13638 generates the correct mapping for all the above use cases, this approach is not very maintainable. Ideally, setting dynamic: true, as metricbeat standalone does, would let apm-server monitoring fields be mapped dynamically and would require minimal maintenance. However, since the EA agent mapping already has TSDB enabled, disabling TSDB now and enabling dynamic mapping sounds like a step back.
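To illustrate why a script-generated explicit mapping needs upkeep, here is a minimal sketch of the general idea. It is not the script from https://github.com/elastic/apm-server/pull/13638. The function name `infer_mapping`, the sample document, and its leaf field names are all hypothetical; only the `output.elasticsearch` prefix comes from this thread.

```python
from typing import Any


def infer_mapping(sample: dict[str, Any]) -> dict[str, Any]:
    """Build an explicit Elasticsearch-style `properties` tree from a nested sample.

    Hypothetical helper for illustration only. Every field must appear in the
    sample, so any newly added metric requires regenerating the mapping.
    """
    properties: dict[str, Any] = {}
    for key, value in sample.items():
        if isinstance(value, dict):
            # Nested object: recurse and wrap in its own `properties` block.
            properties[key] = {"properties": infer_mapping(value)}
        elif isinstance(value, bool):
            # bool must be checked before int, since bool is a subclass of int.
            properties[key] = {"type": "boolean"}
        elif isinstance(value, int):
            properties[key] = {"type": "long"}
        elif isinstance(value, float):
            properties[key] = {"type": "double"}
        else:
            properties[key] = {"type": "keyword"}
    return properties


# Hypothetical sample of monitoring metrics; leaf names are made up.
SAMPLE_METRICS = {
    "output": {
        "elasticsearch": {
            "example_counter": 12,
            "example_ratio": 0.5,
        }
    }
}

mapping = {"properties": infer_mapping(SAMPLE_METRICS)}
```

With dynamic: true, a tree like this would not need to be maintained by hand or by script, which is the trade-off weighed against TSDB above.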

lahsivjar commented 1 month ago

Tested on 8.15 BC2; (UPDATE) follow-up testing with 8.15 BC3:

carsonip commented 1 month ago

Moved all remaining tasks to #13731. Closing this issue, as the bug of missing metrics is now fixed.