elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

Better tooling/logs for troubleshooting long running CCS requests #73922

Open ppf2 opened 3 years ago

ppf2 commented 3 years ago

Troubleshooting long-running cross-cluster search (CCS) requests is challenging, especially in large environments with many downstream clusters distributed across different regions.

To isolate the problem, users currently have to extract the search request, run it against each downstream cluster separately and then against all clusters together, and compare the timings to see whether one or more clusters are slow. They also have to check monitoring stats or collect diagnostics for each downstream cluster while the query is executing to see if there's a bottleneck, etc. In the async search case, they have to test and compare using a regular search while toggling the minimize roundtrips setting (on/off) for CCS to see if there could be a network latency issue.
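The manual comparison described above could be scripted roughly as follows. This is only a sketch of the workaround, not a proposed feature: `send` is a hypothetical callable (for example, a thin wrapper around whatever HTTP client is in use) that takes a request path and a body and blocks until the response arrives, and the cluster aliases and index name are placeholders. It relies on the `cluster_alias:index` target syntax and the `ccs_minimize_roundtrips` query parameter of the search API.

```python
import time
from typing import Callable, Dict, List, Optional


def search_path(clusters: List[str], index: str,
                minimize_roundtrips: Optional[bool] = None) -> str:
    """Build a _search path that targets `index` on the given remote clusters."""
    target = ",".join(f"{alias}:{index}" for alias in clusters)
    path = f"/{target}/_search"
    if minimize_roundtrips is not None:
        flag = "true" if minimize_roundtrips else "false"
        path += f"?ccs_minimize_roundtrips={flag}"
    return path


def time_per_cluster(send: Callable[[str, dict], object],
                     clusters: List[str], index: str, body: dict,
                     clock: Callable[[], float] = time.perf_counter
                     ) -> Dict[str, float]:
    """Time the same query against each cluster alone, then all together.

    `send` is an assumed helper supplied by the caller; it is not part of
    any Elasticsearch client library.
    """
    runs = [(alias, [alias]) for alias in clusters]
    runs.append(("<all>", list(clusters)))
    timings: Dict[str, float] = {}
    for label, targets in runs:
        start = clock()
        send(search_path(targets, index), body)
        timings[label] = clock() - start  # client-side wall-clock seconds
    return timings
```

The numbers are client-side wall-clock times, so they include network latency to the coordinating node but do not say where inside a cluster the time went. Running the loop once with `minimize_roundtrips=True` and once with `False` (by extending `time_per_cluster` to pass the flag through) gives the on/off comparison mentioned above.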

The profile API has limitations and doesn't include things like network latency, send-back times, or the time to reduce the results on coordinating nodes. We have something called transport tracers, but the disclaimer seems to suggest that they are not appropriate for live production debugging. Also, the tracing output doesn't include the search request details, which makes it difficult to isolate and trace the performance of a specific query.

It would be helpful to have an API or debug loggers that provide details that will help users determine

elasticmachine commented 3 years ago

Pinging @elastic/es-search (Team:Search)

javanna commented 2 years ago

This one is linked with #21073 and #84369. Once we are able to break down the time spent processing a request into all the individual sub-tasks, we should think about how to connect tasks that run on separate clusters but are part of the same search execution. The same could be done for field_caps etc.

elasticsearchmachine commented 3 months ago

Pinging @elastic/es-search-foundations (Team:Search Foundations)