apache / pinot

Apache Pinot - A realtime distributed OLAP datastore
https://pinot.apache.org/
Apache License 2.0

Create a dashboard for most expensive queries per table #6378

Open chenboat opened 3 years ago

chenboat commented 3 years ago

In many production settings, a Pinot cluster handles thousands of queries per second. Expensive queries can greatly impact overall system performance. Right now, Pinot users have to rely on external mechanisms to pinpoint these costly queries: one has to export the broker query logs with query stats and then use tools like Kibana to filter and find those queries.

We propose adding query monitoring to the Pinot controller console. The query monitoring console should show the most expensive queries per table (ranked by query stats such as the number of docs/entries scanned) in the past N days. One can use the dashboard to:

  1. Find out the most expensive queries per table.
  2. Help to find missing index config for the corresponding table.
  3. (Nice to have) allow users to create the indexes to improve query performance.
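
As a rough illustration of the ranking such a dashboard would need, here is a minimal Python sketch that picks the top-N most expensive queries per table from a list of query stat records. The record layout (`table`, `query`, `timestampMs`) and the function name are assumptions made for this example; `numDocsScanned` is used as the cost metric, but any other stat could be substituted.

```python
import heapq
from collections import defaultdict


def top_expensive_queries(records, n=3, cost_key="numDocsScanned", since_ms=0):
    """Return {table: [top-n records by cost_key]} for records at or after since_ms.

    The record layout is hypothetical: each record is a dict with "table",
    "query", "timestampMs" and a numeric cost field such as "numDocsScanned".
    """
    by_table = defaultdict(list)
    for rec in records:
        # Keep only records inside the requested time window ("past N days").
        if rec["timestampMs"] >= since_ms:
            by_table[rec["table"]].append(rec)
    return {
        table: heapq.nlargest(n, recs, key=lambda r: r[cost_key])
        for table, recs in by_table.items()
    }


# Made-up sample data, for illustration only.
sample_records = [
    {"table": "orders", "query": "SELECT COUNT(*) FROM orders",
     "timestampMs": 2000, "numDocsScanned": 900},
    {"table": "orders", "query": "SELECT * FROM orders WHERE id = 1",
     "timestampMs": 2500, "numDocsScanned": 5},
    {"table": "orders", "query": "SELECT SUM(total) FROM orders",
     "timestampMs": 500, "numDocsScanned": 5000},
    {"table": "users", "query": "SELECT name FROM users",
     "timestampMs": 3000, "numDocsScanned": 40},
]

top = top_expensive_queries(sample_records, n=1, since_ms=1000)
```

With the sample data above, the `orders` record at `timestampMs` 500 falls outside the window, so the most expensive remaining `orders` query is the `COUNT(*)` one.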

@icefury71 @Jackie-Jiang

kishoreg commented 3 years ago

Why not write the query response back to Pinot? Have a logger that logs the query response to a Kafka topic and have that ingested into Pinot. We can build multiple visualizations from that.
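
A minimal sketch of the logging half of that idea, in Python: it flattens a few stats from a query response into a JSON event and hands the bytes to a `send` callable. The function names, the event layout, and the `send` callable (which stands in for a real Kafka producer) are assumptions for illustration, not Pinot's actual logger.

```python
import json
import time


def build_query_stats_event(table, query, response, now_ms=None):
    """Flatten selected stats from a query response dict into one event.

    The event layout is hypothetical; missing stats default to 0.
    """
    return {
        "table": table,
        "query": query,
        "timestampMs": now_ms if now_ms is not None else int(time.time() * 1000),
        "numDocsScanned": response.get("numDocsScanned", 0),
        "numEntriesScannedInFilter": response.get("numEntriesScannedInFilter", 0),
        "timeUsedMs": response.get("timeUsedMs", 0),
    }


def log_query_stats(event, send):
    """Serialize the event as UTF-8 JSON and pass it to `send`.

    `send` is a placeholder for whatever publishes bytes to a Kafka topic.
    """
    send(json.dumps(event, sort_keys=True).encode("utf-8"))


# Usage with an in-memory list standing in for the Kafka topic.
sent_messages = []
event = build_query_stats_event(
    "orders",
    "SELECT COUNT(*) FROM orders",
    {"numDocsScanned": 42, "timeUsedMs": 7},
    now_ms=1_700_000_000_000,
)
log_query_stats(event, sent_messages.append)
```

Once such events are ingested into a Pinot table, the expensive-query views could be ordinary queries against that table rather than a separate monitoring path.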

mcvsubbu commented 3 years ago

> Why not write the query response back to Pinot? Have a logger that logs the query response to a Kafka topic and have that ingested into Pinot. We can build multiple visualizations from that.

We do that at LinkedIn. See RequestStatistics. Feel free to add fields to it, but please do not rename the existing members :-)