grafana / mimir

Grafana Mimir provides horizontally scalable, highly available, multi-tenant, long-term storage for Prometheus.
https://grafana.com/oss/mimir/
GNU Affero General Public License v3.0

Document how detailed query performance related metrics can be accessed via logs #1839

Open YourTechBud opened 2 years ago

YourTechBud commented 2 years ago

Is your feature request related to a problem? Please describe.

In a multi-tenant scenario, it's quite possible for a single tenant to hog all the compute resources available for reads. There is no way to find out which queries are being fired or how many samples are being processed for each tenant.

Describe the solution you'd like

All query performance related metrics should be exported per tenant.

Describe alternatives you've considered

None actually.

Additional context

We have a very small setup for the time being (monolithic) and it isn't possible to just increase the compute resources available.

56quarters commented 2 years ago

I believe this is already possible with the "query stats" feature which is enabled by default. When enabled, the query-frontend logs a message for each query run with things like time taken for the query, number of series loaded, number of bytes fetched, etc. Can you take a look and see if this works for you?

https://grafana.com/docs/mimir/latest/operators-guide/configuring/reference-configuration-parameters/#frontend

Example output:

level=info ts=2022-05-10T13:25:14.433589943Z caller=handler.go:224 user=tenant-12345 traceID=0aab975889761ef5 msg="query stats" component=query-frontend method=GET path=/prometheus/api/v1/query response_time=5.78682ms query_wall_time_seconds=0.003301319 fetched_series_count=12 fetched_chunk_bytes=12345 fetched_chunks_count=1234 sharded_queries=8 param_end=1652189100 param_query="sum(some_metric) by (environment)" param_start=1652187300
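
To address the original per-tenant question, lines like the one above could be aggregated by the `user` field. The following is only a rough sketch, not part of Mimir: the function names `parse_logfmt` and `aggregate_by_tenant` are made up for illustration, the field names are taken from the example output above, and it assumes each log line is well-formed logfmt that Python's `shlex` can tokenize.

```python
import shlex
from collections import defaultdict

# Sample "query stats" line, copied from the example output above.
SAMPLE_LINE = (
    'level=info ts=2022-05-10T13:25:14.433589943Z caller=handler.go:224 '
    'user=tenant-12345 traceID=0aab975889761ef5 msg="query stats" '
    'component=query-frontend method=GET path=/prometheus/api/v1/query '
    'response_time=5.78682ms query_wall_time_seconds=0.003301319 '
    'fetched_series_count=12 fetched_chunk_bytes=12345 fetched_chunks_count=1234 '
    'sharded_queries=8 param_end=1652189100 '
    'param_query="sum(some_metric) by (environment)" param_start=1652187300'
)


def parse_logfmt(line):
    """Parse one logfmt line into a dict of key -> string value.

    Assumption: shlex's quoting rules are close enough to logfmt for these
    lines. Lines that shlex cannot tokenize yield an empty dict.
    """
    try:
        tokens = shlex.split(line)
    except ValueError:
        return {}
    fields = {}
    for token in tokens:
        key, sep, value = token.partition("=")
        if sep:  # ignore bare tokens that have no "="
            fields[key] = value
    return fields


def aggregate_by_tenant(lines):
    """Sum a few query stats per tenant (the `user` field)."""
    totals = defaultdict(lambda: {
        "queries": 0,
        "wall_time_seconds": 0.0,
        "fetched_series": 0,
        "fetched_chunk_bytes": 0,
    })
    for line in lines:
        fields = parse_logfmt(line)
        if fields.get("msg") != "query stats":
            continue  # skip unrelated log lines
        tenant = totals[fields.get("user", "unknown")]
        tenant["queries"] += 1
        tenant["wall_time_seconds"] += float(fields.get("query_wall_time_seconds", 0))
        tenant["fetched_series"] += int(fields.get("fetched_series_count", 0))
        tenant["fetched_chunk_bytes"] += int(fields.get("fetched_chunk_bytes", 0))
    return dict(totals)


if __name__ == "__main__":
    for tenant, stats in aggregate_by_tenant([SAMPLE_LINE]).items():
        print(tenant, stats)
```

Feeding it the query-frontend log (for example, by iterating over an open log file) gives a per-tenant total of queries, wall time, series, and chunk bytes, which is one way to spot a tenant that dominates the read path.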

osg-grafana commented 2 years ago

cc @osg-grafana to improve the documentation

YourTechBud commented 2 years ago

Thanks @56quarters. This will work for me. Leaving the issue open till the docs reflect this as well.

osg-grafana commented 11 months ago

Removing myself as assignee because I am not actively working on this. Someone else can feel free to grab it.