Describe the bug
Intermittently, Loki becomes very slow and even label queries fail. I have attached log lines from the query frontend, querier, and ingester.
To Reproduce
Steps to reproduce the behavior:
helm ls
NAME  NAMESPACE  REVISION  UPDATED                                  STATUS    CHART        APP VERSION
loki  default    1         2024-11-14 03:01:14.312731379 +0300 +03  deployed  loki-6.19.0  3.2.0
Environment:
Infrastructure: Kubernetes (10 nodes: 1 master and 9 workers; each node has 4 CPUs, 8 GB RAM, and 70 GB of local storage used as persistent storage for the pods)
Deployment tool: helm
Screenshots, Promtail config, or terminal output
These are the logs from the query-frontend:
{"caller":"retry.go:107","code":"Code(500)","end":"2024-11-14T18:15:45.436Z","end_delta":"-57.250601969s","err":"rpc error: code = Code(500) desc = too many unhealthy instances in the ring","length":"5m0s","level":"error","msg":"error processing request","org_id":"fake","query":"{log_level=\"Verbose\"} |= \"\"","query_hash":3288896298,"retry_in":"4.596118935s","start":"2024-11-14T18:10:45.436Z","start_delta":"4m2.749397134s","traceID":"1fdbad8d04dbd3a4","try":2,"ts":"2024-11-14T18:14:48.185404545Z","type":"queryrange.LokiRequest"}
{"caller":"retry.go:107","code":"Code(500)","end":"2024-11-14T18:15:45.414Z","end_delta":"-52.786385373s","err":"rpc error: code = Code(500) desc = too many unhealthy instances in the ring","length":"5m0s","level":"error","msg":"error processing request","org_id":"fake","query":"{log_level=\"Verbose\"} |= \"\"","query_hash":3288896298,"retry_in":"4.464963017s","start":"2024-11-14T18:10:45.414Z","start_delta":"4m7.213613697s","traceID":"29cec51b93debe20","try":3,"ts":"2024-11-14T18:14:52.627635884Z","type":"queryrange.LokiRequest"}
{"cache_chunk_bytes_fetched":0,"cache_chunk_bytes_stored":0,"cache_chunk_download_time":"0s","cache_chunk_hit":0,"cache_chunk_req":0,"cache_index_download_time":"0s","cache_index_hit":0,"cache_index_req":0,"cache_result_download_time":"0s","cache_result_hit":0,"cache_result_query_length_served":"0s","cache_result_req":0,"cache_stats_results_download_time":"0s","cache_stats_results_hit":0,"cache_stats_results_req":0,"cache_volume_results_download_time":"0s","cache_volume_results_hit":0,"cache_volume_results_req":0,"caller":"metrics.go:223","chunk_refs_fetch_time":"2.837299ms","component":"frontend","congestion_control_latency":"0s","disable_pipeline_wrappers":"false","duration":"8.963556604s","end_delta":"-52.791194475s","index_bloom_filter_ratio":"0.00","index_post_bloom_filter_chunks":0,"index_shard_resolver_duration":"0s","index_total_chunks":0,"ingester_chunk_compressed_bytes":"24kB","ingester_chunk_decompressed_bytes":"268kB","ingester_chunk_downloaded":0,"ingester_chunk_head_bytes":"23kB","ingester_chunk_matches":1,"ingester_chunk_refs":0,"ingester_post_filter_lines":197,"ingester_requests":3,"latency":"fast","length":"5m0s","level":"info","limit":500,"lines_per_second":21,"org_id":"fake","pipeline_wrapper_filtered_lines":0,"post_filter_lines":197,"query":"{log_level=\"Verbose\"} |= ``","query_hash":2248683790,"query_referenced_structured_metadata":false,"query_type":"limited","queue_time":"227µs","range_type":"range","returned_lines":0,"shards":1,"splits":0,"start_delta":"4m7.208805422s","status":"200","step":"200ms","store_chunks_download_time":"0s","throughput":"32kB","total_bytes":"291kB","total_bytes_structured_metadata":"1.8kB","total_entries":84,"total_lines":197,"traceID":"1fdbad8d04dbd3a4","ts":"2024-11-14T18:14:52.644858332Z"}
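To see at a glance how often each error occurs in logs like the ones above, the error-level entries can be tallied with a small script. This is only a minimal sketch, assuming the logs are one JSON object per line with `level`, `err`, and `try` fields as in the lines above; the function name `summarize` is hypothetical and not part of Loki.

```python
import json
from collections import Counter


def summarize(lines):
    """Tally error-level JSON log entries by their "err" text.

    Returns (Counter of err message -> count, list of "try" values seen).
    Assumes one JSON object per line, as in the query-frontend logs above.
    """
    errors = Counter()
    tries = []
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            entry = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip anything that is not valid JSON
        if not isinstance(entry, dict) or entry.get("level") != "error":
            continue  # only error-level entries are counted
        errors[entry.get("err", "<no err field>")] += 1
        if "try" in entry:
            tries.append(entry["try"])
    return errors, tries
```

For example, after saving the query-frontend output to a file, `summarize(open(path))` returns the counts per error message and the retry attempts observed, which makes it easier to tell whether "too many unhealthy instances in the ring" is the only failure or one of several.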
The following attached images are from the distributor pods' ring status page:
The following is my Loki configuration