Describe the bug
Loki label and log queries have been timing out after 1 minute since a Loki upgrade in recent weeks.
Previously the labels query was very quick, but now it hangs and times out after 1 minute.
time logcli labels --since=24h
2023/11/14 18:17:41 https://<url>/loki/api/v1/labels?end=1700014661166745910&start=1699928261166745910
2023/11/14 18:18:41 error sending request Get "https://<url>/loki/api/v1/labels?end=1700014661166745910&start=1699928261166745910": EOF
2023/11/14 18:18:41 Error doing request: run out of attempts while querying the server
real 1m0.022s
user 0m0.144s
sys 0m0.056s
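As a sanity check on the output above, the following minimal Python sketch decodes the `start` and `end` query parameters (assumed here to be Unix timestamps in nanoseconds) and compares the two logcli log timestamps. The helper names `query_window` and `ns_to_utc` are illustrative only and are not part of logcli or Loki.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs

# Query string copied from the logcli request above (host omitted).
QUERY = "end=1700014661166745910&start=1699928261166745910"

NS_PER_SECOND = 1_000_000_000


def query_window(query: str) -> tuple[int, int]:
    """Return (start_ns, end_ns) parsed from the query string."""
    params = parse_qs(query)
    return int(params["start"][0]), int(params["end"][0])


def ns_to_utc(ns: int) -> datetime:
    """Convert a nanosecond Unix timestamp to a UTC datetime (second precision)."""
    return datetime.fromtimestamp(ns // NS_PER_SECOND, tz=timezone.utc)


start_ns, end_ns = query_window(QUERY)
window = timedelta(seconds=(end_ns - start_ns) // NS_PER_SECOND)

# Client-side timestamps printed by logcli (local time, same zone for both).
LOG_FORMAT = "%Y/%m/%d %H:%M:%S"
sent = datetime.strptime("2023/11/14 18:17:41", LOG_FORMAT)
failed = datetime.strptime("2023/11/14 18:18:41", LOG_FORMAT)
elapsed = failed - sent

print("query start (UTC):", ns_to_utc(start_ns))
print("query end   (UTC):", ns_to_utc(end_ns))
print("query window     :", window)
print("time to failure  :", elapsed)
```

The window works out to exactly 24 hours, matching `--since=24h`, and the request fails 60 seconds after it was sent, matching the `real 1m0.022s` reported by `time`.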
Expected behavior
The labels query should return very quickly.
How can I change this 1 minute timeout? I already set timeout=20m in the config (see the end).
How can I resolve the many "context canceled" errors?
Environment:
Infrastructure: OpenShift
Deployment tool: Helm
Detail
I have had the same problem in recent weeks, since a Loki upgrade. My Loki is installed on OpenShift (Kubernetes), v2.9.2.
I can only get labels for the most recent hours; queries over a longer duration fail with a timeout error after 1 minute (I am not sure what controls this 1 minute timeout). I have not added the new Promtail label. While querying labels, there are some "context canceled" messages in the query frontend, scheduler, and querier pods, see below:
Frontend:
level=info ts=2023-11-15T02:10:26.847814729Z caller=frontend_scheduler_worker.go:107 msg="adding connection to scheduler" addr=172.21.13.20:9095
level=debug ts=2023-11-15T02:10:26.848385759Z caller=ring_watcher.go:93 component=frontend-scheduler-worker msg="removing connection to address: 172.21.10.68:9095"
level=info ts=2023-11-15T02:10:26.848412309Z caller=frontend_scheduler_worker.go:134 msg="removing connection to scheduler" addr=172.21.10.68:9095
level=debug ts=2023-11-15T02:10:26.848508362Z caller=frontend_scheduler_worker.go:282 msg="stream context finished" err="context canceled"
level=debug ts=2023-11-15T02:10:26.848570548Z caller=frontend_scheduler_worker.go:282 msg="stream context finished" err="context canceled"
level=debug ts=2023-11-15T02:10:26.848612475Z caller=frontend_scheduler_worker.go:282 msg="stream context finished" err="context canceled"
Scheduler:
level=debug ts=2023-11-15T02:11:56.172109206Z caller=scheduler.go:405 msg="querier connected" querier=loki-loki-distributed-querier-5f9797d798-ckxbc
level=debug ts=2023-11-15T02:11:56.172123388Z caller=scheduler.go:405 msg="querier connected" querier=loki-loki-distributed-querier-5f9797d798-ckxbc
level=debug ts=2023-11-15T02:11:56.172323748Z caller=grpc_logging.go:76 method=/schedulerpb.SchedulerForQuerier/QuerierLoop duration=27.090504991s err="context canceled" msg=gRPC
level=debug ts=2023-11-15T02:11:56.172371672Z caller=grpc_logging.go:76 method=/schedulerpb.SchedulerForQuerier/QuerierLoop duration=27.089859711s err="context canceled" msg=gRPC
Querier:
level=error ts=2023-11-15T02:14:10.520057737Z caller=scheduler_processor.go:106 msg="error processing requests from scheduler" err="rpc error: code = Canceled desc = context canceled" addr=172.21.98.34:9095
level=error ts=2023-11-15T02:14:10.520106449Z caller=scheduler_processor.go:106 msg="error processing requests from scheduler" err="rpc error: code = Canceled desc = context canceled" addr=172.21.98.34:9095
level=debug ts=2023-11-15T02:14:10.509859034Z caller=util.go:38 msg="querier worker context has been canceled, waiting until there's no inflight query"
level=debug ts=2023-11-15T02:14:10.509993621Z caller=util.go:38 msg="querier worker context has been canceled, waiting until there's no inflight query"
level=debug ts=2023-11-15T02:14:10.510809779Z caller=util.go:38 msg="querier worker context has been canceled, waiting until there's no inflight query"
My config is below: