Closed (jnyi closed this issue 1 year ago)
Is this the full log? Looks like it is missing some data.
Potentially related: https://github.com/thanos-io/thanos/issues/6190. However, we have around 10,000 RPS against our receivers and rarely see panics like this one.
Did you run Thanos Receive prior to v0.31.0-rc.0, and do you see any correlation with head compaction times? Also, as Giedrius mentioned, I am not sure that the relevant error is present in the logs. The goroutine that caused the panic is the first one in the stack trace, so we would need to see the beginning of the logs.
We tried v0.30.2 and switched to v0.31.0-rc.0 because of the fix for the other panic, the one that occurs when the receiver starts: https://github.com/thanos-io/thanos/pull/6067
You are right, somehow k8s didn't print the full logs from the previously terminated pod. This time I was able to capture the logs in time, and I have updated the description.
@jnyi I've published v0.31.0-rc.1. Could you try it out to see if it fixes the issue for you?
Yep, I've incorporated your change, and so far we are not seeing the issue anymore.
Awesome, thanks for confirming. I'll close this issue for now; feel free to reopen it if it happens again.
Thanos, Prometheus and Golang version used:
Thanos: v0.31.0-rc.0
Golang: go1.19.6
Object Storage Provider: MinIO for mock
What happened: The Thanos ingestor panics at around 100 query requests per second.
Receive runs in ingestor-only mode with these args:
Querier Args:
What you expected to happen: The ingestor keeps running smoothly under this query load.
How to reproduce it (as minimally and precisely as possible): Not sure why this happens; we just start feeding query load and the panic occurs.
Full logs to relevant components:
Anything else we need to know: