Open jwangace opened 5 days ago
@jwangace thank you for the report! Seeing that v16 is unsupported, could you please clarify whether the bug still appears on supported versions (v19, v20, v21 at this time)?
Hi @shlomi-noach, as you might have noticed, I also opened a PR proposing a fix against the latest code. Unfortunately, we don't have any v22 deployments, so I did not reproduce this on v22. However, I compared the related code across versions (where I proposed updating execSelect), and I believe this bug is still present in the current code.
Do you think this is something PlanetScale can verify by following the Reproduction Steps?
@jwangace thank you, let us take a look!
Overview of the Issue
This is reproducible on k8s deployments running vitess-operator with Vitess v16. When the consolidator is enabled and a select query is run concurrently at large scale through vtgate, the vttablet container gets OOMKilled.
Consolidated Query Wait Count (vttablet_waits_count)
OOMKilled Metrics
Reproduction Steps
To reproduce this issue more easily, you can:
1. Set a relatively small memory limit for the vttablet container (1Gi, for example).
2. Craft a select query whose result set is relatively large (5Mi, for example).
3. Run that select query concurrently at large scale through vtgate (10,000 queries, for example).
4. Observe vttablet getting OOMKilled.
Binary Version
Vitess v16 and later versions.
Operating System and Environment details
Log Fragments