cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com
Other
29.92k stars 3.78k forks source link

kv: introduce memory accounting / limits for lock table #49387

Open knz opened 4 years ago

knz commented 4 years ago

There's a limit on the max number of locks per lock table; however:

Context:

@knz: nathan sumeer where is the memory usage of the lock table accounted for? What's the metric? What's the memory monitor that it's accounted to? @nvanbenschoten: Like all of KV, it’s not connected to a memory monitor. The accounting here is pretty naive. We limit the size of each lockTable to a maximum number of locks and don’t allow it to grow beyond that. In my testing, this prevented the memory footprint from growing above 3MB per Range. There’s plenty of room for improvement here. @knz: there's a limit on the number of locks, but is there also a limit on the number of waiters per lock? Otherwise, with just 1 lock I can get an arbitrary large data structure @sumeerbhola : one place where this naive accounting would break is when the number of locks is below the limit, but the length of each lock queue is big. @nvanbenschoten : the waiters per lock is bounded by the number of requests in the system @knz : 3MB per range x 100 hot ranges = 300MB. That's enough to make a node that's close to the brim in RAM usage fall over (edited) @nvanbenschoten The first step here is to make this memory accounting per-store instead of per-range

Jira issue: CRDB-4232

knz commented 4 years ago

NB: number one priority is to add a metric.

It had been identified in #44976 but not done yet (also there was no issue filed yet)

knz commented 4 years ago

a thought: it's a bit odd that if I have 100000 ranges each containing 1 row, I can have 100000 lock entries, but if I have just 1 range containing 100000 rows, my lock table will just contain 1000 entries (because that's the per-range limit)

(I actually don't know what the specific limit is, but you get my drift- the mechanism shouldn't be different depending on how rows are spread across ranges)

it makes it a bit hard to explain to users IMHO

github-actions[bot] commented 1 year ago

We have marked this issue as stale because it has been inactive for 18 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 10 days to keep the issue queue tidy. Thank you for your contribution to CockroachDB!