rangefeed: tune catchup scan semaphores

cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.

https://www.cockroachlabs.com

Other

30.11k stars 3.81k forks source link

rangefeed: tune catchup scan semaphores #110439

Closed erikgrinaker closed 1 year ago

erikgrinaker commented 1 year ago

We have both client-side and server-side catchup scan semaphores. It isn't clear that these are tuned optimally, or even that we want them both on the client and server side. We should reconsider these.

Jira issue: CRDB-31428

Epic CRDB-26372

blathers-crl[bot] commented 1 year ago

cc @cockroachdb/replication

miretskiy commented 1 year ago

I think we should reconsider client side ones -- at least. We are currently using client side semaphore to pace goroutine creation -- we should instead use something like quota pool, or just simple timer instead to control the rate.

I think the many changes that happen on KV side -- from lock table, to use of TBI iterator, make the client side throttling unnecessary for the purpose of limiting the number of concurrent catchup scans.

erikgrinaker commented 1 year ago

On the client side, I guess we could read off what the server-side semaphore size is, and have a per-node semaphore of the same size. It might overshoot in the case of many clients, but that doesn't matter as long as the size is small.

aliher1911 commented 1 year ago

Reading server side size is a bit iffy. We have server side limiter which is per store. You could also be competing with other changefeeds. And if we remove client side limits what I think could go wrong is that if we create a rangefeed with lots of ranges, we could block the limiter by lots of queued rangefeed start attempts.