Closed by agsh-yb 1 week ago
Why does this happen?
xCluster uses the RocksDB rate limiter to throttle the data transfer rate. There is one rate limiter on the source (to protect source tablets) and one on the target (to protect target tablets). The default rate limit is 100 MBps (--xcluster_get_changes_max_send_rate_mbps). We check the rate on every batch we send, and batches are typically capped at 4 MB (--consensus_max_batch_size_bytes).
COPY commands generate WAL ops that can get much larger, up to 255 MB (--rpc_max_message_size). Since we cannot break apart WAL ops (each is an atomic commit batch), they are allowed to violate the xCluster 4 MB batch limit. The RocksDB rate limiter has a bug that causes it to block forever on requests larger than 10 MB (100 MBps * 100 ms), so these GetChanges responses hang forever while holding the large memory they allocated.
There is a safety mechanism on the source that uses a semaphore (get_changes_rpcsem) to limit the number of in-flight GetChanges calls to 921 (FLAGS_rpc_workers_limit * (1 - FLAGS_cdc_get_changes_free_rpc_ratio)). Depending on the amount of available memory and threads, we may run out of one of those resources or hit the semaphore limit, after which GetChanges fails with a LeaderNotReadyToServe error from cdc_service.cc:1460.
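The two numbers quoted above (10 MB and 921) can be reproduced with a short calculation. This is only an illustrative sketch in Python, not code from the YugabyteDB tree. The helper names are hypothetical. The 100 ms refill period is taken from the "100 MBps * 100 ms" expression above. The values 1024 for rpc_workers_limit and 0.1 for cdc_get_changes_free_rpc_ratio are assumptions chosen because they are consistent with the stated limit of 921.

```python
MB = 1024 * 1024


def bytes_per_refill(rate_mbps: int, refill_period_ms: int) -> int:
    """Bytes the limiter hands out per refill period (rate * period)."""
    return rate_mbps * MB * refill_period_ms // 1000


def exceeds_refill(request_bytes: int, rate_mbps: int, refill_period_ms: int) -> bool:
    """True if one request asks for more bytes than a single refill provides.

    Per the explanation above, such a request is the case that blocks forever.
    """
    return request_bytes > bytes_per_refill(rate_mbps, refill_period_ms)


def max_inflight_get_changes(rpc_workers_limit: int, free_rpc_ratio: float) -> int:
    """Semaphore size: workers not reserved for other RPCs, truncated to an int."""
    return int(rpc_workers_limit * (1 - free_rpc_ratio))


# Assumed values (see the paragraph above): 100 MBps rate, 100 ms refill period.
threshold = bytes_per_refill(100, 100)            # 10 MB
typical_batch_hangs = exceeds_refill(4 * MB, 100, 100)    # 4 MB batch
large_copy_op_hangs = exceeds_refill(255 * MB, 100, 100)  # 255 MB WAL op

# Assumed values: rpc_workers_limit = 1024, free RPC ratio = 0.1.
semaphore_limit = max_inflight_get_changes(1024, 0.1)
```

Under these assumptions a typical 4 MB batch stays below the per-refill budget, while a single large COPY-generated WAL op exceeds it, which matches the hang described above.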
Jira Link: DB-12112
Description
Observed this on:
2.23.0.0-b574
In a colocated database, after setting up replication and running the following COPY commands, data is not replicated. The GetChanges RPCs were timing out.
Logs in Jira ticket
Steps
Issue Type
kind/bug