We have upgraded the Cassandra Reaper tool from 3.2.0 to 3.6.0.
After the upgrade we noticed serious tombstone generation on multiple tables in the Reaper keyspace, so we downgraded back to 3.2.0; I thought I should bring this to your notice.
This is my first ticket in this community, so if anything is inaccurate or you need more details, please let me know.
I found that this table is read by partition key combined with a filter on a non-primary-key column (which needs ALLOW FILTERING). That is an anti-pattern for Cassandra and degrades cluster performance; it also produces a lot of warning messages that get pushed to Kafka, which may cause network performance issues.
Once we downgraded to 3.2.0 the issue was resolved, so it looks like a bug introduced in a Reaper version after 3.2.0 (I have not checked the versions between 3.2.0 and 3.6.0).
I tried reducing gc_grace_seconds to 1 day, ran a manual garbage collection, and tried adding a TTL, but none of it helped.
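For reference, the mitigation attempts above amounted to roughly the following (a sketch; the table name comes from the logs, the numeric values are illustrative, and lowering gc_grace_seconds below your repair interval carries its own risks):

```cql
-- Shrink the tombstone grace period to 1 day (86400 s) so compaction
-- can purge tombstones sooner (illustrative value):
ALTER TABLE cassandra_reaper.repair_run WITH gc_grace_seconds = 86400;

-- Add a default TTL so rows expire on their own (illustrative value, 7 days):
ALTER TABLE cassandra_reaper.repair_run WITH default_time_to_live = 604800;
```

followed by `nodetool garbagecollect cassandra_reaper repair_run` on each node; in our case none of this removed the tombstone pressure.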
Some of the tombstone warnings found in the logs are listed below:
{"msg":"WARN [ReadStage-4] ReadCommand.java:536 - Read 2529 live rows and 2547 tombstone cells for query SELECT segment_state FROM cassandra_reaper.repair_run WHERE id = 52103b80-22e7-11ef-83b1-195f6f679993 AND segment_state = 2 LIMIT 5000; token 3000425142431644441 (see tombstone_warn_threshold)","pid":1780832,"fields":{"stream":"stdout"}}
cassandra_reaper.repair_run
SELECT repair_unit_id, coordinator_host, end_token, fail_count, host_id, replicas, segment_end_time, segment_start_time, segment_state, start_token, token_ranges FROM cassandra_reaper.repair_run WHERE id = 18eddad0-193f-11ef-8f3e-eb3c8eea5a87 AND segment_state = 3 LIMIT 5000;
WARN [ReadStage-3] ReadCommand.java:536 - Read 1151 live rows and 1174 tombstone cells for query SELECT repair_unit_id, coordinator_host, end_token, fail_count, host_id, replicas, segment_end_time, segment_start_time, segment_state, start_token, token_ranges FROM cassandra_reaper.repair_run WHERE id = 18eddad0-193f-11ef-8f3e-eb3c8eea5a87 AND segment_state = 3 LIMIT 5000; token -4168918588338023632 (see tombstone_warn_threshold)
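The `(see tombstone_warn_threshold)` hint in these warnings refers to the read-scan thresholds in `cassandra.yaml`; their stock defaults (shown here for reference, not values we changed) are:

```yaml
# cassandra.yaml -- default tombstone scan thresholds
tombstone_warn_threshold: 1000       # log a warning above this many tombstones per read
tombstone_failure_threshold: 100000  # abort the read above this many
```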
Table DDL -
CREATE TABLE cassandra_reaper.repair_run (
    id timeuuid,
    segment_id timeuuid,
    adaptive_schedule boolean static,
    cause text static,
    cluster_name text static,
    creation_time timestamp static,
    end_time timestamp static,
    intensity double static,
    last_event text static,
    owner text static,
    pause_time timestamp static,
    repair_parallelism text static,
    repair_unit_id timeuuid static,
    segment_count int static,
    start_time timestamp static,
    state text static,
    tables set<text> static,
    coordinator_host text,
    end_token varint,
    fail_count int,
    host_id uuid,
    replicas frozen<map<text, text>>,
    segment_end_time timestamp,
    segment_start_time timestamp,
    segment_state int,
    start_token varint,
    token_ranges text,
    PRIMARY KEY (id, segment_id)
);
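Given this schema, `segment_state` is a regular column, so a read like the ones in the warnings must walk every row of the `id` partition (live and tombstoned) to evaluate the `segment_state` predicate; without a secondary index, such a query is only accepted with ALLOW FILTERING. A sketch of the pattern, and of one hypothetical query-friendly alternative that is not part of Reaper's actual schema:

```cql
-- The pattern from the warnings above: a partition read plus an
-- in-partition filter on a regular column (requires ALLOW FILTERING):
SELECT segment_state
FROM cassandra_reaper.repair_run
WHERE id = 52103b80-22e7-11ef-83b1-195f6f679993
  AND segment_state = 2
LIMIT 5000 ALLOW FILTERING;

-- A hypothetical alternative that serves the same lookup from the
-- clustering key instead of a filter (illustration only):
CREATE TABLE IF NOT EXISTS cassandra_reaper.repair_run_by_state (
    id timeuuid,
    segment_state int,
    segment_id timeuuid,
    PRIMARY KEY (id, segment_state, segment_id)
);
```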
Thanks.