thelastpickle / cassandra-reaper

Automated Repair Awesomeness for Apache Cassandra
http://cassandra-reaper.io/
Apache License 2.0

Should repair be run for TWCS tables? #1298

Open prashantkalkar opened 1 year ago

prashantkalkar commented 1 year ago


We are using Cassandra 3.11.5. Most of the tables use TWCS, and the data is inserted with TTLs set.

According to the Cassandra Reaper documentation, repairs should be disabled for tables using TimeWindowCompactionStrategy (TWCS).

Here is the recommendation from the documentation:

It is recommended to enable this option as repairing these tables, when they contain TTL’d data, causes overlaps between partitions across the configured time windows the sstables reside in. This leads to an increased disk usage as the older sstables are unable to be expired despite only containing TTL’s data

But the Cassandra 3.11 documentation recommends running repairs frequently.

This is a bit confusing. What is the recommendation for running repairs on tables with TWCS? Has the recommendation changed for any Cassandra version?

Any clarification will help.


adejanovski commented 1 year ago

As a general rule, running repair is advised. It is necessary to ensure consistency, especially when you're not using (LOCAL_)QUORUM, and it keeps you safe from deleted data reappearing if you perform explicit deletes (as opposed to TTLs).

When it comes to TWCS, you'll want to limit as much as possible putting old data into recent SSTables. That will happen naturally to some extent, due to read repairs. As for anti-entropy repair, I think it's not as problematic as we thought back in the day, because these repairs (assuming you're not using CDC or MVs) stream chunks of SSTables. The resulting SSTables will therefore be in the same time bucket as they were on the source node, so data can only land in an older bucket if it was already in an older bucket on another node. As compaction runs, they'll be compacted with the SSTables from that same bucket.

So unless I have anti-entropy repairs wrong when it comes to streaming, it shouldn't be that problematic to run them on TWCS tables, and I still think the aggressive expired SSTable deletion should be turned on. Also, running repairs frequently is too expensive if you're using full repairs, and I don't advise using incremental repair on 3.11.
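
To illustrate the bucketing argument, here is a minimal Python sketch. It is a simplified model, not Cassandra's actual code: it assumes TWCS assigns each SSTable to a window by rounding its maximum write timestamp down to the window size, and the 1-day window, the `SSTable` class, and the table names are all made up for the example.

```python
from dataclasses import dataclass

# Assumed window size for the example: 1 day, in milliseconds.
WINDOW_MS = 24 * 60 * 60 * 1000


@dataclass(frozen=True)
class SSTable:
    """Hypothetical stand-in for an SSTable: only its timestamp range matters here."""
    name: str
    min_ts_ms: int
    max_ts_ms: int


def window_start(max_ts_ms: int, window_ms: int = WINDOW_MS) -> int:
    """Round a timestamp down to the start of its time window."""
    return (max_ts_ms // window_ms) * window_ms


def bucket_by_window(sstables, window_ms: int = WINDOW_MS) -> dict:
    """Group SSTable names by the window their max timestamp falls into."""
    buckets = {}
    for sstable in sstables:
        key = window_start(sstable.max_ts_ms, window_ms)
        buckets.setdefault(key, []).append(sstable.name)
    return buckets


DAY = WINDOW_MS

# Two SSTables already on this node, written on day 1 and day 5.
local_old = SSTable("local-day1", 1 * DAY + 100, 1 * DAY + 900)
local_new = SSTable("local-day5", 5 * DAY + 10, 5 * DAY + 500)

# An SSTable chunk streamed in by repair today. Its timestamps are the
# original write times from the source node, not the streaming time.
streamed = SSTable("streamed-day1", 1 * DAY + 200, 1 * DAY + 800)

buckets = bucket_by_window([local_old, streamed, local_new])
for start, names in sorted(buckets.items()):
    print(f"window starting at day {start // DAY}: {names}")
```

Under these assumptions the streamed SSTable joins the day 1 bucket alongside the local day 1 SSTable, rather than mixing old data into the most recent window.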