Closed: russss closed this 9 years ago
Could you make this an option? I think we originally decided to back up the indexes because they can take a long time to rebuild in some scenarios.
I'm not referring to Cassandra's Index.db files - by "index" in this case I mean tablesnap's .json files, which are currently uploaded six times per sstable.
Cassandra SSTables consist of six files on disk. When a new SSTable is created, tablesnap uploads six index files as well, one per component file. This is fairly pointless and significantly increases the number of files that tablechop has to fetch and parse.
On one of our machines, which has been running tablesnap for a week, tablechop is currently taking >12 hours to run. The majority of that time is spent fetching index files (we use LeveledCompactionStrategy, which probably exacerbates this).
This patch only generates a new index file for each Data.db file.
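To illustrate the idea, here is a minimal sketch of the filtering described above. It is not the code from this patch: the helper name `should_write_index` and the SSTable file names are hypothetical, and the sketch assumes the data component can be recognised by a `-Data.db` suffix.

```python
import os


def should_write_index(path):
    """Hypothetical helper: True only for an SSTable's Data.db component."""
    return os.path.basename(path).endswith("-Data.db")


# Illustrative component names for a single SSTable (not taken from a real node).
created = [
    "ks-cf-ic-1-Data.db",
    "ks-cf-ic-1-Index.db",
    "ks-cf-ic-1-Filter.db",
    "ks-cf-ic-1-Statistics.db",
    "ks-cf-ic-1-CompressionInfo.db",
    "ks-cf-ic-1-Summary.db",
]

# Only the files in this list would trigger a new .json index upload.
index_triggers = [name for name in created if should_write_index(name)]
print(index_triggers)
```

Under these assumptions, six new component files result in one index upload rather than six, which is what reduces the number of index files tablechop has to fetch.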