Open LannyRipple opened 9 years ago
When implementing a solution using hadoop-sstable you'll require a mapper and reducer implementation. The stitching together of the columns comes together in the reducer where you'll have everything you need to resolve the latest data for a particular key.
So my question is: how does hadoop-sstable deal with Cass spreading the columns of a row over multiple SSTables? When you query Cass, it does the work of finding the ranges you are querying, streaming the SSTables into memtables to give you the "latest" data and deal with tombstones, and then provides the result. Are you doing a full compaction to avoid needing to look in multiple tables? (It didn't sound like it, unless Priam does so during backup of your ring.)
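To make the question concrete, here is a minimal sketch of the kind of per-key merge I'd expect to be needed somewhere (presumably in the reducer) if no full compaction happens first. It is not hadoop-sstable's API: `Column` and `reconcile` are hypothetical names, a `None` value is my stand-in for a tombstone, and timestamp ties are ignored for simplicity.

```python
from typing import Dict, Iterable, NamedTuple, Optional


class Column(NamedTuple):
    """One column fragment as it might be read from a single SSTable (hypothetical type)."""
    name: str
    value: Optional[bytes]  # None is used here as a stand-in for a tombstone
    timestamp: int          # write timestamp; the highest one wins


def reconcile(fragments: Iterable[Column]) -> Dict[str, bytes]:
    """Merge all column fragments for ONE row key, keeping the newest write per column."""
    newest: Dict[str, Column] = {}
    for col in fragments:
        current = newest.get(col.name)
        # Keep the fragment with the larger timestamp (ties keep the first one seen).
        if current is None or col.timestamp > current.timestamp:
            newest[col.name] = col
    # A column whose newest write is a tombstone is treated as deleted.
    return {name: col.value for name, col in newest.items() if col.value is not None}


# The same row key seen in three different SSTables:
sstable_1 = [Column("email", b"old@example.com", 100), Column("name", b"Lanny", 100)]
sstable_2 = [Column("email", b"new@example.com", 200)]
sstable_3 = [Column("name", None, 300)]  # later delete of "name"

result = reconcile(sstable_1 + sstable_2 + sstable_3)
```

Here `result` keeps only the newer `email` value, and `name` is dropped because its most recent write is a tombstone. Is something equivalent to this happening in the reducer?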
Cheers