spotify / cassandra-reaper

Software to run automated repairs of cassandra

Reaper fails repairing 2.1 cluster #65

Open varjoranta opened 9 years ago

varjoranta commented 9 years ago

JMX signature doesn't match. https://pastebin.mozilla.org/8822185

xoraes commented 9 years ago

2.1 has been out for a while. Any way this can be fixed?

varjoranta commented 9 years ago

Actually, we have no plans to fix this at the moment, mainly because Cassandra 2.0.x is still the version most used in production. I'm sure this will get more focus after the summer.

djsly commented 9 years ago

I have updated the code to use the Cassandra 2.1.8 API.

The changes are quite straightforward:

         return ssProxy.forceRepairRangeAsync(beginToken.toString(), endToken.toString(), keyspace,
-                                             repairParallelism.ordinal(), null, null,
+                                             repairParallelism.ordinal(), null, null, FULL_REPAIR,
                                              columnFamilies
     }
     boolean snapshotRepair = repairParallelism.equals(RepairParallelism.SEQUENTIAL);
     return ssProxy.forceRepairRangeAsync(beginToken.toString(), endToken.toString(), keyspace,
-                                         snapshotRepair, false,
+                                         snapshotRepair, false, FULL_REPAIR,
                                          columnFamilies.toArray(new String[columnFamilies.size()]));
   }
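
To make the shape of that change concrete, here is a minimal sketch of the two call shapes in the diff: the 2.0-style call and the 2.1-style call with one extra boolean. The interfaces `Repair20Style` and `Repair21Style` are hypothetical stand-ins modeled only on the diff, not the real Cassandra MBean. The value of `FULL_REPAIR` is assumed to be `true`, since the diff does not show its definition. JMX resolves an operation by its name plus its parameter types, so a client that sends the 2.0-shaped call to a node exposing only the 2.1 shape finds no matching operation.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins modeled on the diff above; NOT the real Cassandra MBean interface.
interface Repair20Style {
  int forceRepairRangeAsync(String beginToken, String endToken, String keyspace,
                            boolean isSequential, boolean isLocal, String... columnFamilies);
}

interface Repair21Style {
  int forceRepairRangeAsync(String beginToken, String endToken, String keyspace,
                            boolean isSequential, boolean isLocal, boolean fullRepair,
                            String... columnFamilies);
}

final class RepairCallSketch {
  // Assumption: the diff does not show how FULL_REPAIR is defined; true is used here.
  static final boolean FULL_REPAIR = true;

  // 2.0-shaped call: five fixed arguments, then the column families.
  static int repair20(Repair20Style proxy, String begin, String end, String keyspace,
                      boolean snapshotRepair, List<String> columnFamilies) {
    return proxy.forceRepairRangeAsync(begin, end, keyspace, snapshotRepair, false,
        columnFamilies.toArray(new String[0]));
  }

  // 2.1-shaped call: the same arguments plus the extra fullRepair flag.
  static int repair21(Repair21Style proxy, String begin, String end, String keyspace,
                      boolean snapshotRepair, List<String> columnFamilies) {
    return proxy.forceRepairRangeAsync(begin, end, keyspace, snapshotRepair, false,
        FULL_REPAIR, columnFamilies.toArray(new String[0]));
  }

  public static void main(String[] args) {
    List<String> cfs = Arrays.asList("users", "events");
    // Stub proxies that just report how many column families they received.
    Repair20Style older = (b, e, ks, seq, local, cf) -> cf.length;
    Repair21Style newer = (b, e, ks, seq, local, full, cf) -> full ? cf.length : -1;
    System.out.println(repair20(older, "0", "100", "demo", true, cfs));
    System.out.println(repair21(newer, "0", "100", "demo", true, cfs));
  }
}
```

The two helper methods take identical arguments; only the proxy call differs, which is why the patch is a one-argument change per call site.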

Finally, I haven't investigated further, but the SimpleCondition class in Cassandra changed a lot between 2.0 and 2.1 (see https://github.com/apache/cassandra/commit/5420b7a2296d230e7fd5bc2f41fc6472a9c8b55e).

This causes timing issues in the SegmentRunner class (JMX responses are blocked until await() returns).

Therefore I copied the 2.0 SimpleCondition from Cassandra and added it to the Reaper project as a quick hack.

+import com.spotify.reaper.utils.SimpleCondition;

 import org.apache.cassandra.repair.RepairParallelism;
 import org.apache.cassandra.service.ActiveRepairService;
-import org.apache.cassandra.utils.SimpleCondition;
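
For readers unfamiliar with this kind of class, here is a rough sketch of a monitor-based one-shot condition that a caller can block on until another thread signals it. It is written for illustration only: it is neither the Cassandra 2.0 source nor the class added in the patch, and the name `OneShotCondition` is hypothetical.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of a one-shot condition; not Cassandra's SimpleCondition.
final class OneShotCondition {
  private boolean signaled; // guarded by this object's monitor

  // Blocks until signalAll() has been called.
  synchronized void await() throws InterruptedException {
    while (!signaled) {
      wait();
    }
  }

  // Blocks until signaled or until the timeout elapses; returns whether it was signaled.
  synchronized boolean await(long time, TimeUnit unit) throws InterruptedException {
    long deadline = System.nanoTime() + unit.toNanos(time);
    long remaining;
    while (!signaled && (remaining = deadline - System.nanoTime()) > 0) {
      TimeUnit.NANOSECONDS.timedWait(this, remaining);
    }
    return signaled;
  }

  // Marks the condition as signaled and wakes every waiting thread.
  synchronized void signalAll() {
    signaled = true;
    notifyAll();
  }

  synchronized boolean isSignaled() {
    return signaled;
  }

  public static void main(String[] args) throws InterruptedException {
    OneShotCondition done = new OneShotCondition();
    // Simulates a callback thread reporting that a repair segment finished.
    Thread callback = new Thread(done::signalAll);
    callback.start();
    System.out.println("signaled: " + done.await(1, TimeUnit.SECONDS));
    callback.join();
  }
}
```

Both `await` methods loop on the flag rather than trusting a single wake-up, so spurious wake-ups and a signal that arrives before the wait are handled the same way.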

That's all there is, really. I will look into making the code change to support the CLI and web UI update eventually.

djsly commented 9 years ago

Here's a PR https://github.com/spotify/cassandra-reaper/pull/121 for this feature.

It would be great if we could keep multiple branches (master = Cassandra 2.2, plus cassandra-2.1 and cassandra-2.0 branches), since the Cassandra API doesn't seem to be backward compatible all the time.

pdehlke commented 9 years ago

Upvoting @djsly. The fix for this seems pretty straightforward, and having 2.0, 2.1, 2.2 branches (in the same fashion as Netflix/Priam, for example) would seem to be much better for the community than what looks like an impending forest of forks...

zznate commented 9 years ago

@varjoranta Just sent a PR back to @djsly's Nuance fork for merge conflicts.

Please let me know if there are any other ways I can help get 2.1, 2.2, etc support in place as we've got the okay from a client to put some resources on this.

varjoranta commented 9 years ago

Thanks for being active on this! I left Spotify a month ago, so I have to sync with the guys over there, but I am planning to get 2.1 support into Reaper soon (within a few weeks). I am planning to use Reaper with C* 2.1, so I need this as well.

zznate commented 9 years ago

I'm gonna be "that guy" and ping @Yarin78 as well :)

Seriously folks, I'm happy to get involved with this wherever to keep things moving. @djsly - did you get a chance to look at that PR against your fork?

zznate commented 9 years ago

(and thanks @varjoranta for the update! good luck with the new endeavor).

djsly commented 9 years ago

@zznate Sorry for the late response, just had a chance to look at this now. I will look at your PR.

hgfischer commented 8 years ago

:100: