thelastpickle / cassandra-reaper

Automated Repair Awesomeness for Apache Cassandra
http://cassandra-reaper.io/
Apache License 2.0

If jmx is not available, reaper will continuously try to repair #1157

Open StevenLacerda opened 2 years ago

StevenLacerda commented 2 years ago


This is a problem because you then cannot do anything with the cluster. The repair attempts appear to continue indefinitely. In this case, we just wanted to remove the cluster but could not, because the repair kept being attempted even after we had paused it and tried to kill it.

By the way, the workaround provided to remove the cluster was to use the reaper API:

http://cassandra-reaper.io/docs/api/

https://github.com/thelastpickle/cassandra-reaper/blob/525699627b1c07792cba5d0891614556c411ebf1/src/packaging/bin/spreaper#L337
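For anyone landing here, a rough sketch of what the API workaround can look like. It assumes the `DELETE /cluster/{cluster_name}` endpoint described in the REST API docs linked above, a Reaper instance at `http://localhost:8080` with no authentication, and a placeholder cluster name `my_cluster`. The `force` query parameter and the helper name `build_delete_cluster_request` are my own assumptions for illustration, so check them against the docs for your Reaper version.

```python
import urllib.parse
import urllib.request


def build_delete_cluster_request(base_url, cluster_name, force=False):
    """Build (but do not send) a DELETE request for a cluster registered in Reaper.

    Assumptions: the endpoint is DELETE /cluster/{cluster_name}, and `force`
    is an optional query parameter; verify both for your Reaper version.
    """
    # Percent-encode the cluster name so spaces or slashes cannot break the path.
    path = "/cluster/" + urllib.parse.quote(cluster_name, safe="")
    query = "?force=true" if force else ""
    url = base_url.rstrip("/") + path + query
    return urllib.request.Request(url, method="DELETE")


if __name__ == "__main__":
    # Hypothetical host, port, and cluster name.
    request = build_delete_cluster_request("http://localhost:8080", "my_cluster")
    print(request.get_method(), request.full_url)
    # Uncomment to actually send the request to a running Reaper instance:
    # with urllib.request.urlopen(request) as response:
    #     print(response.status)
```

The request is only built and printed by default, so nothing is deleted until the `urlopen` call is uncommented. If authentication is enabled, a session cookie or token would have to be added to the request first.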

An alternative would be to restart Reaper so that the repair thread gets killed and stops continuously retrying the repairs.

┆Issue is synchronized with this Jira Story by Unito ┆Issue Number: REAP-98

StevenLacerda commented 2 years ago

FYI - the API did work to remove the cluster.

Rooks103 commented 2 years ago

@StevenLacerda Can you describe your setup a bit more (Sidecar vs. All vs. Local, etc., as well as which version)? I'd like to try to reproduce this issue to understand it better.

I'm guessing you shut down an entire cluster while a repair was running to get into this state. Is that correct?