scylladb / scylla-ccm

Cassandra Cluster Manager, modified for Scylla
Apache License 2.0

node.py: generalize expected message in watch_log_for_death #513

Closed cvybhu closed 1 year ago

cvybhu commented 1 year ago

watch_log_for_death scans node logs looking for a message that says "<nodeaddress> is now DOWN". When this message appears in the logs of other nodes we can be certain that this particular node is now DOWN.

In Scylla the messages look like this:

127.0.0.1 is now DOWN
127.0.0.1 is now UP

But in Cassandra 4.1.3 the messages are a bit different:

127.0.0.1:7000 is now DOWN
127.0.0.1:7000 is now UP

In Cassandra the node's address also includes the port. watch_log_for_death didn't handle this properly: its regex expected an IP address immediately followed by " is now DOWN". Because of this it never detected the message, and node.stop() kept timing out for Cassandra nodes.

To fix it, let's generalize the regex so that it handles both messages properly. The regex is now pretty much the same as the one in watch_log_for_alive, which looks for "is now UP" messages. It's located a few lines below watch_log_for_death.
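For illustration, here is a minimal sketch of a pattern that accepts both message formats. It is not the exact regex from node.py: the helper name `death_pattern` and the optional `:port` group are assumptions made for this example only.

```python
import re


def death_pattern(address):
    """Hypothetical helper (not part of ccm): match '<address>[:port] is now DOWN'."""
    return re.compile(
        r"(?<![\d.])"          # do not start in the middle of a longer address
        + re.escape(address)   # literal IP; the dots must not act as wildcards
        + r"(?::\d+)?"         # optional ':port', as logged by Cassandra 4.1.3
        + r" is now DOWN"
    )


pattern = death_pattern("127.0.0.1")

scylla_line = "127.0.0.1 is now DOWN"
cassandra_line = "127.0.0.1:7000 is now DOWN"

# Both log formats are detected by the same pattern.
matches = [bool(pattern.search(line)) for line in (scylla_line, cassandra_line)]
```

The optional port group is what lets one pattern cover both Scylla and Cassandra logs. Escaping the address and anchoring its start keeps the pattern from matching a different node, such as 127.0.0.10.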