scylladb / scylla-ccm

Cassandra Cluster Manager, modified for Scylla
Apache License 2.0
20 stars 60 forks source link

scylla_cluster, scylla_node: watch_log_for_alive before watch_rest_for_alive #564

Closed bhalevy closed 3 months ago

bhalevy commented 3 months ago

https://github.com/scylladb/scylla-ccm/pull/462 introduced watch_rest_for_alive that replaced the calls to watch_log_for_alive on the scylla node(s) start path.

But a node is killed and then restarted, and other nodes miss the kill event, watch_rest_for_alive will consider that node already as up as seen by the other nodes, while previously, watch_log_for_alive, waited until other nodes discovered this node as up again, based on markes taken right before (re)starting that node.

This change brings this call back.

Fixes scylladb/scylla-ccm#563

bhalevy commented 3 months ago
nyh commented 3 months ago

Oops, @fruch @roydahan it appears I was given commit permissions in the dtest repository, but not in the ccm repository. So someone else will need to merge this PR. I approved it.

fruch commented 3 months ago

I kind of agree with @nyh, that I don't really like it. and we should try to shy away from relying on the logs for this, (we are keep being accused by @kosta and other, why dtest and ccm depends on the logs, while this kind of yet again proved we don't have better alternatives, and no one agree even what they are)

anyhow this change affects all of the test, and we should do a sweep with gating for it before merging

fruch commented 3 months ago

we'll wait for the results:

kosta commented 3 months ago

we are keep being accused by @kosta

Not by me, do mean another Kosta?

bhalevy commented 3 months ago

we are keep being accused by @kosta

Not by me, do mean another Kosta?

@kostja :)

fruch commented 3 months ago

we are keep being accused by @kosta

Not by me, do mean another Kosta?

Sorry, now you have something to accuse me for, too.

fruch commented 3 months ago

we'll wait for the results:

Runs were green, merging it in

kostja commented 3 months ago

accuse or not, scylla-ccm should retire just like the java based nodetool which thanks to @denesb bit the dust. sorry.