Closed emerkle826 closed 3 years ago
Actually, I'm guessing it hasn't been updated yet because the build here isn't stable: https://doi-janky.infosiftr.net/job/update.sh/job/cassandra/
Ouch - looks like the problem is that the initialization which previously was able to complete successfully within 20 seconds is now taking significantly longer (at least twice as much time, and from the logs it appears to know it's only supposed to gossip with itself and then still waits at least 30 seconds before it gives up on other gossip nodes). Running just that one test with an adjusted retry.sh to give the startup more time took over a full minute on my local (quite fast CPU + NVMe disk) host.
The longer startup time is known and intentional in 4.0-beta4. tl;dr there are some new setting defaults that will cause the server to allocate tokens, waiting for a fixed interval before doing so. https://issues.apache.org/jira/browse/CASSANDRA-13701
This can be bypassed in a couple of ways:
- Disable allocate_tokens_for_local_replication_factor in cassandra.yaml, falling back on random token generation (not as effective at balancing with the new lower num_tokens).
- Shorten the delay by defining the system property cassandra.ring_delay_ms.
https://github.com/apache/cassandra/blob/5e8f7f591dfec5a61d8eb2e9e977ec29f3a2bbe4/src/java/org/apache/cassandra/service/StorageService.java#L152
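For illustration, here is a minimal sketch of passing that system property to the JVM. The helper name ring_delay_opt is hypothetical, and the commented docker invocation assumes the image's cassandra-env.sh appends JVM_EXTRA_OPTS to the JVM options and that a cassandra:4.0 tag is available; neither is confirmed in this thread.

```shell
# Hypothetical helper: build the JVM flag that overrides the ring delay.
# The argument is the delay in milliseconds.
ring_delay_opt() {
  printf -- '-Dcassandra.ring_delay_ms=%s' "$1"
}

# Possible usage (assumption: JVM_EXTRA_OPTS is honored by the image):
#   docker run -d -e JVM_EXTRA_OPTS="$(ring_delay_opt 0)" cassandra:4.0
```

A very low value is probably only reasonable for a throwaway single-node instance, since the delay exists to let gossip settle before tokens are allocated.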
However, each of the mitigation techniques above has implications. Maybe the best solution will be to relax the startup timeout here, and rely on the new default settings.
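As a rough sketch of what relaxing the startup timeout could look like, the function below retries a readiness check a configurable number of times. It is a hypothetical stand-in, not the retry.sh used by the official-images tests, and the container name and cqlsh query in the usage comment are assumptions.

```shell
# Hypothetical helper, not the actual retry.sh from official-images.
# Usage: retry_until TRIES SLEEP_SECONDS COMMAND [ARGS...]
# Returns 0 as soon as COMMAND succeeds, 1 if every attempt fails.
retry_until() {
  tries="$1"
  delay="$2"
  shift 2
  while [ "$tries" -gt 0 ]; do
    if "$@"; then
      return 0
    fi
    tries=$((tries - 1))
    # Only wait if another attempt is coming.
    if [ "$tries" -gt 0 ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Possible usage (assumed container name and query), allowing about two minutes:
#   retry_until 60 2 docker exec some-cassandra cqlsh -e 'DESCRIBE KEYSPACES'
```

Raising the number of attempts rather than the sleep interval keeps the test fast on hosts where startup is still quick.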
Thank you for the additional context, that's massively helpful! :heart:
In this case, it's a test for just some very minimal basics of a functioning single-server instance, so I've opted to adjust cassandra.ring_delay_ms just for the test in https://github.com/docker-library/official-images/pull/9491 (given we don't want to adjust any of the defaults in the image :sweat_smile:).
That test fix was merged and the updated images are published 👍
Cassandra 4.0-beta4 was released on December 31, 2020. Not sure what the process is to get the image updated on DockerHub.