scylladb / scylla-code-samples

Code samples for working with ScyllaDB
Apache License 2.0

scylladb cluster is not working with docker swarm at all (even by following step by step of their official page) #211

Open alidehghan opened 8 months ago

alidehghan commented 8 months ago

I want to start a 3-node cluster with all default configurations, following the steps you describe at https://university.scylladb.com/setup-a-scylla-cluster/, but the cluster failed to start. The log file is attached.

_scylla-node1_logs (1).txt

tzach commented 8 months ago

@guy9 FYI

benipeled commented 8 months ago

ERROR 2024-01-09 16:07:57,618 [shard 0:main] init - Startup failed: bad_configuration_error (std::exception)
2024-01-09 16:07:57,701 INFO exited: scylla (exit status 1; not expected)
Traceback (most recent call last):
  File "/opt/scylladb/scripts/libexec/scylla-housekeeping", line 196, in <module>
    args.func(args)
  File "/opt/scylladb/scripts/libexec/scylla-housekeeping", line 122, in check_version
    current_version = sanitize_version(get_api('/storage_service/scylla_release_version'))
                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/scylladb/scripts/libexec/scylla-housekeeping", line 80, in get_api
    return get_json_from_url("http://" + api_address + path)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/scylladb/scripts/libexec/scylla-housekeeping", line 75, in get_json_from_url
    raise RuntimeError(f'Failed to get "{path}" due to the following error: {retval}')
RuntimeError: Failed to get "http://localhost:10000/storage_service/scylla_release_version" due to the following error: <urlopen error [Errno 99] Cannot assign requested address>

@amnonh ^^ According to https://stackoverflow.com/a/59120631, we might want to use 127.0.0.1 instead of localhost. I'm not familiar with Docker Swarm; maybe a network adjustment is required to allow localhost:10000.
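
To illustrate the localhost vs. 127.0.0.1 idea, here is a minimal sketch. It assumes the `Errno 99` comes from how `localhost` resolves inside the container, which is not confirmed in this thread. `resolve_all` and `api_url` are hypothetical helper names and are not part of scylla-housekeeping; the port and path are taken from the traceback above.

```python
import socket


def resolve_all(host, port):
    """Return the distinct addresses `host` resolves to, in resolver order."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        address = sockaddr[0]
        if address not in addresses:
            addresses.append(address)
    return addresses


def api_url(path, host="127.0.0.1", port=10000):
    """Build the REST API URL from a literal IPv4 address instead of a hostname."""
    return f"http://{host}:{port}{path}"
```

Calling `resolve_all("localhost", 10000)` inside the container would show whether `localhost` maps to something other than 127.0.0.1 (for example `::1`), and `api_url("/storage_service/scylla_release_version")` gives the URL that skips name resolution entirely.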

benipeled commented 8 months ago

The log is also complaining about the following, but I'm not sure whether it's a real issue or a follow-up to the startup failure:

ERROR 2024-01-09 16:07:55,214 [shard 0:main] init - Bad configuration: consistent_cluster_management requires schema commit log to be enabled

amnonh commented 8 months ago

I don't know anything about Swarm, but could we have two different errors? Scylla fails to start for whatever reason, and scylla-housekeeping can't connect to Scylla (which couldn't start), but that is normal if there is no Scylla running.

I don't think that an error in scylla-housekeeping should have any effect on Scylla.
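
One way to separate the two errors is to order Scylla's own ERROR lines by timestamp and ignore the housekeeping traceback. This is only a sketch: `scylla_errors` is a hypothetical helper, the regular expression assumes the line layout seen in the excerpts quoted in this thread, and `SAMPLE` reuses only those quoted lines.

```python
import re

# Matches lines shaped like the ones quoted in this thread, e.g.
# ERROR 2024-01-09 16:07:55,214 [shard 0:main] init - Bad configuration: ...
ERROR_LINE = re.compile(
    r"^ERROR\s+(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"\[(?P<shard>[^\]]+)\]\s+(?P<component>\S+)\s+-\s+(?P<message>.*)$"
)


def scylla_errors(log_text):
    """Return (timestamp, component, message) for each ERROR line, oldest first."""
    found = []
    for line in log_text.splitlines():
        match = ERROR_LINE.match(line.strip())
        if match:
            found.append((match["ts"], match["component"], match["message"]))
    # The timestamps share one fixed-width format, so string order is time order.
    return sorted(found)


SAMPLE = """\
ERROR 2024-01-09 16:07:57,618 [shard 0:main] init - Startup failed: bad_configuration_error (std::exception)
2024-01-09 16:07:57,701 INFO exited: scylla (exit status 1; not expected)
RuntimeError: Failed to get "http://localhost:10000/storage_service/scylla_release_version" due to the following error: <urlopen error [Errno 99] Cannot assign requested address>
ERROR 2024-01-09 16:07:55,214 [shard 0:main] init - Bad configuration: consistent_cluster_management requires schema commit log to be enabled
"""

errors = scylla_errors(SAMPLE)
```

On the quoted lines, the earliest entry is the `Bad configuration` message at 16:07:55, about two seconds before `Startup failed`. That ordering fits the reading that the configuration error stops Scylla first and the housekeeping failure is a consequence of it.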