scylladb / scylla-ansible-roles

Ansible roles for deploying and managing Scylla, Scylla-Manager and Scylla-Monitoring

[scylla-node]: generate tokens: remove the requirement of having all cluster nodes be present in the inventory #419

Open vladzcloudius opened 13 hours ago

vladzcloudius commented 13 hours ago

HEAD: dc535978b9f5d97cf5c5d3ffa1448197d88294b5

Description

The code that generates tokens reads the existing token ring information as follows:

- name: Get existing tokens
  block:
    - name: Get tokens
      uri:
        url: "http://{{ scylla_api_address }}:{{ scylla_api_port }}/storage_service/tokens/{{ hostvars[item]['broadcast_address'] }}"
        method: GET
      register: _existing_tokens
      until: _existing_tokens.status == 200
      retries: 5
      delay: 1
      delegate_to: "{{ bootstrapped_node }}"
      loop: "{{ groups['scylla'] }}"

    - name: Copy tokens to tmp file
      lineinfile:
        path: "{{ tokens_file.path }}"
        line: "{{ hostvars[item.item]['broadcast_address'] }}={{ item.json | map('int') | join(',') }}"
        create: yes
      when: item.json|length > 0
      delegate_to: localhost
      loop: "{{ _existing_tokens.results }}"
  when: bootstrapped_node is defined

Both tasks above effectively require that the scylla group include every node in the cluster and that these nodes be reachable from the host where the role is invoked.

This requirement can be hard to meet in multi-cloud setups where each DC is provisioned in a different cloud, e.g. AWS and Azure.

If we run the role against an inventory/group that includes only the nodes of a single DC, then, in the context of the code above, we don't need to know anything about the other DCs' nodes except their tokens.

To get the list of all nodes' broadcast_address values without having access to the nodes themselves, one can query a live node that has already joined the cluster using the following REST API:

Once we have the list of all nodes in the cluster, the rest of the code above can remain the same: we would only iterate over that list instead of groups['scylla'], and use the items of that list instead of hostvars[item]['broadcast_address'] in the first task and hostvars[item.item]['broadcast_address'] in the second.
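
A minimal sketch of what that change could look like, not a tested patch. It assumes the broadcast addresses returned by the live node have already been collected into a list variable, here given the hypothetical name _cluster_node_addresses. How that variable is populated from the REST API response is deliberately left out. Everything else is copied from the tasks quoted above.

```yaml
# Sketch only: _cluster_node_addresses is a hypothetical list of broadcast
# addresses (strings) obtained beforehand from a live, already-joined node.
- name: Get existing tokens
  block:
    - name: Get tokens
      uri:
        # The loop item is now the broadcast address itself, so no hostvars lookup is needed.
        url: "http://{{ scylla_api_address }}:{{ scylla_api_port }}/storage_service/tokens/{{ item }}"
        method: GET
      register: _existing_tokens
      until: _existing_tokens.status == 200
      retries: 5
      delay: 1
      delegate_to: "{{ bootstrapped_node }}"
      loop: "{{ _cluster_node_addresses }}"

    - name: Copy tokens to tmp file
      lineinfile:
        path: "{{ tokens_file.path }}"
        # Each registered result keeps its loop item (the address) in item.item.
        line: "{{ item.item }}={{ item.json | map('int') | join(',') }}"
        create: yes
      when: item.json | length > 0
      delegate_to: localhost
      loop: "{{ _existing_tokens.results }}"
  when: bootstrapped_node is defined
```

With this shape, only bootstrapped_node has to be reachable from the control host; the other DCs' nodes are referenced purely by address.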

vladzcloudius commented 13 hours ago

cc @tarzanek