vitabaks / postgresql_cluster

PostgreSQL High-Availability Cluster (based on "Patroni" and DCS "etcd" or "consul"). Automating with Ansible.
MIT License

Consul service structure clarification needed #640

Closed: garry-t closed this issue 3 weeks ago

garry-t commented 3 weeks ago

What is the logic here? https://github.com/vitabaks/postgresql_cluster/blob/e66f2bb19c166990a57afd566f46de941a581340/vars/main.yml#L115-L129

If you run it as is, you get only the replica service on all pg nodes, because the name parameter is the same for both entries and the second service file overwrites the first.

What is the expected result here? I would expect all services to be registered on all nodes, with only those whose role matches being healthy.

Thanks

vitabaks commented 3 weeks ago

Have you already tried deploying a cluster with consul?

vitabaks commented 3 weeks ago

The replica service checks the /replica endpoint of the Patroni REST API; if it responds with 200, the check passes and the server is served through Consul DNS.

garry-t commented 3 weeks ago

I'm debugging the role. I have my own Consul cluster, so the role doesn't completely fit my needs. The consul role is mapped to my own, but that doesn't matter. I ran my consul role with your default params, and in my case all pg nodes have only replica service files and no master after the Consul installation.
This confused me. When I set it like this:

consul_services:
  - name: "{{ patroni_cluster_name }}-master"
    id: "{{ patroni_cluster_name }}-master"
    tags: ['master', 'primary']
    port: "{{ pgbouncer_listen_port }}"  # or "{{ postgresql_port }}" if pgbouncer_install: false
    checks:
      - { http: "http://{{ inventory_hostname }}:{{ patroni_restapi_port }}/primary", interval: "2s" }
      - { args: ["systemctl", "status", "pgbouncer"], interval: "5s" }  # comment out this check if pgbouncer_install: false
  - name: "{{ patroni_cluster_name }}-replica"
    id: "{{ patroni_cluster_name }}-replica"
    tags: ['replica']
    port: "{{ pgbouncer_listen_port }}"
    checks:
      - { http: "http://{{ inventory_hostname }}:{{ patroni_restapi_port }}/replica?lag={{ patroni_maximum_lag_on_replica }}", interval: "2s" }
      - { args: [ "systemctl", "status", "pgbouncer" ], interval: "5s" }

With this, I at least see all services registered in Consul, although in an unhealthy state, which is OK for me for now. So the question is still relevant: should I keep your approach, or is it a bug that we need to fix? I haven't used Patroni before, so I don't know what the correct structure should be.

vitabaks commented 3 weeks ago

When configuring service files, my automation uses id and not name. This makes it possible to create a service with the same name for both primary and replica, which differ only in their tags.
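A minimal sketch of what such a definition might look like, adapted from the snippet quoted earlier in this thread with only `name` changed. The exact contents of vars/main.yml at the linked commit are not reproduced here, so treat the values below as an assumption rather than the project's actual defaults:

```yaml
# Sketch only: adapted from the snippet earlier in this thread.
# Both entries share one `name`; only `id` and `tags` (and the checks) differ.
consul_services:
  - name: "{{ patroni_cluster_name }}"        # same service name for both roles
    id: "{{ patroni_cluster_name }}-master"   # unique id, so the service files do not collide
    tags: ['master', 'primary']
    port: "{{ pgbouncer_listen_port }}"
    checks:
      - { http: "http://{{ inventory_hostname }}:{{ patroni_restapi_port }}/primary", interval: "2s" }
  - name: "{{ patroni_cluster_name }}"
    id: "{{ patroni_cluster_name }}-replica"
    tags: ['replica']
    port: "{{ pgbouncer_listen_port }}"
    checks:
      - { http: "http://{{ inventory_hostname }}:{{ patroni_restapi_port }}/replica?lag={{ patroni_maximum_lag_on_replica }}", interval: "2s" }
```

With a layout like this, a role that writes one file per `name` would overwrite the first entry with the second, while one that writes one file per `id` keeps both. In Consul DNS, a tag can be used as a prefix of the service name (`<tag>.<name>.service.<domain>`), so the primary and the replicas can be resolved separately under the same service name.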

You can redefine the names of the services and make them different.

garry-t commented 3 weeks ago

OK, that makes sense, since the default consul role uses name. Closing for now.