hashicorp / consul

Consul is a distributed, highly available, and data center aware solution to connect and configure applications across dynamic, distributed infrastructure.
https://www.consul.io

Request for replication agents on server nodes #21198

Closed madson7 closed 5 months ago

madson7 commented 5 months ago

When registering a service, an error occurs and it is immediately unregistered.

cat << EOF | curl --request PUT --data @- http://127.0.0.1:8500/v1/catalog/register
{
  "Node": "server-01",
  "Address": "10.0.3.14",
  "Service": {
    "Service": "ANAC",
    "Tags": ["devops"],
    "Address": "10.0.3.14",
    "Port": 8300
  }
}
EOF

Feature Description

Request for replication of agents on server nodes, which would make it possible to scale the entry point dynamically in the swarm.

Follow the example

image

Use Case(s)

I'm trying to use the agent with ui_config enabled set to true so that I can register services through it. What I expect is that these services are registered on all the server nodes, which have data persistence, unlike the agents, which I can scale in the swarm without worries.

Docker Swarm

madson7 commented 5 months ago

networks:
  traefik-ingress:
    external: true
    name: traefik-ingress
services:
  agent:
    command:
    - /bin/sh
    - -ec
    - "apk add --update bind-tools\nsleep 20s\n\nIP_ARRAY=\"\"\nfor line in $$(dig\
      \ +short tasks.consul_server-01); do\n    if [ -z \"$$IP_ARRAY\" ]; then\n \
      \       IP_ARRAY=\"\\\"$$line\\\"\"\n    else\n        IP_ARRAY=\"$$IP_ARRAY,\
      \ \\\"$$line\\\"\"\n    fi\ndone\nfor line in $$(dig +short tasks.consul_server-02);\
      \ do\n    if [ -z \"$$IP_ARRAY\" ]; then\n        IP_ARRAY=\"\\\"$$line\\\"\"\
      \n    else\n        IP_ARRAY=\"$$IP_ARRAY, \\\"$$line\\\"\"\n    fi\ndone\n\
      for line in $$(dig +short tasks.consul_server-03); do\n    if [ -z \"$$IP_ARRAY\"\
      \ ]; then\n        IP_ARRAY=\"\\\"$$line\\\"\"\n    else\n        IP_ARRAY=\"\
      $$IP_ARRAY, \\\"$$line\\\"\"\n    fi\ndone\n\ncat << EOF > /consul/config/server.json\n\
      {\n  \"node_name\": \"$$(hostname)\",\n  \"ui_config\": {\n    \"enabled\":\
      \ true\n  },\n  \"datacenter\":\"$${DATACENTER}\",\n  \"data_dir\": \"/consul/data\"\
      ,\n  \"addresses\": {\n    \"http\": \"0.0.0.0\"\n  },\n  \"retry_join\": [$$IP_ARRAY],\n\
      \  \"encrypt\": \"$$ENCRYPT\",\n  \"verify_incoming\": false,\n  \"verify_outgoing\"\
      : false,\n  \"verify_server_hostname\": false\n}\nEOF\n/bin/consul agent -bind=\"\
      {{ GetInterfaceIP \\\"eth0\\\" }}\" -config-dir /consul/config/server.json\n"
    depends_on:
      server-01:
        condition: service_started
      server-02:
        condition: service_started
      server-03:
        condition: service_started
    deploy:
      labels:
        traefik.constraint-label: traefik-public
        traefik.docker.network: traefik-ingress
        traefik.enable: "true"
        traefik.http.middlewares.consul-auth.basicauth.users: admin:$$apr1$$FIQc4sjm$$30fTH0ofGENEU64ntjZ6B0
        traefik.http.middlewares.https.redirectscheme.permanent: "true"
        traefik.http.middlewares.https.redirectscheme.scheme: https
        traefik.http.routers.consul-http.entrypoints: http
        traefik.http.routers.consul-http.middlewares: https
        traefik.http.routers.consul-http.rule: Host(`consul-swarm.ramos`)
        traefik.http.routers.consul-https.entrypoints: https
        traefik.http.routers.consul-https.middlewares: consul-auth
        traefik.http.routers.consul-https.rule: Host(`consul-swarm.ramos`)
        traefik.http.routers.consul-https.tls: "true"
        traefik.http.services.consul.loadbalancer.healthcheck.interval: 10s
        traefik.http.services.consul.loadbalancer.healthcheck.path: /
        traefik.http.services.consul.loadbalancer.healthcheck.timeout: 15s
        traefik.http.services.consul.loadbalancer.server.port: '8500'
      mode: replicated
      placement:
        max_replicas_per_node: 1
      replicas: 3
    environment:
      BOOTSTRAP_EXPECT: '3'
      CONSUL_HTTP_TOKEN: 32e3ed4d-93ba-44f9-a444-5a010b512528
      DATACENTER: atd
      ENCRYPT: aPuGh+5UDskRAbkLaXSzFoSOcSM+5vAK+NEYOWHJH7w=
    hostname: agent
    image: hashicorp/consul:1.18.0
    networks:
      default: {}
      traefik-ingress: {}
  server-01:
    command:
    - /bin/sh
    - -ec
    - "apk add --update bind-tools\nsleep 20s\n\nIP_ARRAY=\"\"\nfor line in $$(dig\
      \ +short tasks.consul_server-01); do\n    if [ -z \"$$IP_ARRAY\" ]; then\n \
      \       IP_ARRAY=\"\\\"$$line\\\"\"\n    else\n        IP_ARRAY=\"$$IP_ARRAY,\
      \ \\\"$$line\\\"\"\n    fi\ndone\nfor line in $$(dig +short tasks.consul_server-02);\
      \ do\n    if [ -z \"$$IP_ARRAY\" ]; then\n        IP_ARRAY=\"\\\"$$line\\\"\"\
      \n    else\n        IP_ARRAY=\"$$IP_ARRAY, \\\"$$line\\\"\"\n    fi\ndone\n\
      for line in $$(dig +short tasks.consul_server-03); do\n    if [ -z \"$$IP_ARRAY\"\
      \ ]; then\n        IP_ARRAY=\"\\\"$$line\\\"\"\n    else\n        IP_ARRAY=\"\
      $$IP_ARRAY, \\\"$$line\\\"\"\n    fi\ndone\n\ncat << EOF > /consul/config/server.json\n\
      {\n    \"node_name\": \"$$(hostname)\",\n    \"server\": true,\n    \"ui_config\"\
      : {\n        \"enabled\": false\n    },\n    \"datacenter\":\"$${DATACENTER}\"\
      ,\n    \"data_dir\": \"/consul/data\",\n    \"addresses\": {\n        \"http\"\
      : \"0.0.0.0\"\n    },\n    \"retry_join\": [$$IP_ARRAY],\n    \"encrypt\": \"\
      $$ENCRYPT\",\n    \"verify_incoming\": false,\n    \"verify_outgoing\": false,\n\
      \    \"verify_server_hostname\": false\n}\nEOF\n\n/bin/consul agent -bootstrap-expect=$${BOOTSTRAP_EXPECT}\
      \ -bind=\"{{ GetInterfaceIP \\\"eth0\\\" }}\" -config-dir /consul/config/server.json\n"
    deploy:
      mode: replicated
      placement:
        constraints:
        - node.labels.consul.server-01 == true
        max_replicas_per_node: 1
      replicas: 1
    environment:
      BOOTSTRAP_EXPECT: '3'
      CONSUL_HTTP_TOKEN: 32e3ed4d-93ba-44f9-a444-5a010b512528
      DATACENTER: atd
      ENCRYPT: aPuGh+5UDskRAbkLaXSzFoSOcSM+5vAK+NEYOWHJH7w=
    hostname: server-01
    image: hashicorp/consul:1.18.0
    networks:
      default: {}
    volumes:
    - consul_data-01:/consul/data:rw
  server-02:
    command:
    - /bin/sh
    - -ec
    - "apk add --update bind-tools\nsleep 20s\n\nIP_ARRAY=\"\"\nfor line in $$(dig\
      \ +short tasks.consul_server-01); do\n    if [ -z \"$$IP_ARRAY\" ]; then\n \
      \       IP_ARRAY=\"\\\"$$line\\\"\"\n    else\n        IP_ARRAY=\"$$IP_ARRAY,\
      \ \\\"$$line\\\"\"\n    fi\ndone\nfor line in $$(dig +short tasks.consul_server-02);\
      \ do\n    if [ -z \"$$IP_ARRAY\" ]; then\n        IP_ARRAY=\"\\\"$$line\\\"\"\
      \n    else\n        IP_ARRAY=\"$$IP_ARRAY, \\\"$$line\\\"\"\n    fi\ndone\n\
      for line in $$(dig +short tasks.consul_server-03); do\n    if [ -z \"$$IP_ARRAY\"\
      \ ]; then\n        IP_ARRAY=\"\\\"$$line\\\"\"\n    else\n        IP_ARRAY=\"\
      $$IP_ARRAY, \\\"$$line\\\"\"\n    fi\ndone\n\ncat << EOF > /consul/config/server.json\n\
      {\n    \"node_name\": \"$$(hostname)\",\n    \"server\": true,\n    \"ui_config\"\
      : {\n        \"enabled\": false\n    },\n    \"datacenter\":\"$${DATACENTER}\"\
      ,\n    \"data_dir\": \"/consul/data\",\n    \"addresses\": {\n        \"http\"\
      : \"0.0.0.0\"\n    },\n    \"retry_join\": [$$IP_ARRAY],\n    \"encrypt\": \"\
      $$ENCRYPT\",\n    \"verify_incoming\": false,\n    \"verify_outgoing\": false,\n\
      \    \"verify_server_hostname\": false\n}\nEOF\n\n/bin/consul agent -bootstrap-expect=$${BOOTSTRAP_EXPECT}\
      \ -bind=\"{{ GetInterfaceIP \\\"eth0\\\" }}\" -config-dir /consul/config/server.json\n"
    deploy:
      mode: replicated
      placement:
        constraints:
        - node.labels.consul.server-02 == true
        max_replicas_per_node: 1
      replicas: 1
    environment:
      BOOTSTRAP_EXPECT: '3'
      CONSUL_HTTP_TOKEN: 32e3ed4d-93ba-44f9-a444-5a010b512528
      DATACENTER: atd
      ENCRYPT: aPuGh+5UDskRAbkLaXSzFoSOcSM+5vAK+NEYOWHJH7w=
    hostname: server-02
    image: hashicorp/consul:1.18.0
    networks:
      default: {}
    volumes:
    - consul_data-02:/consul/data:rw
  server-03:
    command:
    - /bin/sh
    - -ec
    - "apk add --update bind-tools\nsleep 20s\n\nIP_ARRAY=\"\"\nfor line in $$(dig\
      \ +short tasks.consul_server-01); do\n    if [ -z \"$$IP_ARRAY\" ]; then\n \
      \       IP_ARRAY=\"\\\"$$line\\\"\"\n    else\n        IP_ARRAY=\"$$IP_ARRAY,\
      \ \\\"$$line\\\"\"\n    fi\ndone\nfor line in $$(dig +short tasks.consul_server-02);\
      \ do\n    if [ -z \"$$IP_ARRAY\" ]; then\n        IP_ARRAY=\"\\\"$$line\\\"\"\
      \n    else\n        IP_ARRAY=\"$$IP_ARRAY, \\\"$$line\\\"\"\n    fi\ndone\n\
      for line in $$(dig +short tasks.consul_server-03); do\n    if [ -z \"$$IP_ARRAY\"\
      \ ]; then\n        IP_ARRAY=\"\\\"$$line\\\"\"\n    else\n        IP_ARRAY=\"\
      $$IP_ARRAY, \\\"$$line\\\"\"\n    fi\ndone\n\ncat << EOF > /consul/config/server.json\n\
      {\n    \"node_name\": \"$$(hostname)\",\n    \"server\": true,\n    \"ui_config\"\
      : {\n        \"enabled\": false\n    },\n    \"datacenter\":\"$${DATACENTER}\"\
      ,\n    \"data_dir\": \"/consul/data\",\n    \"addresses\": {\n        \"http\"\
      : \"0.0.0.0\"\n    },\n    \"retry_join\": [$$IP_ARRAY],\n    \"encrypt\": \"\
      $$ENCRYPT\",\n    \"verify_incoming\": false,\n    \"verify_outgoing\": false,\n\
      \    \"verify_server_hostname\": false\n}\nEOF\n\n/bin/consul agent -bootstrap-expect=$${BOOTSTRAP_EXPECT}\
      \ -bind=\"{{ GetInterfaceIP \\\"eth0\\\" }}\" -config-dir /consul/config/server.json\n"
    deploy:
      mode: replicated
      placement:
        constraints:
        - node.labels.consul.server-03 == true
        max_replicas_per_node: 1
      replicas: 1
    environment:
      BOOTSTRAP_EXPECT: '3'
      CONSUL_HTTP_TOKEN: 32e3ed4d-93ba-44f9-a444-5a010b512528
      DATACENTER: atd
      ENCRYPT: aPuGh+5UDskRAbkLaXSzFoSOcSM+5vAK+NEYOWHJH7w=
    hostname: server-03
    image: hashicorp/consul:1.18.0
    networks:
      default: {}
    volumes:
    - consul_data-03:/consul/data:rw
version: '3.9'
volumes:
  consul_data-01:
    driver: glusterfs:latest
    name: swarm/volumes/consul/consul_data-01
  consul_data-02:
    driver: glusterfs:latest
    name: swarm/volumes/consul/consul_data-02
  consul_data-03:
    driver: glusterfs:latest
    name: swarm/volumes/consul/consul_data-03

blake commented 5 months ago

Hi @madson7, the service registration is being removed by Consul's anti-entropy mechanism.

Consul has a clear separation between the global service catalog and the agent's local state as discussed above. The anti-entropy mechanism reconciles these two views of the world: anti-entropy is a synchronization of the local agent state and the catalog. … During this synchronization, the catalog is also checked for correctness. If any services or checks exist in the catalog that the agent is not aware of, they will be automatically removed to make the catalog reflect the proper set of services and health information for that agent. Consul treats the state of the agent as authoritative; if there are any differences between the agent and catalog view, the agent-local view will always be used.
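To make the reconciliation rule concrete, here is a minimal sketch that works on sample data only. The helper name catalog_only_services and the sample service lists are hypothetical; the function just prints the service IDs that appear in the catalog for a node but are unknown to that node's agent, which are the entries anti-entropy would remove.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only.
# $1 = service IDs known to the agent (newline-separated)
# $2 = service IDs listed in the catalog for the same node (newline-separated)
catalog_only_services() {
  agent_file=$(mktemp)
  catalog_file=$(mktemp)
  printf '%s\n' "$1" | sort -u > "$agent_file"
  printf '%s\n' "$2" | sort -u > "$catalog_file"
  # comm -13 keeps only the lines that are unique to the second file,
  # i.e. services the catalog has but the agent does not know about.
  comm -13 "$agent_file" "$catalog_file"
  rm -f "$agent_file" "$catalog_file"
}

# Sample data (assumed): the agent knows "web"; the catalog also holds "ANAC",
# which was written straight into the catalog under this agent's node name.
agent_services="web"
catalog_services="web
ANAC"

catalog_only_services "$agent_services" "$catalog_services"
```

With this sample input the only catalog-only entry is ANAC, matching the behaviour reported in this issue.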

You are registering a service in the catalog and associating it with an existing agent, server-01. The service is being removed because that agent is not aware of the service.

(See https://developer.hashicorp.com/consul/api-docs/agent/service for registering services directly to an agent.)
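As a rough sketch of that alternative, the same service could be registered through the agent endpoint, /v1/agent/service/register, so that the agent owns it and anti-entropy keeps it in the catalog. The function names and the CONSUL_URL variable are hypothetical, and the address assumes a local agent on 127.0.0.1:8500 as in the original command. Note that the agent API names the service with "Name" rather than "Service".

```shell
#!/bin/sh
# Assumption: the agent that should own the service listens here.
CONSUL_URL="${CONSUL_URL:-http://127.0.0.1:8500}"

# Print the payload for the agent service API.
agent_service_payload() {
  cat << EOF
{
  "Name": "ANAC",
  "Tags": ["devops"],
  "Address": "10.0.3.14",
  "Port": 8300
}
EOF
}

# Send the payload to the agent; the service is then attached to that
# agent's own node rather than to an arbitrary node name.
register_with_agent() {
  agent_service_payload | curl --fail --silent --show-error \
    --request PUT --data @- "$CONSUL_URL/v1/agent/service/register"
}

# register_with_agent   # uncomment to send the request to a running agent
```

A service registered this way lives and dies with the agent that holds it, so it may not suit agents that are scaled up and down freely in the swarm.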

To register a service directly in the catalog without having it removed during the anti-entropy sync, you need to associate the service with a node name that does not belong to any agent in the cluster. For example, replace server-01 with virtual-node-01:

$ cat << EOF | curl --request PUT --data @- http://127.0.0.1:8500/v1/catalog/register
{
  "Node": "virtual-node-01",
  "Address": "10.0.3.14",
  "Service": {
    "Service": "ANAC",
    "Tags": ["devops"],
    "Address": "10.0.3.14",
    "Port": 8300
  }
}
EOF
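
Since no agent owns virtual-node-01, anti-entropy will not clean this entry up either, so it has to be removed explicitly when it is no longer needed. The sketch below shows one way to check the registration and to deregister the node through the catalog API; the function names and the CONSUL_URL variable are hypothetical, and the address assumes the same local endpoint as above.

```shell
#!/bin/sh
# Assumption: same local HTTP endpoint as in the register command.
CONSUL_URL="${CONSUL_URL:-http://127.0.0.1:8500}"

# List the catalog entries for the ANAC service to confirm it is still there.
list_anac() {
  curl --fail --silent --show-error "$CONSUL_URL/v1/catalog/service/ANAC"
}

# Build a deregister payload for a node name passed as $1.
deregister_payload() {
  printf '{"Node": "%s"}\n' "$1"
}

# Passing only "Node" removes the node together with its services and checks.
deregister_node() {
  deregister_payload "$1" | curl --fail --silent --show-error \
    --request PUT --data @- "$CONSUL_URL/v1/catalog/deregister"
}

# list_anac                          # uncomment to query a running cluster
# deregister_node virtual-node-01    # uncomment to remove the virtual node
```
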
madson7 commented 5 months ago

Hi @blake, this was the solution I ended up using.