portainer / agent

The Portainer agent
https://www.portainer.io
zlib License

portainer/agent 1.5.1 Fails #107

Open alphaDev23 opened 4 years ago

alphaDev23 commented 4 years ago

Below is the stack file and full logs. Using portainer:1.23.0 as the server. Has the agent been tested?

cat agent-stack.yml

version: '3.2'

services:
  agent:
    image: portainer/agent
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - /var/lib/docker/volumes:/var/lib/docker/volumes
    ports:
      - target: 9001
        published: 9001
        protocol: tcp
        mode: host
    networks:
      - portainer_agent
    deploy:
      mode: global
      placement:
        constraints: [node.platform.os == linux]

networks:
  portainer_agent:
    driver: overlay
    attachable: true
# docker logs portainer-agent_agent.wm6dl7l5f6okklhpc618cjx42.ecul5keswrg0nnzbyvs0xjvtx
2019/12/27 04:20:54 [INFO] [main] [message: Agent running on a Swarm cluster node. Running in cluster mode]
2019/12/27 04:20:57 [INFO] serf: EventMemberJoin: omicron-16-cluster-xbnit5ov34pq-primary-master-0.novalocal-b9b6107c394a 10.0.0.8
2019/12/27 04:20:58 [INFO] [http] [server_addr: 0.0.0.0] [server_port: 9001] [secured: true] [api_version: 1.5.1] [message: Starting Agent API server]
2019/12/27 04:21:15 [INFO] serf: EventMemberJoin: omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b 10.0.0.6
2019/12/27 04:21:15 [INFO] serf: EventMemberJoin: omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-8fdcd019d42a 10.0.0.7
2019/12/27 04:26:15 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-8fdcd019d42a has failed, no acks received
2019/12/27 04:26:53 [ERR] memberlist: Failed fallback ping: read tcp 10.0.0.8:48302->10.0.0.6:7946: i/o timeout
2019/12/27 04:26:53 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b has failed, no acks received
2019/12/27 04:26:55 [ERR] memberlist: Failed fallback ping: read tcp 10.0.0.8:48306->10.0.0.6:7946: i/o timeout
2019/12/27 04:26:55 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b has failed, no acks received
2019/12/27 04:26:56 [INFO] memberlist: Marking omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b as failed, suspect timeout reached (0 peer confirmations)
2019/12/27 04:26:56 [INFO] serf: EventMemberFailed: omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b 10.0.0.6
2019/12/27 04:26:59 [WARN] memberlist: Refuting a suspect message (from: omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b)
2019/12/27 04:26:59 [INFO] serf: EventMemberJoin: omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b 10.0.0.6
2019/12/27 04:27:06 [ERR] memberlist: Failed fallback ping: read tcp 10.0.0.8:48336->10.0.0.6:7946: i/o timeout
2019/12/27 04:27:06 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b has failed, no acks received
2019/12/27 04:27:08 [ERR] memberlist: Failed fallback ping: read tcp 10.0.0.8:48340->10.0.0.6:7946: i/o timeout
2019/12/27 04:27:08 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b has failed, no acks received
2019/12/27 04:36:11 [WARN] memberlist: Refuting a suspect message (from: omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-8fdcd019d42a)
2019/12/27 04:36:11 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-8fdcd019d42a has failed, no acks received
2019/12/27 04:36:14 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-8fdcd019d42a has failed, no acks received
2019/12/27 04:36:15 [INFO] memberlist: Marking omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-8fdcd019d42a as failed, suspect timeout reached (0 peer confirmations)
2019/12/27 04:36:15 [INFO] serf: EventMemberFailed: omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-8fdcd019d42a 10.0.0.7
2019/12/27 04:36:17 [WARN] memberlist: Was able to connect to omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-8fdcd019d42a but other probes failed, network may be misconfigured
2019/12/27 04:36:17 [INFO] serf: attempting reconnect to omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-8fdcd019d42a 10.0.0.7:7946
2019/12/27 04:36:17 [INFO] serf: EventMemberJoin: omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-8fdcd019d42a 10.0.0.7
2019/12/27 04:39:16 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-8fdcd019d42a has failed, no acks received
2019/12/27 04:41:20 [WARN] memberlist: Refuting a suspect message (from: omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-8fdcd019d42a)
2019/12/27 04:46:15 [WARN] memberlist: Refuting a suspect message (from: omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-8fdcd019d42a)
2019/12/27 04:46:18 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-8fdcd019d42a has failed, no acks received
2019/12/27 04:46:19 [INFO] serf: EventMemberFailed: omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-8fdcd019d42a 10.0.0.7
2019/12/27 04:46:20 [INFO] serf: attempting reconnect to omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-8fdcd019d42a 10.0.0.7:7946
2019/12/27 04:46:37 [INFO] serf: EventMemberJoin: omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-8fdcd019d42a 10.0.0.7
2019/12/27 04:50:57 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-8fdcd019d42a has failed, no acks received
2019/12/27 04:50:58 [WARN] memberlist: Refuting a suspect message (from: omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-8fdcd019d42a)
2019/12/27 04:50:59 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-8fdcd019d42a has failed, no acks received
2019/12/27 04:51:02 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-8fdcd019d42a has failed, no acks received
2019/12/27 04:51:12 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b has failed, no acks received
2019/12/27 04:57:07 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-8fdcd019d42a has failed, no acks received
2019/12/27 04:57:10 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-8fdcd019d42a has failed, no acks received
2019/12/27 04:57:11 [INFO] memberlist: Marking omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-8fdcd019d42a as failed, suspect timeout reached (0 peer confirmations)
2019/12/27 04:57:11 [INFO] serf: EventMemberFailed: omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-8fdcd019d42a 10.0.0.7
2019/12/27 04:57:12 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-8fdcd019d42a has failed, no acks received
2019/12/27 04:57:28 [WARN] memberlist: Refuting a suspect message (from: omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-8fdcd019d42a)
2019/12/27 04:57:46 [INFO] serf: EventMemberJoin: omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-8fdcd019d42a 10.0.0.7
2019/12/27 05:04:08 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b has failed, no acks received
2019/12/27 05:04:10 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b has failed, no acks received
2019/12/27 05:04:12 [INFO] memberlist: Marking omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b as failed, suspect timeout reached (0 peer confirmations)
2019/12/27 05:04:12 [INFO] serf: EventMemberFailed: omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b 10.0.0.6
2019/12/27 05:04:17 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-8fdcd019d42a has failed, no acks received
2019/12/27 05:04:17 [INFO] serf: EventMemberJoin: omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b 10.0.0.6
2019/12/27 05:04:17 [WARN] memberlist: Refuting a suspect message (from: omicron-16-cluster-xbnit5ov34pq-primary-master-0.novalocal-b9b6107c394a)
2019/12/27 05:06:25 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-8fdcd019d42a has failed, no acks received
2019/12/27 05:06:28 [INFO] memberlist: Marking omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-8fdcd019d42a as failed, suspect timeout reached (0 peer confirmations)
2019/12/27 05:06:28 [INFO] serf: EventMemberFailed: omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-8fdcd019d42a 10.0.0.7
2019/12/27 05:06:28 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-8fdcd019d42a has failed, no acks received
2019/12/27 05:06:30 [INFO] serf: attempting reconnect to omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-8fdcd019d42a 10.0.0.7:7946
2019/12/27 05:06:50 [INFO] serf: attempting reconnect to omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-8fdcd019d42a 10.0.0.7:7946
2019/12/27 05:07:10 [INFO] serf: attempting reconnect to omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-8fdcd019d42a 10.0.0.7:7946
2019/12/27 05:07:42 [INFO] serf: EventMemberReap: omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-8fdcd019d42a
2019/12/27 05:18:43 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b has failed, no acks received
2019/12/27 05:18:46 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b has failed, no acks received
2019/12/27 05:18:47 [INFO] memberlist: Marking omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b as failed, suspect timeout reached (0 peer confirmations)
2019/12/27 05:18:47 [INFO] serf: EventMemberFailed: omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b 10.0.0.6
2019/12/27 05:18:49 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b has failed, no acks received
2019/12/27 05:18:50 [INFO] serf: attempting reconnect to omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b 10.0.0.6:7946
2019/12/27 05:18:50 [INFO] serf: EventMemberJoin: omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b 10.0.0.6
2019/12/27 05:25:07 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b has failed, no acks received
2019/12/27 05:25:09 [WARN] memberlist: Refuting a suspect message (from: omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b)
2019/12/27 05:26:54 [INFO] serf: EventMemberJoin: omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-f830ebec2622 10.0.0.11
2019/12/27 05:28:06 [WARN] memberlist: Refuting a suspect message (from: omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-f830ebec2622)
2019/12/27 05:28:06 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-f830ebec2622 has failed, no acks received
2019/12/27 05:28:11 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-f830ebec2622 has failed, no acks received
2019/12/27 05:28:13 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-f830ebec2622 has failed, no acks received
2019/12/27 05:28:15 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-f830ebec2622 has failed, no acks received
2019/12/27 05:28:15 [INFO] memberlist: Marking omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-f830ebec2622 as failed, suspect timeout reached (0 peer confirmations)
2019/12/27 05:28:15 [INFO] serf: EventMemberFailed: omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-f830ebec2622 10.0.0.11
2019/12/27 05:28:17 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-f830ebec2622 has failed, no acks received
2019/12/27 05:28:32 [INFO] serf: attempting reconnect to omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-f830ebec2622 10.0.0.11:7946
2019/12/27 05:28:52 [INFO] serf: attempting reconnect to omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-f830ebec2622 10.0.0.11:7946
2019/12/27 05:29:27 [INFO] serf: EventMemberReap: omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-f830ebec2622
2019/12/27 05:39:01 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b has failed, no acks received
2019/12/27 05:44:29 [WARN] memberlist: Refuting a suspect message (from: omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b)
2019/12/27 05:48:54 [WARN] memberlist: Refuting a suspect message (from: omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b)
2019/12/27 05:48:54 [ERR] memberlist: Failed fallback ping: read tcp 10.0.0.8:32950->10.0.0.6:7946: i/o timeout
2019/12/27 05:48:54 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b has failed, no acks received
2019/12/27 05:48:57 [ERR] memberlist: Failed fallback ping: read tcp 10.0.0.8:32956->10.0.0.6:7946: i/o timeout
2019/12/27 05:48:57 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b has failed, no acks received
2019/12/27 05:48:58 [INFO] memberlist: Marking omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b as failed, suspect timeout reached (0 peer confirmations)
2019/12/27 05:48:58 [INFO] serf: EventMemberFailed: omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b 10.0.0.6
2019/12/27 05:49:02 [INFO] serf: attempting reconnect to omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b 10.0.0.6:7946
2019/12/27 05:49:02 [ERR] memberlist: Failed fallback ping: read tcp 10.0.0.8:32976->10.0.0.6:7946: i/o timeout
2019/12/27 05:49:02 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b has failed, no acks received
2019/12/27 05:49:11 [WARN] memberlist: Refuting a dead message (from: omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b)
2019/12/27 05:49:11 [INFO] serf: EventMemberJoin: omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b 10.0.0.6
2019/12/27 05:50:21 http error: Unable to proxy the request via the Docker socket (err=context canceled) (code=500)
2019/12/27 05:53:35 [WARN] memberlist: Refuting a suspect message (from: omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b)
2019/12/27 05:55:33 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b has failed, no acks received
2019/12/27 05:55:33 [WARN] memberlist: Refuting a suspect message (from: omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b)
2019/12/27 05:59:31 [ERR] memberlist: Failed fallback ping: read tcp 10.0.0.8:35796->10.0.0.6:7946: i/o timeout
2019/12/27 05:59:31 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b has failed, no acks received
2019/12/27 05:59:33 [ERR] memberlist: Failed fallback ping: read tcp 10.0.0.8:35802->10.0.0.6:7946: i/o timeout
2019/12/27 05:59:33 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b has failed, no acks received
2019/12/27 05:59:35 [INFO] memberlist: Marking omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b as failed, suspect timeout reached (0 peer confirmations)
2019/12/27 05:59:35 [INFO] serf: EventMemberFailed: omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b 10.0.0.6
2019/12/27 05:59:37 [ERR] memberlist: Failed fallback ping: read tcp 10.0.0.8:35816->10.0.0.6:7946: i/o timeout
2019/12/27 05:59:37 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b has failed, no acks received
2019/12/27 05:59:43 [INFO] serf: attempting reconnect to omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b 10.0.0.6:7946
2019/12/27 05:59:48 [INFO] serf: EventMemberJoin: omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b 10.0.0.6
2019/12/27 06:08:46 [ERR] memberlist: Failed fallback ping: read tcp 10.0.0.8:38232->10.0.0.6:7946: i/o timeout
2019/12/27 06:08:46 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b has failed, no acks received
2019/12/27 06:08:49 [ERR] memberlist: Failed fallback ping: read tcp 10.0.0.8:38242->10.0.0.6:7946: i/o timeout
2019/12/27 06:08:49 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b has failed, no acks received
2019/12/27 06:08:50 [INFO] memberlist: Marking omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b as failed, suspect timeout reached (0 peer confirmations)
2019/12/27 06:08:50 [INFO] serf: EventMemberFailed: omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b 10.0.0.6
2019/12/27 06:08:51 [INFO] serf: attempting reconnect to omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b 10.0.0.6:7946
2019/12/27 06:08:53 [ERR] memberlist: Failed fallback ping: read tcp 10.0.0.8:38258->10.0.0.6:7946: i/o timeout
2019/12/27 06:08:53 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b has failed, no acks received
2019/12/27 06:09:11 [INFO] serf: attempting reconnect to omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b 10.0.0.6:7946
2019/12/27 06:09:13 [INFO] serf: EventMemberJoin: omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b 10.0.0.6
2019/12/27 06:15:40 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b has failed, no acks received
2019/12/27 06:15:43 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b has failed, no acks received
2019/12/27 06:15:44 [INFO] memberlist: Marking omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b as failed, suspect timeout reached (0 peer confirmations)
2019/12/27 06:15:44 [INFO] serf: EventMemberFailed: omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b 10.0.0.6
2019/12/27 06:15:47 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b has failed, no acks received
2019/12/27 06:15:54 [INFO] serf: attempting reconnect to omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b 10.0.0.6:7946
2019/12/27 06:15:58 [INFO] serf: EventMemberJoin: omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-202935ee55ad 10.0.0.16
2019/12/27 06:16:14 [INFO] serf: attempting reconnect to omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b 10.0.0.6:7946
2019/12/27 06:16:44 [INFO] serf: attempting reconnect to omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b 10.0.0.6:7946
2019/12/27 06:16:49 [INFO] serf: EventMemberReap: omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b
2019/12/27 06:18:56 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-202935ee55ad has failed, no acks received
2019/12/27 06:22:52 [WARN] memberlist: Refuting a suspect message (from: omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-202935ee55ad)
2019/12/27 06:23:02 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-202935ee55ad has failed, no acks received
2019/12/27 06:23:05 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-202935ee55ad has failed, no acks received
2019/12/27 06:23:06 [INFO] memberlist: Marking omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-202935ee55ad as failed, suspect timeout reached (0 peer confirmations)
2019/12/27 06:23:06 [INFO] serf: EventMemberFailed: omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-202935ee55ad 10.0.0.16
2019/12/27 06:23:23 [WARN] memberlist: Refuting a suspect message (from: omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-202935ee55ad)
2019/12/27 06:23:23 [ERR] memberlist: Failed fallback ping: write tcp 10.0.0.8:50522->10.0.0.16:7946: i/o timeout
2019/12/27 06:23:23 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-202935ee55ad has failed, no acks received
2019/12/27 06:23:23 [INFO] serf: attempting reconnect to omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-202935ee55ad 10.0.0.16:7946
2019/12/27 06:23:23 [INFO] serf: EventMemberJoin: omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-202935ee55ad 10.0.0.16
2019/12/27 06:24:35 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-202935ee55ad has failed, no acks received
2019/12/27 06:24:37 [WARN] memberlist: Refuting a suspect message (from: omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-202935ee55ad)
2019/12/27 06:24:38 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-202935ee55ad has failed, no acks received
2019/12/27 06:24:39 [INFO] memberlist: Marking omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-202935ee55ad as failed, suspect timeout reached (0 peer confirmations)
2019/12/27 06:24:39 [INFO] serf: EventMemberFailed: omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-202935ee55ad 10.0.0.16
2019/12/27 06:24:40 [INFO] serf: EventMemberJoin: omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-202935ee55ad 10.0.0.16
2019/12/27 06:24:43 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-202935ee55ad has failed, no acks received
2019/12/27 06:24:45 [WARN] memberlist: Refuting a suspect message (from: omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-202935ee55ad)
2019/12/27 06:24:49 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-202935ee55ad has failed, no acks received
2019/12/27 06:24:53 [INFO] memberlist: Marking omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-202935ee55ad as failed, suspect timeout reached (0 peer confirmations)
2019/12/27 06:24:53 [INFO] serf: EventMemberFailed: omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-202935ee55ad 10.0.0.16
2019/12/27 06:24:55 [INFO] serf: EventMemberJoin: omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-202935ee55ad 10.0.0.16
2019/12/27 06:31:46 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-202935ee55ad has failed, no acks received
2019/12/27 06:31:48 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-202935ee55ad has failed, no acks received
2019/12/27 06:31:50 [INFO] memberlist: Marking omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-202935ee55ad as failed, suspect timeout reached (0 peer confirmations)
2019/12/27 06:31:50 [INFO] serf: EventMemberFailed: omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-202935ee55ad 10.0.0.16
2019/12/27 06:31:50 [INFO] serf: EventMemberJoin: omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-202935ee55ad 10.0.0.16
2019/12/27 06:31:53 [WARN] memberlist: Refuting a suspect message (from: omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-202935ee55ad)
2019/12/27 06:31:54 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-0.novalocal-202935ee55ad has failed, no acks received
deviantony commented 4 years ago

Hi @alphaDev23

Yes, it has been tested. This sounds like a network instability issue in your environment. Has the Swarm been set up with a specific configuration?

We've found that using the --advertise-addr flag during the swarm init and swarm join steps when creating the cluster leads to more stability.

It could also be related to https://github.com/portainer/agent/pull/102 (relevant in some high-latency environments), which we're currently reviewing. That PR might help.

alphaDev23 commented 4 years ago

Swarm was set up using OpenStack Magnum. 1.5.0 appears to be more stable. Is there a difference between 1.5.0 and 1.5.1 that could account for the difference in network stability?

Separately, although it may be related: you mentioned that the agent was planning to move to ingress ports, but I see that the stack file defines host ports. Was there a reason the latter was chosen?

deviantony commented 4 years ago

The only difference between 1.5.0 and 1.5.1 is the following bugfix: https://github.com/portainer/agent/issues/95

It only changes how the agent detects its IP address at startup.

The agent now supports ingress ports, but we have not yet officially determined which mode to recommend, so we kept the old agent definition. Switching to ingress ports would only address potential issues between Portainer and the agents, though. In your case, the problem seems to be in the overlay network, since communication between the agents is failing.
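To check whether the agents can reach each other at all, a basic TCP probe against the port that appears in the logs (7946) is one possible starting point. The following is only a minimal sketch: the function name `tcp_port_open` is hypothetical, and it tests TCP reachability only, whereas the memberlist probes in the logs also rely on UDP, so a successful result does not rule out a network problem.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`.

    Hypothetical helper for illustration. It only checks that a TCP
    handshake completes; it says nothing about UDP traffic or latency.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For example, `tcp_port_open("10.0.0.6", 7946)` would probe one of the peers named in the logs. It has to run from a place where those addresses are routable; since the stack file declares the portainer_agent network as attachable, a container attached to that network is one option.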

ping @akomelj I wonder if this is the issue you encountered before working on #102?

akomelj commented 4 years ago

@deviantony No, my symptoms were nothing like the ones described in #95. Agents had no problem discovering their IP addresses - probes between them were failing every few minutes due to a high-latency network and succeeding in between.

deviantony commented 4 years ago

Yeah, I meant symptoms similar to the ones reported in this issue (see logs above).

akomelj commented 4 years ago

@deviantony Huh, I guess you were asking about this issue and not #95. I skimmed through the logs above and yes - this is the exact same kind of behaviour I was observing: failed acks and fallback pings, refuted suspects, etc.
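To make that pattern easier to see than by skimming, the agent log can be tallied per node. The sketch below is an illustration only: the event names and the `summarize` function are hypothetical, and the regular expressions are written against the line formats visible in the logs in this issue, so other agent versions may log differently.

```python
import re
from collections import Counter

# Patterns derived from the log lines in this issue (an assumption, not an
# official format). Each captures the node name, or the peer address for
# fallback ping errors.
EVENT_PATTERNS = {
    "suspect": re.compile(r"memberlist: Suspect (\S+) has failed"),
    "marked_failed": re.compile(r"memberlist: Marking (\S+) as failed"),
    "member_failed": re.compile(r"serf: EventMemberFailed: (\S+)"),
    "member_join": re.compile(r"serf: EventMemberJoin: (\S+)"),
    "member_reap": re.compile(r"serf: EventMemberReap: (\S+)"),
    "fallback_ping_timeout": re.compile(
        r"Failed fallback ping: \w+ tcp \S+->(\S+): i/o timeout"
    ),
}


def summarize(lines):
    """Count (subject, event) pairs, where subject is a node name or peer address."""
    counts = Counter()
    for line in lines:
        for event, pattern in EVENT_PATTERNS.items():
            match = pattern.search(line)
            if match:
                counts[(match.group(1), event)] += 1
                break  # each log line carries at most one event
    return counts


if __name__ == "__main__":
    # A few lines copied from the logs above, used as sample input.
    sample = [
        "2019/12/27 04:26:53 [ERR] memberlist: Failed fallback ping: read tcp 10.0.0.8:48302->10.0.0.6:7946: i/o timeout",
        "2019/12/27 04:26:53 [INFO] memberlist: Suspect omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b has failed, no acks received",
        "2019/12/27 04:26:56 [INFO] serf: EventMemberFailed: omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b 10.0.0.6",
        "2019/12/27 04:26:59 [INFO] serf: EventMemberJoin: omicron-16-cluster-xbnit5ov34pq-node-1.novalocal-64605377c08b 10.0.0.6",
    ]
    for (subject, event), n in sorted(summarize(sample).items()):
        print(f"{n:3d}  {event:22s} {subject}")
```

Passing the lines of a saved log file to `summarize` gives a quick view of which nodes flap most often and how frequently the TCP fallback ping times out, which may help when comparing agent versions such as 1.5.0 and 1.5.1.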