kopax closed this issue 8 years ago
The question is: how does master-01.mydomain.com resolve?
It should point to your LISTEN_IP now.
At the moment you wrote that, it was resolving to $IP. I changed it to $LISTEN_IP in /etc/hosts, but the problem remains.
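For reference, this is roughly how the /etc/hosts mapping can be checked without touching the system resolver. A minimal sketch; the file path and the FQDN/IP values (master-01.domain.com, 172.16.0.1) are taken from this thread and are only assumptions about the setup.

```shell
# resolve_in_hosts FILE FQDN
# Print the first IP in an /etc/hosts-style FILE whose line lists FQDN
# as one of its names; comment lines are skipped.
resolve_in_hosts() {
  awk -v h="$2" '
    $0 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == h) { print $1; exit } }
  ' "$1"
}

# Example (assumed entry): resolve_in_hosts /etc/hosts master-01.domain.com
# should print 172.16.0.1 if the LISTEN_IP entry is in place.
```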
By the way, do you recommend setting HOSTNAME in the docker-compose.yml to $(hostname -f) or $(hostname -s) ?
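For context on that question: hostname -f prints the fully qualified name, while hostname -s prints only the part before the first dot. A sketch of the difference, using the FQDN from this thread as an assumed value:

```shell
# Assumed value from this thread; on a real host this would come
# from `hostname -f`.
fqdn=master-01.domain.com

# `hostname -s` equivalent: keep only the first label of the FQDN.
short=${fqdn%%.*}

echo "$fqdn"   # fully qualified name
echo "$short"  # short name: master-01
```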
Here is the new log (I kept chronos disabled):
root@master-01:~/panteras# ping master-01.domain.com
PING master-01.domain.com (172.16.0.1) 56(84) bytes of data.
64 bytes from master-01.domain.com (172.16.0.1): icmp_seq=1 ttl=64 time=0.048 ms
64 bytes from master-01.domain.com (172.16.0.1): icmp_seq=2 ttl=64 time=0.034 ms
^C
--- master-01.domain.com ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.034/0.041/0.048/0.007 ms
root@master-01:~/panteras# cat docker-compose.yml
panteras:
  image: panteras/paas-in-a-box:latest
  net: host
  privileged: true
  restart: "no"
  ports:
    - "8500:8500"
    - "8080:8080"
    - "5050:5050"
  environment:
    CONSUL_IP: "159.203.300.283"
    HOST_IP: "159.203.300.283"
    LISTEN_IP: "172.16.0.1"
    FQDN: "master-01.domain.com"
    GOMAXPROCS: "4"
    SERVICE_8500_NAME: consul-ui
    SERVICE_8500_TAGS: haproxy
    SERVICE_8500_CHECK_HTTP: /v1/status/leader
    SERVICE_8080_NAME: marathon
    SERVICE_8080_TAGS: haproxy
    SERVICE_8080_CHECK_HTTP: /v2/leader
    SERVICE_5050_NAME: mesos
    SERVICE_5050_TAGS: haproxy
    SERVICE_5050_CHECK_HTTP: /master/health
    SERVICE_4400_NAME: chronos
    SERVICE_4400_TAGS: haproxy
    SERVICE_4400_CHECK_HTTP: /ping
    START_CONSUL: "true"
    START_CONSUL_TEMPLATE: "true"
    START_DNSMASQ: "true"
    START_MESOS_MASTER: "true"
    START_MARATHON: "true"
    START_MESOS_SLAVE: "true"
    START_REGISTRATOR: "true"
    START_ZOOKEEPER: "true"
    START_CHRONOS: "false"
    CONSUL_APP_PARAMS: "agent -client=172.16.0.1 -advertise=159.203.300.283 -bind=172.16.0.1 -data-dir=/opt/consul/ -ui-dir=/opt/consul/ -node=master-01.domain.com -dc=UNKNOWN -domain consul -server -bootstrap-expect 1 "
    CONSUL_DOMAIN: "consul"
    CONSUL_TEMPLATE_APP_PARAMS: "-consul=159.203.300.283:8500 -template haproxy.cfg.ctmpl:/etc/haproxy/haproxy.cfg:/opt/consul-template/haproxy_reload.sh "
    DNSMASQ_APP_PARAMS: "-d -u dnsmasq -r /etc/resolv.conf.orig -7 /etc/dnsmasq.d --server=/consul/159.203.300.283#8600 --host-record=master-01.domain.com,159.203.300.283 --bind-interfaces --listen-address=172.16.0.1 --address=/consul/159.203.300.283 "
    HAPROXY_ADD_DOMAIN: ""
    MARATHON_APP_PARAMS: "--master zk://master-01.domain.com:2181/mesos --zk zk://master-01.domain.com:2181/marathon --hostname master-01.domain.com --no-logger --http_address 172.16.0.1 --https_address 172.16.0.1 "
    MESOS_MASTER_APP_PARAMS: "--zk=zk://master-01.domain.com:2181/mesos --work_dir=/var/lib/mesos --quorum=1 --ip=172.16.0.1 --hostname=master-01.domain.com --cluster=mesoscluster "
    MESOS_SLAVE_APP_PARAMS: "--master=zk://master-01.domain.com:2181/mesos --containerizers=docker,mesos --executor_registration_timeout=5mins --hostname=master-01.domain.com --ip=172.16.0.1 --docker_stop_timeout=5secs --gc_delay=1days --docker_socket=/tmp/docker.sock "
    REGISTRATOR_APP_PARAMS: "-ip=159.203.300.283 consul://159.203.300.283:8500 "
    ZOOKEEPER_APP_PARAMS: "start-foreground"
    ZOOKEEPER_HOSTS: "master-01.domain.com:2181"
    ZOOKEEPER_ID: "0"
    KEEPALIVED_VIP: ""
    CHRONOS_APP_PARAMS: "--master zk://master-01.domain.com:2181/mesos --zk_hosts master-01.domain.com:2181 --http_address 172.16.0.1 --http_port 4400 "
    HOSTNAME: "master-01.domain.com"
  env_file:
    ./restricted/env
  volumes:
    - "/etc/resolv.conf:/etc/resolv.conf.orig"
    - "/var/spool/marathon/artifacts/store:/var/spool/store"
    - "/var/run/docker.sock:/tmp/docker.sock"
    - "/var/lib/docker:/var/lib/docker"
    - "/sys:/sys"
    - "/tmp/mesos:/tmp/mesos"
root@master-01:~/panteras# dc up^C
root@master-01:~/panteras# dc u^C
root@master-01:~/panteras# rm -rf /tmp/mesos/*
root@master-01:~/panteras# dc up -d && dc logs
Creating panteras_panteras_1
Attaching to panteras_panteras_1
panteras_1 | 2016-01-15 12:20:23,699 CRIT Supervisor running as root (no user in config file)
panteras_1 | 2016-01-15 12:20:23,712 INFO RPC interface 'supervisor' initialized
panteras_1 | 2016-01-15 12:20:23,712 CRIT Server 'inet_http_server' running without any HTTP authentication checking
panteras_1 | 2016-01-15 12:20:23,712 INFO RPC interface 'supervisor' initialized
panteras_1 | 2016-01-15 12:20:23,713 CRIT Server 'unix_http_server' running without any HTTP authentication checking
panteras_1 | 2016-01-15 12:20:23,713 INFO supervisord started with pid 1
panteras_1 | 2016-01-15 12:20:24,716 INFO spawned: 'stdout' with pid 8
panteras_1 | 2016-01-15 12:20:24,719 INFO spawned: 'dnsmasq' with pid 9
panteras_1 | 2016-01-15 12:20:24,722 INFO spawned: 'consul' with pid 10
panteras_1 | 2016-01-15 12:20:24,724 INFO spawned: 'zookeeper' with pid 11
panteras_1 | 2016-01-15 12:20:24,727 INFO spawned: 'consul-template_haproxy' with pid 12
panteras_1 | 2016-01-15 12:20:24,730 INFO spawned: 'mesos-master' with pid 13
panteras_1 | 2016-01-15 12:20:24,733 INFO spawned: 'marathon' with pid 17
panteras_1 | 2016-01-15 12:20:24,736 INFO spawned: 'mesos-slave' with pid 19
panteras_1 | 2016-01-15 12:20:24,739 INFO spawned: 'registrator' with pid 21
panteras_1 | 2016-01-15 12:20:25,344 INFO exited: registrator (exit status 1; not expected)
panteras_1 | 2016-01-15 12:20:25,729 INFO success: stdout entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
panteras_1 | 2016-01-15 12:20:25,730 INFO success: dnsmasq entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
panteras_1 | 2016-01-15 12:20:25,730 INFO success: consul entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
panteras_1 | 2016-01-15 12:20:25,730 INFO success: zookeeper entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
panteras_1 | 2016-01-15 12:20:25,730 INFO success: consul-template_haproxy entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
panteras_1 | 2016-01-15 12:20:25,730 INFO success: mesos-master entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
panteras_1 | zookeeper stderr | JMX enabled by default
panteras_1 | 2016-01-15 12:20:25,731 INFO success: marathon entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
panteras_1 | consul-template_haproxy stdout | Currently: none
panteras_1 | consul-template_haproxy stdout | Initially routing to haproxy_a
panteras_1 | zookeeper stderr | Using config: /etc/zookeeper/conf/zoo.cfg
panteras_1 | consul stdout | ==> WARNING: BootstrapExpect Mode is specified as 1; this is the same as Bootstrap mode.
panteras_1 | ==> WARNING: Bootstrap mode enabled! Do not enable unless necessary
panteras_1 | 2016-01-15 12:20:25,735 INFO success: mesos-slave entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
panteras_1 | consul stdout | ==> Starting Consul agent...
panteras_1 | marathon stdout | run_jar
panteras_1 | dnsmasq stderr | dnsmasq: started, version 2.68 cachesize 150
panteras_1 | dnsmasq stderr | dnsmasq: compile time options: IPv6 GNU-getopt DBus i18n IDN DHCP DHCPv6 no-Lua TFTP conntrack ipset auth
panteras_1 | dnsmasq: using nameserver 159.203.300.283#8600 for domain consul
panteras_1 | dnsmasq: reading /etc/resolv.conf.orig
panteras_1 | dnsmasq: using nameserver 213.186.33.99#53
panteras_1 | dnsmasq: using nameserver 127.0.0.1#53
panteras_1 | dnsmasq: using nameserver 159.203.300.283#8600 for domain consul
panteras_1 | dnsmasq: read /etc/hosts - 10 addresses
panteras_1 | consul stdout | ==> Starting Consul agent RPC...
panteras_1 | consul stdout | ==> Consul agent running!
panteras_1 | Node name: 'master-01.domain.com'
panteras_1 | Datacenter: 'unknown'
panteras_1 | Server: true (bootstrap: true)
panteras_1 | Client Addr: 172.16.0.1 (HTTP: 8500, HTTPS: -1, DNS: 8600, RPC: 8400)
panteras_1 | Cluster Addr: 159.203.300.283 (LAN: 8301, WAN: 8302)
panteras_1 | consul stdout | Gossip encrypt: false, RPC-TLS: false, TLS-Incoming: false
panteras_1 | Atlas: <disabled>
panteras_1 |
panteras_1 | ==> Log data will now stream in as it occurs:
panteras_1 |
panteras_1 | 2016/01/15 12:20:25 [INFO] raft: Node at 159.203.300.283:8300 [Follower] entering Follower state
panteras_1 | 2016/01/15 12:20:25 [WARN] memberlist: Binding to public address without encryption!
panteras_1 | 2016/01/15 12:20:25 [INFO] serf: EventMemberJoin: master-01.domain.com 159.203.300.283
panteras_1 | 2016/01/15 12:20:25 [INFO] consul: adding LAN server master-01.domain.com (Addr: 159.203.300.283:8300) (DC: unknown)
panteras_1 | 2016/01/15 12:20:25 [WARN] memberlist: Binding to public address without encryption!
panteras_1 | 2016/01/15 12:20:25 [INFO] serf: EventMemberJoin: master-01.domain.com.unknown 159.203.300.283
panteras_1 | 2016/01/15 12:20:25 [INFO] consul: adding WAN server master-01.domain.com.unknown (Addr: 159.203.300.283:8300) (DC: unknown)
panteras_1 | 2016/01/15 12:20:25 [ERR] agent: failed to sync remote state: No cluster leader
panteras_1 | registrator stderr | 2016/01/15 12:20:25 Starting registrator ...
panteras_1 | 2016/01/15 12:20:25 Forcing host IP to 159.203.300.283
panteras_1 | registrator stderr | 2016/01/15 12:20:25 consul: Get http://159.203.300.283:8500/v1/status/leader: dial tcp 159.203.300.283:8500: connection refused
panteras_1 | consul-template_haproxy stderr | 2016/01/15 12:20:25 [DEBUG] (logging) setting up logging
panteras_1 | consul-template_haproxy stderr | 2016/01/15 12:20:25 [DEBUG] (logging) config:
panteras_1 |
panteras_1 | {
panteras_1 | "name": "consul-template",
panteras_1 | "level": "WARN",
panteras_1 | "syslog": false,
panteras_1 | "syslog_facility": "LOCAL0"
panteras_1 | }
panteras_1 |
panteras_1 | consul-template_haproxy stderr | 2016/01/15 12:20:25 [ERR] (view) "services" catalog services: error fetching: Get http://159.203.300.283:8500/v1/catalog/services?wait=60000ms: dial tcp 159.203.300.283:8500: connection refused
panteras_1 | mesos-master stderr | I0115 12:20:25.365522 13 main.cpp:229] Build: 2015-10-12 20:57:28 by root
panteras_1 | I0115 12:20:25.365653 13 main.cpp:231] Version: 0.25.0
panteras_1 | I0115 12:20:25.365660 13 main.cpp:234] Git tag: 0.25.0
panteras_1 | I0115 12:20:25.365665 13 main.cpp:238] Git SHA: 2dd7f7ee115fe00b8e098b0a10762a4fa8f4600f
panteras_1 | mesos-master stderr | I0115 12:20:25.365728 13 main.cpp:252] Using 'HierarchicalDRF' allocator
panteras_1 | consul-template_haproxy stderr | 2016/01/15 12:20:25 [ERR] (runner) watcher reported error: catalog services: error fetching: Get http://159.203.300.283:8500/v1/catalog/services?wait=60000ms: dial tcp 159.203.300.283:8500: connection refused
panteras_1 | mesos-slave stderr | I0115 12:20:25.366102 19 main.cpp:185] Build: 2015-10-12 20:57:28 by root
panteras_1 | I0115 12:20:25.366231 19 main.cpp:187] Version: 0.25.0
panteras_1 | I0115 12:20:25.366240 19 main.cpp:190] Git tag: 0.25.0
panteras_1 | I0115 12:20:25.366245 19 main.cpp:194] Git SHA: 2dd7f7ee115fe00b8e098b0a10762a4fa8f4600f
panteras_1 | mesos-master stderr | I0115 12:20:25.587860 13 leveldb.cpp:176] Opened db in 221.719666ms
panteras_1 | mesos-master stderr | I0115 12:20:25.629541 13 leveldb.cpp:183] Compacted db in 41.622544ms
panteras_1 | I0115 12:20:25.629609 13 leveldb.cpp:198] Created db iterator in 23148ns
panteras_1 | mesos-master stderr | I0115 12:20:25.629626 13 leveldb.cpp:204] Seeked to beginning of db in 1244ns
panteras_1 | I0115 12:20:25.629637 13 leveldb.cpp:273] Iterated through 0 keys in the db in 2984ns
panteras_1 | I0115 12:20:25.629853 13 replica.cpp:744] Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned
panteras_1 | mesos-master stderr | 2016-01-15 12:20:25,631:13(0x6c24dc8d4700):ZOO_INFO@log_env@712: Client environment:zookeeper.version=zookeeper C client 3.4.5
panteras_1 | 2016-01-15 12:20:25,631:13(0x6c24dc8d4700):ZOO_INFO@log_env@716: Client environment:host.name=master-01.domain.com
panteras_1 | 2016-01-15 12:20:25,631:13(0x6c24dc8d4700):ZOO_INFO@log_env@723: Client environment:os.name=Linux
panteras_1 | 2016-01-15 12:20:25,631:13(0x6c24dc8d4700):ZOO_INFO@log_env@724: Client environment:os.arch=3.14.32-xxxx-grs-ipv6-64
panteras_1 | 2016-01-15 12:20:25,631:13(0x6c24dc8d4700):ZOO_INFO@log_env@725: Client environment:os.version=#5 SMP Wed Sep 9 17:24:34 CEST 2015
panteras_1 | 2016-01-15 12:20:25,631:13(0x6c24dc8d4700):ZOO_INFO@log_env@733: Client environment:user.name=(null)
panteras_1 | mesos-master stderr | 2016-01-15 12:20:25,631:13(0x6c24daeab700):ZOO_INFO@log_env@712: Client environment:zookeeper.version=zookeeper C client 3.4.5
panteras_1 | 2016-01-15 12:20:25,631:13(0x6c24daeab700):ZOO_INFO@log_env@716: Client environment:host.name=master-01.domain.com
panteras_1 | I0115 12:20:25.631790 119 log.cpp:238] Attempting to join replica to ZooKeeper group
panteras_1 | 2016-01-15 12:20:25,631:13(0x6c24daeab700):ZOO_INFO@log_env@723: Client environment:os.name=Linux
panteras_1 | 2016-01-15 12:20:25,631:13(0x6c24daeab700):ZOO_INFO@log_env@724: Client environment:os.arch=3.14.32-xxxx-grs-ipv6-64
panteras_1 | 2016-01-15 12:20:25,631:13(0x6c24daeab700):ZOO_INFO@log_env@725: Client environment:os.version=#5 SMP Wed Sep 9 17:24:34 CEST 2015
panteras_1 | 2016-01-15 12:20:25,631:13(0x6c24daeab700):ZOO_INFO@log_env@733: Client environment:user.name=(null)
panteras_1 | mesos-master stderr | I0115 12:20:25.632570 126 recover.cpp:449] Starting replica recovery
panteras_1 | mesos-master stderr | I0115 12:20:25.632838 126 recover.cpp:475] Replica is in EMPTY status
panteras_1 | 2016-01-15 12:20:25,633:13(0x6c24daeab700):ZOO_INFO@log_env@741: Client environment:user.home=/root
panteras_1 | 2016-01-15 12:20:25,633:13(0x6c24daeab700):ZOO_INFO@log_env@753: Client environment:user.dir=/opt
panteras_1 | 2016-01-15 12:20:25,633:13(0x6c24daeab700):ZOO_INFO@zookeeper_init@786: Initiating client connection, host=master-01.domain.com:2181 sessionTimeout=10000 watcher=0x6c24e74e3600 sessionId=0 sessionPasswd=<null> context=0x6c24b8000ba0 flags=0
panteras_1 | 2016-01-15 12:20:25,633:13(0x6c24dc8d4700):ZOO_INFO@log_env@741: Client environment:user.home=/root
panteras_1 | 2016-01-15 12:20:25,633:13(0x6c24dc8d4700):ZOO_INFO@log_env@753: Client environment:user.dir=/opt
panteras_1 | mesos-master stderr | 2016-01-15 12:20:25,633:13(0x6c24dc8d4700):ZOO_INFO@zookeeper_init@786: Initiating client connection, host=master-01.domain.com:2181 sessionTimeout=10000 watcher=0x6c24e74e3600 sessionId=0 sessionPasswd=<null> context=0x6c24ac000930 flags=0
panteras_1 | 2016-01-15 12:20:25,633:13(0x6c24db79e700):ZOO_INFO@log_env@712: Client environment:zookeeper.version=zookeeper C client 3.4.5
panteras_1 | 2016-01-15 12:20:25,633:13(0x6c24db79e700):ZOO_INFO@log_env@716: Client environment:host.name=master-01.domain.com
panteras_1 | 2016-01-15 12:20:25,633:13(0x6c24d9584700):ZOO_INFO@log_env@712: Client environment:zookeeper.version=zookeeper C client 3.4.5
panteras_1 | I0115 12:20:25.633282 13 main.cpp:465] Starting Mesos master
panteras_1 | 2016-01-15 12:20:25,633:13(0x6c24d9584700):ZOO_INFO@log_env@716: Client environment:host.name=master-01.domain.com
panteras_1 | 2016-01-15 12:20:25,633:13(0x6c24d9584700):ZOO_INFO@log_env@723: Client environment:os.name=Linux
panteras_1 | 2016-01-15 12:20:25,633:13(0x6c24d9584700):ZOO_INFO@log_env@724: Client environment:os.arch=3.14.32-xxxx-grs-ipv6-64
panteras_1 | 2016-01-15 12:20:25,633:13(0x6c24d9584700):ZOO_INFO@log_env@725: Client environment:os.version=#5 SMP Wed Sep 9 17:24:34 CEST 2015
panteras_1 | 2016-01-15 12:20:25,633:13(0x6c24db79e700):ZOO_INFO@log_env@723: Client environment:os.name=Linux
panteras_1 | 2016-01-15 12:20:25,633:13(0x6c24db79e700):ZOO_INFO@log_env@724: Client environment:os.arch=3.14.32-xxxx-grs-ipv6-64
panteras_1 | 2016-01-15 12:20:25,633:13(0x6c24db79e700):ZOO_INFO@log_env@725: Client environment:os.version=#5 SMP Wed Sep 9 17:24:34 CEST 2015
panteras_1 | 2016-01-15 12:20:25,633:13(0x6c24d9584700):ZOO_INFO@log_env@733: Client environment:user.name=(null)
panteras_1 | 2016-01-15 12:20:25,633:13(0x6c24db79e700):ZOO_INFO@log_env@733: Client environment:user.name=(null)
panteras_1 | mesos-master stderr | 2016-01-15 12:20:25,633:13(0x6c24db79e700):ZOO_INFO@log_env@741: Client environment:user.home=/root
panteras_1 | 2016-01-15 12:20:25,633:13(0x6c24db79e700):ZOO_INFO@log_env@753: Client environment:user.dir=/opt
panteras_1 | 2016-01-15 12:20:25,633:13(0x6c24db79e700):ZOO_INFO@zookeeper_init@786: Initiating client connection, host=master-01.domain.com:2181 sessionTimeout=10000 watcher=0x6c24e74e3600 sessionId=0 sessionPasswd=<null> context=0x6c2494000930 flags=0
panteras_1 | 2016-01-15 12:20:25,633:13(0x6c24d9584700):ZOO_INFO@log_env@741: Client environment:user.home=/root
panteras_1 | 2016-01-15 12:20:25,633:13(0x6c24d9584700):ZOO_INFO@log_env@753: Client environment:user.dir=/opt
panteras_1 | 2016-01-15 12:20:25,633:13(0x6c24d9584700):ZOO_INFO@zookeeper_init@786: Initiating client connection, host=master-01.domain.com:2181 sessionTimeout=10000 watcher=0x6c24e74e3600 sessionId=0 sessionPasswd=<null> context=0x6c24c8001730 flags=0
panteras_1 | mesos-master stderr | I0115 12:20:25.634519 129 replica.cpp:641] Replica in EMPTY status received a broadcasted recover request
panteras_1 | mesos-master stderr | 2016-01-15 12:20:25,634:13(0x6c24d598f700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [172.16.0.1:2181] zk retcode=-4, errno=111(Connection refused): server refused to accept the client
panteras_1 | 2016-01-15 12:20:25,634:13(0x6c248bff0700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [172.16.0.1:2181] zk retcode=-4, errno=111(Connection refused): server refused to accept the client
panteras_1 | 2016-01-15 12:20:25,634:13(0x6c24d5093700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [172.16.0.1:2181] zk retcode=-4, errno=111(Connection refused): server refused to accept the client
panteras_1 | mesos-master stderr | 2016-01-15 12:20:25,634:13(0x6c248af21700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [172.16.0.1:2181] zk retcode=-4, errno=111(Connection refused): server refused to accept the client
panteras_1 | mesos-master stderr | I0115 12:20:25.635663 133 recover.cpp:195] Received a recover response from a replica in EMPTY status
panteras_1 | mesos-master stderr | I0115 12:20:25.635978 119 recover.cpp:566] Updating replica status to STARTING
panteras_1 | mesos-master stderr | I0115 12:20:25.637027 134 master.cpp:376] Master 0b51d279-d986-4d08-a6a5-4d5b451b4fb4 (master-01.domain.com) started on 172.16.0.1:5050
panteras_1 | mesos-master stderr | I0115 12:20:25.637061 134 master.cpp:378] Flags at startup: --allocation_interval="1secs" --allocator="HierarchicalDRF" --authenticate="false" --authenticate_slaves="false" --authenticators="crammd5" --authorizers="local" --cluster="mesoscluster" --framework_sorter="drf" --help="false" --hostname="master-01.domain.com" --hostname_lookup="true" --initialize_driver_logging="true" --ip="172.16.0.1" --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" --max_slave_ping_timeouts="5" --port="5050" --quiet="false" --quorum="1" --recovery_slave_removal_limit="100%" --registry="replicated_log" --registry_fetch_timeout="1mins" --registry_store_timeout="5secs" --registry_strict="false" --root_submissions="true" --slave_ping_timeout="15secs" --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" --webui_dir="/usr/share/mesos/webui" --work_dir="/var/lib/mesos" --zk="zk://master-01.domain.com:2181/mesos" --zk_session_timeout="10secs"
panteras_1 | I0115 12:20:25.637405 134 master.cpp:425] Master allowing unauthenticated frameworks to register
panteras_1 | I0115 12:20:25.637415 134 master.cpp:430] Master allowing unauthenticated slaves to register
panteras_1 | I0115 12:20:25.637428 134 master.cpp:467] Using default 'crammd5' authenticator
panteras_1 | mesos-master stderr | W0115 12:20:25.637534 134 authenticator.cpp:505] No credentials provided, authentication requests will be refused
panteras_1 | I0115 12:20:25.637547 134 authenticator.cpp:512] Initializing server SASL
panteras_1 | mesos-master stderr | I0115 12:20:25.684892 127 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 48.755512ms
panteras_1 | I0115 12:20:25.684972 127 replica.cpp:323] Persisted replica status to STARTING
panteras_1 | mesos-master stderr | I0115 12:20:25.685070 127 recover.cpp:475] Replica is in STARTING status
panteras_1 | mesos-master stderr | I0115 12:20:25.685839 131 replica.cpp:641] Replica in STARTING status received a broadcasted recover request
panteras_1 | I0115 12:20:25.686008 133 recover.cpp:195] Received a recover response from a replica in STARTING status
panteras_1 | I0115 12:20:25.686359 125 recover.cpp:566] Updating replica status to VOTING
panteras_1 | mesos-master stderr | I0115 12:20:25.729534 128 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 43.049611ms
panteras_1 | I0115 12:20:25.729583 128 replica.cpp:323] Persisted replica status to VOTING
panteras_1 | mesos-master stderr | I0115 12:20:25.729630 128 recover.cpp:580] Successfully joined the Paxos group
panteras_1 | I0115 12:20:25.729746 128 recover.cpp:464] Recover process terminated
panteras_1 | mesos-master stderr | I0115 12:20:25.738369 134 contender.cpp:149] Joining the ZK group
panteras_1 | mesos-slave stderr | I0115 12:20:25.865885 19 containerizer.cpp:143] Using isolation: posix/cpu,posix/mem,filesystem/posix
panteras_1 | mesos-slave stderr | I0115 12:20:25.880795 19 linux_launcher.cpp:103] Using /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
panteras_1 | mesos-slave stderr | 2016-01-15 12:20:25,885:19(0x7575782ae700):ZOO_INFO@log_env@712: Client environment:zookeeper.version=zookeeper C client 3.4.5
panteras_1 | 2016-01-15 12:20:25,885:19(0x7575782ae700):ZOO_INFO@log_env@716: Client environment:host.name=master-01.domain.com
panteras_1 | 2016-01-15 12:20:25,885:19(0x7575782ae700):ZOO_INFO@log_env@723: Client environment:os.name=Linux
panteras_1 | 2016-01-15 12:20:25,885:19(0x7575782ae700):ZOO_INFO@log_env@724: Client environment:os.arch=3.14.32-xxxx-grs-ipv6-64
panteras_1 | 2016-01-15 12:20:25,885:19(0x7575782ae700):ZOO_INFO@log_env@725: Client environment:os.version=#5 SMP Wed Sep 9 17:24:34 CEST 2015
panteras_1 | I0115 12:20:25.885457 19 main.cpp:272] Starting Mesos slave
panteras_1 | 2016-01-15 12:20:25,885:19(0x7575782ae700):ZOO_INFO@log_env@733: Client environment:user.name=(null)
panteras_1 | mesos-slave stderr | 2016-01-15 12:20:25,885:19(0x7575782ae700):ZOO_INFO@log_env@741: Client environment:user.home=/root
panteras_1 | 2016-01-15 12:20:25,885:19(0x7575782ae700):ZOO_INFO@log_env@753: Client environment:user.dir=/opt
panteras_1 | 2016-01-15 12:20:25,885:19(0x7575782ae700):ZOO_INFO@zookeeper_init@786: Initiating client connection, host=master-01.domain.com:2181 sessionTimeout=10000 watcher=0x757581cd3600 sessionId=0 sessionPasswd=<null> context=0x757540000930 flags=0
panteras_1 | I0115 12:20:25.885851 141 slave.cpp:190] Slave started on 1)@172.16.0.1:5051
panteras_1 | 2016-01-15 12:20:25,885:19(0x757570d45700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [172.16.0.1:2181] zk retcode=-4, errno=111(Connection refused): server refused to accept the client
panteras_1 | I0115 12:20:25.885866 141 slave.cpp:191] Flags at startup: --appc_store_dir="/tmp/mesos/store/appc" --authenticatee="crammd5" --cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_enable_cfs="false" --cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false" --cgroups_root="mesos" --container_disk_watch_interval="15secs" --containerizers="docker,mesos" --default_role="*" --disk_watch_interval="1mins" --docker="docker" --docker_kill_orphans="true" --docker_remove_delay="6hrs" --docker_socket="/tmp/docker.sock" --docker_stop_timeout="5secs" --enforce_container_disk_quota="false" --executor_registration_timeout="5mins" --executor_shutdown_grace_period="5secs" --fetcher_cache_dir="/tmp/mesos/fetch" --fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1days" --gc_disk_headroom="0.1" --hadoop_home="" --help="false" --hostname="master-01.domain.com" --hostname_lookup="true" --image_provisioner_backend="copy" --initialize_driver_logging="true" --ip="172.16.0.1" --isolation="posix/cpu,posix/mem" --launcher_dir="/usr/libexec/mesos" --logbufsecs="0" --logging_level="INFO" --master="zk://master-01.domain.com:2181/mesos" --oversubscribed_resources_interval="15secs" --perf_duration="10secs" --perf_interval="1mins" --port="5051" --qos_correction_interval_min="0ns" --quiet="false" --recover="reconnect" --recovery_timeout="15mins" --registration_backoff_factor="1secs" --resource_monitoring_interval="1secs" --revocable_cpu_low_priority="true" --sandbox_directory="/mnt/mesos/sandbox" --strict="true" --switch_user="true" --systemd_runtime_directory="/run/systemd/system" --version="false" --work_dir="/tmp/mesos"
panteras_1 | mesos-slave stderr | I0115 12:20:25.886320 141 slave.cpp:354] Slave resources: cpus(*):16; mem(*):31104; disk(*):14436; ports(*):[31000-32000]
panteras_1 | I0115 12:20:25.886346 141 slave.cpp:390] Slave hostname: master-01.domain.com
panteras_1 | I0115 12:20:25.886350 141 slave.cpp:395] Slave checkpoint: true
panteras_1 | mesos-slave stderr | I0115 12:20:25.888867 138 state.cpp:54] Recovering state from '/tmp/mesos/meta'
panteras_1 | mesos-slave stderr | I0115 12:20:25.888994 150 status_update_manager.cpp:202] Recovering status update manager
panteras_1 | I0115 12:20:25.889259 138 docker.cpp:535] Recovering Docker containers
panteras_1 | I0115 12:20:25.889353 139 containerizer.cpp:386] Recovering containerizer
panteras_1 | mesos-slave stderr | I0115 12:20:25.892127 137 slave.cpp:4110] Finished recovery
panteras_1 | marathon stdout | [2016-01-15 12:20:26,280] INFO Starting Marathon 0.13.0 with --master zk://master-01.domain.com:2181/mesos --zk zk://master-01.domain.com:2181/marathon --hostname master-01.domain.com --http_address 172.16.0.1 --https_address 172.16.0.1 (mesosphere.marathon.Main$:main)
panteras_1 | 2016-01-15 12:20:26,671 INFO spawned: 'registrator' with pid 215
panteras_1 | consul stdout | 2016/01/15 12:20:26 [WARN] raft: Heartbeat timeout reached, starting election
panteras_1 | 2016/01/15 12:20:26 [INFO] raft: Node at 159.203.300.283:8300 [Candidate] entering Candidate state
panteras_1 | registrator stderr | 2016/01/15 12:20:26 Starting registrator ...
panteras_1 | 2016/01/15 12:20:26 Forcing host IP to 159.203.300.283
panteras_1 | registrator stderr | 2016/01/15 12:20:26 consul: Get http://159.203.300.283:8500/v1/status/leader: dial tcp 159.203.300.283:8500: connection refused
panteras_1 | 2016-01-15 12:20:26,689 INFO exited: registrator (exit status 1; not expected)
panteras_1 | consul stdout | 2016/01/15 12:20:26 [INFO] raft: Election won. Tally: 1
panteras_1 | 2016/01/15 12:20:26 [INFO] raft: Node at 159.203.300.283:8300 [Leader] entering Leader state
panteras_1 | consul stdout | 2016/01/15 12:20:26 [INFO] consul: cluster leadership acquired
panteras_1 | 2016/01/15 12:20:26 [INFO] consul: New leader elected: master-01.domain.com
panteras_1 | consul stdout | 2016/01/15 12:20:27 [INFO] raft: Disabling EnableSingleNode (bootstrap)
panteras_1 | consul stdout | 2016/01/15 12:20:27 [INFO] consul: member 'master-01.domain.com' joined, marking health alive
panteras_1 | marathon stdout | [2016-01-15 12:20:28,213] INFO Connecting to Zookeeper... (mesosphere.marathon.Main$:main)
panteras_1 | marathon stdout | [2016-01-15 12:20:28,223] INFO Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT (org.apache.zookeeper.ZooKeeper:main)
panteras_1 | [2016-01-15 12:20:28,223] INFO Client environment:host.name=master-01.domain.com (org.apache.zookeeper.ZooKeeper:main)
panteras_1 | [2016-01-15 12:20:28,223] INFO Client environment:java.version=1.8.0_66 (org.apache.zookeeper.ZooKeeper:main)
panteras_1 | marathon stdout | [2016-01-15 12:20:28,223] INFO Client environment:java.vendor=Oracle Corporation (org.apache.zookeeper.ZooKeeper:main)
panteras_1 | [2016-01-15 12:20:28,223] INFO Client environment:java.home=/usr/lib/jvm/java-8-oracle/jre (org.apache.zookeeper.ZooKeeper:main)
panteras_1 | [2016-01-15 12:20:28,223] INFO Client environment:java.class.path=/usr/bin/marathon (org.apache.zookeeper.ZooKeeper:main)
panteras_1 | [2016-01-15 12:20:28,223] INFO Client environment:java.library.path=/usr/local/lib:/usr/lib:/usr/lib64 (org.apache.zookeeper.ZooKeeper:main)
panteras_1 | [2016-01-15 12:20:28,223] INFO Client environment:java.io.tmpdir=/tmp (org.apache.zookeeper.ZooKeeper:main)
panteras_1 | [2016-01-15 12:20:28,223] INFO Client environment:java.compiler=<NA> (org.apache.zookeeper.ZooKeeper:main)
panteras_1 | [2016-01-15 12:20:28,223] INFO Client environment:os.name=Linux (org.apache.zookeeper.ZooKeeper:main)
panteras_1 | [2016-01-15 12:20:28,223] INFO Client environment:os.arch=amd64 (org.apache.zookeeper.ZooKeeper:main)
panteras_1 | [2016-01-15 12:20:28,223] INFO Client environment:os.version=3.14.32-xxxx-grs-ipv6-64 (org.apache.zookeeper.ZooKeeper:main)
panteras_1 | [2016-01-15 12:20:28,223] INFO Client environment:user.name=root (org.apache.zookeeper.ZooKeeper:main)
panteras_1 | [2016-01-15 12:20:28,223] INFO Client environment:user.home=/root (org.apache.zookeeper.ZooKeeper:main)
panteras_1 | [2016-01-15 12:20:28,223] INFO Client environment:user.dir=/opt (org.apache.zookeeper.ZooKeeper:main)
panteras_1 | marathon stdout | [2016-01-15 12:20:28,224] INFO Initiating client connection, connectString=master-01.domain.com:2181 sessionTimeout=10000 watcher=com.twitter.common.zookeeper.ZooKeeperClient$3@476aac9 (org.apache.zookeeper.ZooKeeper:main)
panteras_1 | marathon stdout | [2016-01-15 12:20:28,245] INFO Opening socket connection to server master-01.domain.com/172.16.0.1:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn:main-SendThread(master-01.domain.com:2181))
panteras_1 | marathon stdout | [2016-01-15 12:20:28,336] INFO Socket connection established to master-01.domain.com/172.16.0.1:2181, initiating session (org.apache.zookeeper.ClientCnxn:main-SendThread(master-01.domain.com:2181))
panteras_1 | marathon stdout | [2016-01-15 12:20:28,612] INFO Session establishment complete on server master-01.domain.com/172.16.0.1:2181, sessionid = 0x152453ac77c0000, negotiated timeout = 10000 (org.apache.zookeeper.ClientCnxn:main-SendThread(master-01.domain.com:2181))
panteras_1 | 2016-01-15 12:20:28,759 INFO spawned: 'registrator' with pid 224
panteras_1 | consul stdout | 2016/01/15 12:20:28 [INFO] agent: Synced service 'consul'
panteras_1 | registrator stderr | 2016/01/15 12:20:28 Starting registrator ...
panteras_1 | 2016/01/15 12:20:28 Forcing host IP to 159.203.300.283
panteras_1 | 2016-01-15 12:20:28,776 INFO exited: registrator (exit status 1; not expected)
panteras_1 | registrator stderr | 2016/01/15 12:20:28 consul: Get http://159.203.300.283:8500/v1/status/leader: dial tcp 159.203.300.283:8500: connection refused
panteras_1 | mesos-master stderr | 2016-01-15 12:20:28,969:13(0x6c248bff0700):ZOO_INFO@check_events@1703: initiated connection to server [172.16.0.1:2181]
panteras_1 | 2016-01-15 12:20:28,969:13(0x6c248af21700):ZOO_INFO@check_events@1703: initiated connection to server [172.16.0.1:2181]
panteras_1 | mesos-master stderr | 2016-01-15 12:20:28,971:13(0x6c24d598f700):ZOO_INFO@check_events@1703: initiated connection to server [172.16.0.1:2181]
panteras_1 | mesos-master stderr | 2016-01-15 12:20:28,971:13(0x6c24d5093700):ZOO_INFO@check_events@1703: initiated connection to server [172.16.0.1:2181]
panteras_1 | mesos-master stderr | 2016-01-15 12:20:29,032:13(0x6c248bff0700):ZOO_INFO@check_events@1750: session establishment complete on server [172.16.0.1:2181], sessionId=0x152453ac77c0001, negotiated timeout=10000
panteras_1 | mesos-master stderr | I0115 12:20:29.032919 122 group.cpp:331] Group process (group(3)@172.16.0.1:5050) connected to ZooKeeper
panteras_1 | I0115 12:20:29.032955 122 group.cpp:805] Syncing group operations: queue size (joins, cancels, datas) = (1, 0, 0)
panteras_1 | I0115 12:20:29.032968 122 group.cpp:403] Trying to create path '/mesos' in ZooKeeper
panteras_1 | mesos-master stderr | 2016-01-15 12:20:29,091:13(0x6c248af21700):ZOO_INFO@check_events@1750: session establishment complete on server [172.16.0.1:2181], sessionId=0x152453ac77c0002, negotiated timeout=10000
panteras_1 | mesos-master stderr | I0115 12:20:29.091639 126 group.cpp:331] Group process (group(4)@172.16.0.1:5050) connected to ZooKeeper
panteras_1 | I0115 12:20:29.091676 126 group.cpp:805] Syncing group operations: queue size (joins, cancels, datas) = (0, 0, 0)
panteras_1 | I0115 12:20:29.091687 126 group.cpp:403] Trying to create path '/mesos' in ZooKeeper
panteras_1 | 2016-01-15 12:20:29,091:13(0x6c24d598f700):ZOO_INFO@check_events@1750: session establishment complete on server [172.16.0.1:2181], sessionId=0x152453ac77c0003, negotiated timeout=10000
panteras_1 | I0115 12:20:29.092103 125 group.cpp:331] Group process (group(1)@172.16.0.1:5050) connected to ZooKeeper
panteras_1 | mesos-master stderr | I0115 12:20:29.092144 125 group.cpp:805] Syncing group operations: queue size (joins, cancels, datas) = (0, 0, 0)
panteras_1 | I0115 12:20:29.092162 125 group.cpp:403] Trying to create path '/mesos/log_replicas' in ZooKeeper
panteras_1 | 2016-01-15 12:20:29,092:13(0x6c24d5093700):ZOO_INFO@check_events@1750: session establishment complete on server [172.16.0.1:2181], sessionId=0x152453ac77c0004, negotiated timeout=10000
panteras_1 | I0115 12:20:29.092689 132 group.cpp:331] Group process (group(2)@172.16.0.1:5050) connected to ZooKeeper
panteras_1 | I0115 12:20:29.092718 132 group.cpp:805] Syncing group operations: queue size (joins, cancels, datas) = (1, 0, 0)
panteras_1 | I0115 12:20:29.092731 132 group.cpp:403] Trying to create path '/mesos/log_replicas' in ZooKeeper
panteras_1 | mesos-slave stderr | 2016-01-15 12:20:29,222:19(0x757570d45700):ZOO_INFO@check_events@1703: initiated connection to server [172.16.0.1:2181]
panteras_1 | mesos-master stderr | I0115 12:20:29.298460 131 contender.cpp:265] New candidate (id='0') has entered the contest for leadership
panteras_1 | mesos-slave stderr | 2016-01-15 12:20:29,298:19(0x757570d45700):ZOO_INFO@check_events@1750: session establishment complete on server [172.16.0.1:2181], sessionId=0x152453ac77c0005, negotiated timeout=10000
panteras_1 | I0115 12:20:29.298987 149 group.cpp:331] Group process (group(1)@172.16.0.1:5051) connected to ZooKeeper
panteras_1 | I0115 12:20:29.299022 149 group.cpp:805] Syncing group operations: queue size (joins, cancels, datas) = (0, 0, 0)
panteras_1 | I0115 12:20:29.299033 149 group.cpp:403] Trying to create path '/mesos' in ZooKeeper
panteras_1 | mesos-master stderr | I0115 12:20:29.347391 127 detector.cpp:156] Detected a new leader: (id='0')
panteras_1 | mesos-master stderr | I0115 12:20:29.347537 134 group.cpp:674] Trying to get '/mesos/json.info_0000000000' in ZooKeeper
panteras_1 | mesos-master stderr | I0115 12:20:29.397315 125 network.hpp:415] ZooKeeper group memberships changed
panteras_1 | mesos-master stderr | I0115 12:20:29.397406 127 group.cpp:674] Trying to get '/mesos/log_replicas/0000000000' in ZooKeeper
panteras_1 | mesos-master stderr | I0115 12:20:29.399984 120 detector.cpp:481] A new leading master (UPID=master@172.16.0.1:5050) is detected
panteras_1 | mesos-master stderr | I0115 12:20:29.400113 126 master.cpp:1603] The newly elected leader is master@172.16.0.1:5050 with id 0b51d279-d986-4d08-a6a5-4d5b451b4fb4
panteras_1 | I0115 12:20:29.400156 126 master.cpp:1616] Elected as the leading master!
panteras_1 | I0115 12:20:29.400177 126 master.cpp:1376] Recovering from registrar
panteras_1 | I0115 12:20:29.400388 120 registrar.cpp:309] Recovering registrar
panteras_1 | I0115 12:20:29.400454 130 network.hpp:463] ZooKeeper group PIDs: { log-replica(1)@172.16.0.1:5050 }
panteras_1 | I0115 12:20:29.400758 129 log.cpp:661] Attempting to start the writer
panteras_1 | mesos-master stderr | I0115 12:20:29.401873 128 replica.cpp:477] Replica received implicit promise request with proposal 1
panteras_1 | mesos-slave stderr | I0115 12:20:29.401967 140 detector.cpp:156] Detected a new leader: (id='0')
panteras_1 | I0115 12:20:29.402137 144 group.cpp:674] Trying to get '/mesos/json.info_0000000000' in ZooKeeper
panteras_1 | mesos-slave stderr | I0115 12:20:29.403504 143 detector.cpp:481] A new leading master (UPID=master@172.16.0.1:5050) is detected
panteras_1 | mesos-slave stderr | I0115 12:20:29.403707 149 slave.cpp:705] New master detected at master@172.16.0.1:5050
panteras_1 | I0115 12:20:29.403723 136 status_update_manager.cpp:176] Pausing sending status updates
panteras_1 | mesos-slave stderr | I0115 12:20:29.404088 149 slave.cpp:730] No credentials provided. Attempting to register without authentication
panteras_1 | mesos-slave stderr | I0115 12:20:29.424443 149 slave.cpp:741] Detecting new master
panteras_1 | mesos-master stderr | I0115 12:20:29.446482 128 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 44.567609ms
panteras_1 | I0115 12:20:29.446524 128 replica.cpp:345] Persisted promised to 1
panteras_1 | mesos-master stderr | I0115 12:20:29.446919 124 coordinator.cpp:231] Coordinator attemping to fill missing position
panteras_1 | mesos-master stderr | I0115 12:20:29.447505 121 replica.cpp:378] Replica received explicit promise request for position 0 with proposal 2
panteras_1 | mesos-master stderr | I0115 12:20:29.497473 121 leveldb.cpp:343] Persisting action (8 bytes) to leveldb took 49.934097ms
panteras_1 | I0115 12:20:29.497521 121 replica.cpp:679] Persisted action at 0
panteras_1 | mesos-master stderr | I0115 12:20:29.498147 133 replica.cpp:511] Replica received write request for position 0
panteras_1 | I0115 12:20:29.498189 133 leveldb.cpp:438] Reading position from leveldb took 24769ns
panteras_1 | mesos-master stderr | I0115 12:20:29.547634 133 leveldb.cpp:343] Persisting action (14 bytes) to leveldb took 49.415056ms
panteras_1 | I0115 12:20:29.547679 133 replica.cpp:679] Persisted action at 0
panteras_1 | mesos-master stderr | I0115 12:20:29.548081 127 replica.cpp:658] Replica received learned notice for position 0
panteras_1 | mesos-master stderr | I0115 12:20:29.597756 127 leveldb.cpp:343] Persisting action (16 bytes) to leveldb took 49.64121ms
panteras_1 | I0115 12:20:29.597803 127 replica.cpp:679] Persisted action at 0
panteras_1 | I0115 12:20:29.597813 127 replica.cpp:664] Replica learned NOP action at position 0
panteras_1 | mesos-master stderr | I0115 12:20:29.598234 121 log.cpp:677] Writer started with ending position 0
panteras_1 | mesos-master stderr | I0115 12:20:29.598654 131 leveldb.cpp:438] Reading position from leveldb took 18662ns
panteras_1 | mesos-master stderr | I0115 12:20:29.600852 130 registrar.cpp:342] Successfully fetched the registry (0B) in 200.431104ms
panteras_1 | mesos-master stderr | I0115 12:20:29.600916 130 registrar.cpp:441] Applied 1 operations in 9141ns; attempting to update the 'registry'
panteras_1 | mesos-master stderr | I0115 12:20:29.602462 125 log.cpp:685] Attempting to append 192 bytes to the log
panteras_1 | mesos-master stderr | I0115 12:20:29.602515 121 coordinator.cpp:341] Coordinator attempting to write APPEND action at position 1
panteras_1 | I0115 12:20:29.602768 122 replica.cpp:511] Replica received write request for position 1
panteras_1 | mesos-master stderr | I0115 12:20:29.672976 122 leveldb.cpp:343] Persisting action (211 bytes) to leveldb took 70.176267ms
panteras_1 | I0115 12:20:29.673027 122 replica.cpp:679] Persisted action at 1
panteras_1 | mesos-master stderr | I0115 12:20:29.673508 130 replica.cpp:658] Replica received learned notice for position 1
panteras_1 | mesos-master stderr | I0115 12:20:29.723078 130 leveldb.cpp:343] Persisting action (213 bytes) to leveldb took 49.539067ms
panteras_1 | I0115 12:20:29.723145 130 replica.cpp:679] Persisted action at 1
panteras_1 | I0115 12:20:29.723160 130 replica.cpp:664] Replica learned APPEND action at position 1
panteras_1 | mesos-master stderr | I0115 12:20:29.723670 119 registrar.cpp:486] Successfully updated the 'registry' in 122720us
panteras_1 | mesos-master stderr | I0115 12:20:29.723737 119 registrar.cpp:372] Successfully recovered registrar
panteras_1 | I0115 12:20:29.723758 134 log.cpp:704] Attempting to truncate the log to 1
panteras_1 | I0115 12:20:29.723808 131 coordinator.cpp:341] Coordinator attempting to write TRUNCATE action at position 2
panteras_1 | I0115 12:20:29.723886 123 master.cpp:1413] Recovered 0 slaves from the Registry (153B) ; allowing 10mins for slaves to re-register
panteras_1 | mesos-master stderr | I0115 12:20:29.724287 121 replica.cpp:511] Replica received write request for position 2
panteras_1 | mesos-master stderr | I0115 12:20:29.773228 121 leveldb.cpp:343] Persisting action (16 bytes) to leveldb took 48.894283ms
panteras_1 | I0115 12:20:29.773278 121 replica.cpp:679] Persisted action at 2
panteras_1 | mesos-master stderr | I0115 12:20:29.773656 129 replica.cpp:658] Replica received learned notice for position 2
panteras_1 | mesos-master stderr | I0115 12:20:29.823321 129 leveldb.cpp:343] Persisting action (18 bytes) to leveldb took 49.63919ms
panteras_1 | I0115 12:20:29.823385 129 leveldb.cpp:401] Deleting ~1 keys from leveldb took 25735ns
panteras_1 | I0115 12:20:29.823400 129 replica.cpp:679] Persisted action at 2
panteras_1 | mesos-master stderr | I0115 12:20:29.823410 129 replica.cpp:664] Replica learned TRUNCATE action at position 2
panteras_1 | marathon stdout | [2016-01-15 12:20:29,824] WARN Method [public javax.ws.rs.core.Response mesosphere.marathon.api.MarathonExceptionMapper.toResponse(java.lang.Throwable)] is synthetic and is being intercepted by [mesosphere.marathon.DebugModule$MetricsBehavior@551de37d]. This could indicate a bug. The method may be intercepted twice, or may not be intercepted at all. (com.google.inject.internal.ProxyFactory:main)
panteras_1 | marathon stdout | [2016-01-15 12:20:30,061] INFO Logging initialized @4721ms (org.eclipse.jetty.util.log:main)
panteras_1 | consul-template_haproxy stderr | 2016/01/15 12:20:30 [ERR] (view) "services" catalog services: error fetching: Get http://159.203.300.283:8500/v1/catalog/services?wait=60000ms: dial tcp 159.203.300.283:8500: connection refused
panteras_1 | 2016/01/15 12:20:30 [ERR] (runner) watcher reported error: catalog services: error fetching: Get http://159.203.300.283:8500/v1/catalog/services?wait=60000ms: dial tcp 159.203.300.283:8500: connection refused
panteras_1 | mesos-master stderr | I0115 12:20:30.386616 122 master.cpp:3862] Registering slave at slave(1)@172.16.0.1:5051 (master-01.domain.com) with id 0b51d279-d986-4d08-a6a5-4d5b451b4fb4-S0
panteras_1 | mesos-master stderr | I0115 12:20:30.386785 132 registrar.cpp:441] Applied 1 operations in 27383ns; attempting to update the 'registry'
panteras_1 | mesos-master stderr | I0115 12:20:30.388008 126 log.cpp:685] Attempting to append 372 bytes to the log
panteras_1 | mesos-master stderr | I0115 12:20:30.388084 130 coordinator.cpp:341] Coordinator attempting to write APPEND action at position 3
panteras_1 | I0115 12:20:30.388363 130 replica.cpp:511] Replica received write request for position 3
panteras_1 | mesos-master stderr | I0115 12:20:30.440152 130 leveldb.cpp:343] Persisting action (391 bytes) to leveldb took 51.752769ms
panteras_1 | I0115 12:20:30.440217 130 replica.cpp:679] Persisted action at 3
panteras_1 | mesos-master stderr | I0115 12:20:30.440564 122 replica.cpp:658] Replica received learned notice for position 3
panteras_1 | mesos-master stderr | I0115 12:20:30.490267 122 leveldb.cpp:343] Persisting action (393 bytes) to leveldb took 49.672529ms
panteras_1 | I0115 12:20:30.490317 122 replica.cpp:679] Persisted action at 3
panteras_1 | I0115 12:20:30.490327 122 replica.cpp:664] Replica learned APPEND action at position 3
panteras_1 | mesos-master stderr | I0115 12:20:30.490761 120 registrar.cpp:486] Successfully updated the 'registry' in 103.934976ms
panteras_1 | I0115 12:20:30.490959 132 log.cpp:704] Attempting to truncate the log to 3
panteras_1 | I0115 12:20:30.491013 121 coordinator.cpp:341] Coordinator attempting to write TRUNCATE action at position 4
panteras_1 | I0115 12:20:30.491272 122 master.cpp:3930] Registered slave 0b51d279-d986-4d08-a6a5-4d5b451b4fb4-S0 at slave(1)@172.16.0.1:5051 (master-01.domain.com) with cpus(*):16; mem(*):31104; disk(*):14436; ports(*):[31000-32000]
panteras_1 | I0115 12:20:30.491351 122 replica.cpp:511] Replica received write request for position 4
panteras_1 | mesos-master stderr | I0115 12:20:30.491384 133 hierarchical.hpp:675] Added slave 0b51d279-d986-4d08-a6a5-4d5b451b4fb4-S0 (master-01.domain.com) with cpus(*):16; mem(*):31104; disk(*):14436; ports(*):[31000-32000] (allocated: )
panteras_1 | I0115 12:20:30.492272 127 master.cpp:4272] Received update of slave 0b51d279-d986-4d08-a6a5-4d5b451b4fb4-S0 at slave(1)@172.16.0.1:5051 (master-01.domain.com) with total oversubscribed resources
panteras_1 | mesos-slave stderr | I0115 12:20:30.491698 141 slave.cpp:880] Registered with master master@172.16.0.1:5050; given slave ID 0b51d279-d986-4d08-a6a5-4d5b451b4fb4-S0
panteras_1 | I0115 12:20:30.491788 147 status_update_manager.cpp:183] Resuming sending status updates
panteras_1 | I0115 12:20:30.492040 141 slave.cpp:939] Forwarding total oversubscribed resources
panteras_1 | mesos-master stderr | I0115 12:20:30.492389 119 hierarchical.hpp:735] Slave 0b51d279-d986-4d08-a6a5-4d5b451b4fb4-S0 (master-01.domain.com) updated with oversubscribed resources (total: cpus(*):16; mem(*):31104; disk(*):14436; ports(*):[31000-32000], allocated: )
panteras_1 | mesos-master stderr | I0115 12:20:30.549007 122 leveldb.cpp:343] Persisting action (16 bytes) to leveldb took 57.628103ms
panteras_1 | I0115 12:20:30.549057 122 replica.cpp:679] Persisted action at 4
panteras_1 | mesos-master stderr | I0115 12:20:30.549438 119 replica.cpp:658] Replica received learned notice for position 4
panteras_1 | mesos-master stderr | I0115 12:20:30.593732 119 leveldb.cpp:343] Persisting action (18 bytes) to leveldb took 44.253118ms
panteras_1 | I0115 12:20:30.593806 119 leveldb.cpp:401] Deleting ~2 keys from leveldb took 34380ns
panteras_1 | I0115 12:20:30.593816 119 replica.cpp:679] Persisted action at 4
panteras_1 | I0115 12:20:30.593823 119 replica.cpp:664] Replica learned TRUNCATE action at position 4
panteras_1 | consul stdout | ==> Newer Consul version available: 0.6.2
panteras_1 | marathon stdout | [2016-01-15 12:20:30,835] INFO Slf4jLogger started (akka.event.slf4j.Slf4jLogger:marathon-akka.actor.default-dispatcher-4)
panteras_1 | marathon stdout | [2016-01-15 12:20:30,975] INFO Registering in Zookeeper with hostPort:master-01.domain.com:8080 (mesosphere.marathon.MarathonModule:main)
panteras_1 | marathon stdout | [2016-01-15 12:20:31,028] INFO Calling reviveOffers is enabled. Use --disable_revive_offers_for_new_apps to disable. (mesosphere.marathon.core.flow.FlowModule:main)
panteras_1 | marathon stdout | [2016-01-15 12:20:31,143] INFO Started status update processor with steps:
panteras_1 | * continueOnError(notifyHealthCheckManager})
panteras_1 | * continueOnError(notifyRateLimiter})
panteras_1 | * updateTaskTracker
panteras_1 | * continueOnError(notifyLaunchQueue})
panteras_1 | * continueOnError(emitUpdate})
panteras_1 | * continueOnError(postTaskStatusEvent})
panteras_1 | * continueOnError(scaleApp})
panteras_1 | * acknowledgeTaskUpdate (mesosphere.marathon.core.task.tracker.impl.TaskStatusUpdateProcessorImpl$$EnhancerByGuice$$67f79f3d:main)
panteras_1 | marathon stdout | [2016-01-15 12:20:31,152] INFO All actors suspended:
panteras_1 | * Actor[akka://marathon/user/killOverdueStagedTasks#341837145]
panteras_1 | * Actor[akka://marathon/user/offerMatcherManager#635592963]
panteras_1 | * Actor[akka://marathon/user/rateLimiter#307421446]
panteras_1 | * Actor[akka://marathon/user/reviveOffersWhenWanted#377753021]
panteras_1 | * Actor[akka://marathon/user/offerMatcherLaunchTokens#296233689]
panteras_1 | * Actor[akka://marathon/user/launchQueue#141697808] (mesosphere.marathon.core.leadership.impl.LeadershipCoordinatorActor:marathon-akka.actor.default-dispatcher-5)
panteras_1 | marathon stdout | [2016-01-15 12:20:31,293] INFO Now standing by. Closing existing handles and rejecting new. (mesosphere.marathon.event.http.HttpEventStreamActor:marathon-akka.actor.default-dispatcher-4)
panteras_1 | marathon stdout | [2016-01-15 12:20:31,344] INFO Adding HTTP support. (mesosphere.chaos.http.HttpModule:main)
panteras_1 | marathon stdout | [2016-01-15 12:20:31,345] INFO No HTTPS support configured. (mesosphere.chaos.http.HttpModule:main)
panteras_1 | marathon stdout | [2016-01-15 12:20:31,385] INFO jetty-9.3.z-SNAPSHOT (org.eclipse.jetty.server.Server:HttpService$$EnhancerByGuice$$6f76f5a6 STARTING)
panteras_1 | marathon stdout | [2016-01-15 12:20:31,385] INFO Starting up (mesosphere.marathon.MarathonSchedulerService$$EnhancerByGuice$$ca9226be:MarathonSchedulerService$$EnhancerByGuice$$ca9226be)
panteras_1 | marathon stdout | [2016-01-15 12:20:31,389] INFO Beginning run (mesosphere.marathon.MarathonSchedulerService$$EnhancerByGuice$$ca9226be:MarathonSchedulerService$$EnhancerByGuice$$ca9226be)
panteras_1 | marathon stdout | [2016-01-15 12:20:31,390] INFO Will offer leadership after 500 milliseconds backoff (mesosphere.marathon.MarathonSchedulerService$$EnhancerByGuice$$ca9226be:MarathonSchedulerService$$EnhancerByGuice$$ca9226be)
panteras_1 | marathon stdout | [2016-01-15 12:20:31,635] INFO Registering com.codahale.metrics.jersey.InstrumentedResourceMethodDispatchAdapter as a provider class (com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory:HttpService$$EnhancerByGuice$$6f76f5a6 STARTING)
panteras_1 | marathon stdout | [2016-01-15 12:20:31,639] INFO Registering mesosphere.marathon.api.MarathonExceptionMapper as a provider class (com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory:HttpService$$EnhancerByGuice$$6f76f5a6 STARTING)
panteras_1 | marathon stdout | [2016-01-15 12:20:31,640] INFO Registering mesosphere.marathon.api.v2.AppsResource as a root resource class (com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory:HttpService$$EnhancerByGuice$$6f76f5a6 STARTING)
panteras_1 | marathon stdout | [2016-01-15 12:20:31,640] INFO Registering mesosphere.marathon.api.v2.TasksResource as a root resource class (com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory:HttpService$$EnhancerByGuice$$6f76f5a6 STARTING)
panteras_1 | [2016-01-15 12:20:31,640] INFO Registering mesosphere.marathon.api.v2.EventSubscriptionsResource as a root resource class (com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory:HttpService$$EnhancerByGuice$$6f76f5a6 STARTING)
panteras_1 | [2016-01-15 12:20:31,640] INFO Registering mesosphere.marathon.api.v2.QueueResource as a root resource class (com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory:HttpService$$EnhancerByGuice$$6f76f5a6 STARTING)
panteras_1 | [2016-01-15 12:20:31,640] INFO Registering mesosphere.marathon.api.v2.GroupsResource as a root resource class (com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory:HttpService$$EnhancerByGuice$$6f76f5a6 STARTING)
panteras_1 | [2016-01-15 12:20:31,640] INFO Registering mesosphere.marathon.api.v2.InfoResource as a root resource class (com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory:HttpService$$EnhancerByGuice$$6f76f5a6 STARTING)
panteras_1 | marathon stdout | [2016-01-15 12:20:31,641] INFO Registering mesosphere.marathon.api.v2.LeaderResource as a root resource class (com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory:HttpService$$EnhancerByGuice$$6f76f5a6 STARTING)
panteras_1 | [2016-01-15 12:20:31,641] INFO Registering mesosphere.marathon.api.v2.DeploymentsResource as a root resource class (com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory:HttpService$$EnhancerByGuice$$6f76f5a6 STARTING)
panteras_1 | [2016-01-15 12:20:31,641] INFO Registering mesosphere.marathon.api.v2.ArtifactsResource as a root resource class (com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory:HttpService$$EnhancerByGuice$$6f76f5a6 STARTING)
panteras_1 | [2016-01-15 12:20:31,641] INFO Registering mesosphere.marathon.api.v2.SchemaResource as a root resource class (com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory:HttpService$$EnhancerByGuice$$6f76f5a6 STARTING)
panteras_1 | marathon stdout | [2016-01-15 12:20:31,644] INFO Initiating Jersey application, version 'Jersey: 1.18.1 02/19/2014 03:28 AM' (com.sun.jersey.server.impl.application.WebApplicationImpl:HttpService$$EnhancerByGuice$$6f76f5a6 STARTING)
panteras_1 | marathon stdout | [2016-01-15 12:20:31,698] INFO Binding com.codahale.metrics.jersey.InstrumentedResourceMethodDispatchAdapter to GuiceManagedComponentProvider with the scope "Singleton" (com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory:HttpService$$EnhancerByGuice$$6f76f5a6 STARTING)
panteras_1 | marathon stdout | [2016-01-15 12:20:31,716] INFO Binding mesosphere.marathon.api.MarathonExceptionMapper to GuiceManagedComponentProvider with the scope "Singleton" (com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory:HttpService$$EnhancerByGuice$$6f76f5a6 STARTING)
panteras_1 | 2016-01-15 12:20:31,913 INFO spawned: 'registrator' with pid 253
panteras_1 | marathon stdout | [2016-01-15 12:20:31,910] INFO Using HA and therefore offering leadership (mesosphere.marathon.MarathonSchedulerService$$EnhancerByGuice$$ca9226be:pool-1-thread-2)
panteras_1 | registrator stderr | 2016/01/15 12:20:31 Starting registrator ...
panteras_1 | 2016/01/15 12:20:31 Forcing host IP to 159.203.300.283
panteras_1 | registrator stderr | 2016/01/15 12:20:31 consul: Get http://159.203.300.283:8500/v1/status/leader: dial tcp 159.203.300.283:8500: connection refused
panteras_1 | 2016-01-15 12:20:31,928 INFO exited: registrator (exit status 1; not expected)
I don't see Zookeeper errors anymore.
Please check whether all services are still working:
supervisorctl status
Deploy the example service and check that you can reach it:
curl -H 'Host: python.service.consul' http://<LISTEN_IP>
root@master-01:~/panteras# docker exec -ti panteras_panteras_1 bash
root@master-01:/opt# supervisorctl status
chronos STOPPED Not started
consul RUNNING pid 10, uptime 0:57:33
consul-template_haproxy RUNNING pid 12, uptime 0:57:33
dnsmasq RUNNING pid 9, uptime 0:57:33
marathon RUNNING pid 17, uptime 0:57:33
mesos-master RUNNING pid 13, uptime 0:57:33
mesos-slave RUNNING pid 19, uptime 0:57:33
registrator BACKOFF Exited too quickly (process log may have details)
stdout RUNNING pid 8, uptime 0:57:33
zookeeper RUNNING pid 11, uptime 0:57:33
root@master-01:/opt# curl -H 'Host: marathon.service.consul' http://172.16.0.1
curl: (7) Failed to connect to 172.16.0.1 port 80: Connection refused
root@master-01:/opt# cat /etc/resolv.conf
# nameserver 213.186.33.99
# search ovh.net
search domain.com consul
nameserver 172.16.0.1
nameserver 8.8.8.8
root@master-01:/opt# cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
# 159.203.300.283 master-01.domain.com
# 2001:41d0:1000:08b7:: master-01.domain.com
172.16.0.1 master-01.domain.com
172.16.0.2 master-02.domain.com
172.16.0.3 master-03.domain.com
# The following lines are desirable for IPv6 capable hosts
#(added automatically by netbase upgrade)
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
root@master-01:/opt# curl marathon.service.consul/v2/leader
curl: (7) Failed to connect to marathon.service.consul port 80: Connection refused
With the Python example:
root@master-01:~/panteras/examples/SimpleWebappPython# IP=172.16.0.1 ./start_with_marathon.sh deploy0_marathon.json
Start a new one
{"id":"/python-example-stable","cmd":"echo python stable `hostname` > index.html; python3 -m http.server 8080","args":null,"user":null,"env":{"SERVICE_TAGS":"haproxy,http,weight=100,haproxy_route=/media,haproxy_route=/chat","SERVICE_NAME":"python"},"instances":2,"cpus":0.1,"mem":16,"disk":0,"executor":"","constraints":[],"uris":[],"storeUrls":[],"ports":[0],"requirePorts":false,"backoffSeconds":1,"backoffFactor":1.15,"maxLaunchDelaySeconds":3600,"container":{"type":"DOCKER","volumes":[],"docker":{"image":"ubuntu:14.04","network":"BRIDGE","portMappings":[{"containerPort":8080,"hostPort":0,"servicePort":0,"protocol":"tcp"}],"privileged":false,"parameters":[],"forcePullImage":false}},"healthChecks":[{"protocol":"TCP","portIndex":0,"gracePeriodSeconds":30,"intervalSeconds":10,"timeoutSeconds":30,"maxConsecutiveFailures":3,"ignoreHttp1xx":false},{"path":"/","protocol":"HTTP","portIndex":0,"gracePeriodSeconds":30,"intervalSeconds":10,"timeoutSeconds":30,"maxConsecutiveFailures":3,"ignoreHttp1xx":false}],"dependencies":[],"upgradeStrategy":{"minimumHealthCapacity":1,"maximumOverCapacity":1},"labels":{},"acceptedResourceRoles":null,"version":"2016-01-15T13:22:19.779Z","tasksStaged":0,"tasksRunning":0,"tasksHealthy":0,"tasksUnhealthy":0,"deployments":[{"id":"7dbff8f0-7898-413b-baa5-2f426d714f39"}],"tasks":[]}root@master-01:~/panteras/examples/SimpleWebappPython#
root@master-01:~/panteras/examples/SimpleWebappPython#
root@master-01:~/panteras/examples/SimpleWebappPython#
root@master-01:~/panteras/examples/SimpleWebappPython# curl -H 'Host: python.service.consul' http://172.16.0.1
curl: (7) Failed to connect to 172.16.0.1 port 80: Connection refused
root@master-01:~/panteras/examples/SimpleWebappPython# curl -H 'Host: python.service.consul' http://172.16.0.1
curl: (7) Failed to connect to 172.16.0.1 port 80: Connection refused
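A way to narrow down whether the refusals above come from DNS or from haproxy not listening (a diagnostic sketch against a live node; it assumes dnsmasq/Consul DNS on LISTEN_IP 172.16.0.1 and Consul's HTTP API on port 8500, as configured in this thread):

```shell
# Is anything listening on port 80 of the LISTEN_IP? (haproxy should be)
ss -ltn | grep ':80 '

# Does Consul DNS resolve the service name at all?
dig +short @172.16.0.1 python.service.consul

# Ask Consul's catalog directly, bypassing both DNS and haproxy
curl http://172.16.0.1:8500/v1/catalog/service/python
```

If the catalog query returns an empty list, registrator never registered the service (consistent with its "connection refused" crash loop in the log above), so haproxy has nothing to route, which would explain the refused connections on port 80.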
A short note for the record: an example configuration for PanteraS with a specific LISTEN_IP, Consul, and DNS:
IP=172.16.0.1
LISTEN_IP=172.16.0.1
START_DNSMASQ=false
CONSUL_PARAMS="${CONSUL_PARAMS} -retry-join-wan=your-2nd-host.com"
CONSUL_PARAMS="${CONSUL_PARAMS} -config-dir=/etc/consul.d/"
CONSUL_PARAMS="${CONSUL_PARAMS} -recursor=8.8.8.8"
CONSUL_DC=your_dc_name
todo: write documentation
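A minimal sketch of applying the note above, assuming the variables are exported in the shell before regenerating the compose file (the generate_yml.sh / docker-compose invocation is the usual PanteraS workflow and may differ in your setup):

```shell
# Accumulate Consul flags; each line appends to the previous value
export IP=172.16.0.1
export LISTEN_IP=172.16.0.1
export START_DNSMASQ=false
export CONSUL_PARAMS="${CONSUL_PARAMS} -retry-join-wan=your-2nd-host.com"
export CONSUL_PARAMS="${CONSUL_PARAMS} -config-dir=/etc/consul.d/"
export CONSUL_PARAMS="${CONSUL_PARAMS} -recursor=8.8.8.8"
export CONSUL_DC=your_dc_name

# then regenerate and restart, e.g.:
# ./generate_yml.sh && docker-compose up -d
```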
I was wondering about this one:
CONSUL_PARAMS="${CONSUL_PARAMS} -retry-join-wan=your-2nd-host.com"
Should this be set on every master, or only on the nodes that connect to the second DC? (Common or per-host?)
It depends on your topology.
If you have just 3 datacenters, you can pick the masters (the ones with WAN connectivity) that you want to connect and do "everyone to everyone", i.e. one pair per master.
Keep in mind that --retry-join-wan is different from --retry-join.
If you have only one datacenter and multiple masters, use the second one.
I use 1 master as a VPN tunnel to another DC, so it will be 1 master to 1 master, where I have 3 masters per DC.
--retry-join is the equivalent of the --join argument; it is already generated by PanteraS.
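For contrast, a sketch of the two flags on a Consul server (flag names as in the Consul docs; hostnames are placeholders from this thread): -retry-join pools servers inside one datacenter's LAN gossip ring, while -retry-join-wan federates servers across datacenters.

```shell
# LAN join: another server in the SAME datacenter
# (this is the part PanteraS already generates for you)
consul agent -server -dc=dc1 -retry-join=master-02.domain.com ...

# WAN join: a server in ANOTHER datacenter, federating dc1 with dc2
consul agent -server -dc=dc1 -retry-join-wan=your-2nd-host.com ...
```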
Yes, a tunnel is enough for testing. In production you should definitely have some redundancy. (Sorry for the late answer.)
I have a host with 2 physical network interfaces: public and private.
I want to start PanteraS from the master branch on a single node (master + slave) using the following:
If I remove LISTEN_IP, all services start normally.
Here is the log file:
Any idea why ZooKeeper refuses the connection? I also tried to disable Chronos with
START_CHRONOS="false"
but it didn't work.
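One pattern in the log above may be relevant (an observation from this thread's output, not a confirmed fix): Consul's HTTP API is bound only to the private address via -client=172.16.0.1, yet registrator and consul-template are pointed at the public address, so their requests are refused.

```shell
# From CONSUL_APP_PARAMS: the HTTP API listens ONLY on the private interface
#   -client=172.16.0.1            -> API reachable at 172.16.0.1:8500
# But other components target the public address and get "connection refused":
#   registrator:     http://159.203.300.283:8500/v1/status/leader
#   consul-template: -consul=159.203.300.283:8500
# Quick check against the address Consul actually listens on:
curl http://172.16.0.1:8500/v1/status/leader
```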