Closed absolute8511 closed 7 years ago
Thanks for reporting this issue. It looks like an enhancement to the access-address setting in our latest build (version 3.3.21) affected clustering in a Docker environment. We are working on a fix for the next release. The current workaround is to use release 3.3.17.
After cloning the repo, you can check out 3.3.17:
git checkout 3.3.17
I tried 3.3.17 again, and it starts successfully. However, it does not seem to be aware of the other node in the cluster. In the AMC dashboard, I can see only 1 node in the cluster. My cluster configuration uses the single seed node 10.10.99.129, and there are two Docker containers deployed on 10.10.99.128 and 10.10.99.129. The logs are below.
On node 10.10.99.129:
Nov 19 2014 02:52:08 GMT: INFO (paxos): (paxos.c::2212) Cluster Integrity Check: Detected succession list discrepancy between node bb9130011ac4202 and self bb90d4e7d0b5cbe
Nov 19 2014 02:52:08 GMT: INFO (paxos): (paxos.c::2259) CLUSTER INTEGRITY FAULT. [Phase 1 of 2] To fix, issue this command across all nodes: dun:nodes=bb9130011ac4202,bb90d4e7d0b5cbe
Nov 19 2014 02:52:09 GMT: INFO (hb): (hb.c::2212) HB node bb9130011ac4202 in different cluster - succession lists don't match
Nov 19 2014 02:52:13 GMT: INFO (paxos): (paxos.c::2212) Cluster Integrity Check: Detected succession list discrepancy between node bb9130011ac4202 and self bb90d4e7d0b5cbe
Nov 19 2014 02:52:13 GMT: INFO (paxos): (paxos.c::2259) CLUSTER INTEGRITY FAULT. [Phase 1 of 2] To fix, issue this command across all nodes: dun:nodes=bb9130011ac4202,bb90d4e7d0b5cbe
Nov 19 2014 02:52:15 GMT: INFO (info): (thr_info.c::4440) system memory: free 64869880kb ( 98 percent free )
On node 10.10.99.128:
Nov 19 2014 02:55:31 GMT: INFO (hb): (hb.c::2212) HB node bb90d4e7d0b5cbe in different cluster - succession lists don't match
Nov 19 2014 02:55:32 GMT: INFO (paxos): (paxos.c::2212) Cluster Integrity Check: Detected succession list discrepancy between node bb90d4e7d0b5cbe and self bb9130011ac4202
Nov 19 2014 02:55:32 GMT: INFO (paxos): (paxos.c::2259) CLUSTER INTEGRITY FAULT. [Phase 1 of 2] To fix, issue this command across all nodes: dun:nodes=bb90d4e7d0b5cbe
Nov 19 2014 02:55:32 GMT: INFO (paxos): (paxos.c::2382) as_paxos_retransmit_check: principal bb9130011ac4202 retransmitting sync messages to nodes that have not responded yet ...
Nov 19 2014 02:55:32 GMT: INFO (paxos): (paxos.c::1402) sending sync message to bb90d4e7d0b5cbe
Nov 19 2014 02:55:32 GMT: INFO (paxos): (paxos.c::1411) SUCCESSION [1.0]: bb9130011ac4202 bb90d4e7d0b5cbe
Nov 19 2014 02:55:34 GMT: INFO (drv_ssd): (drv_ssd.c::2392) device /opt/aerospike/data/test.dat: used 0, contig-free 4095M (4095 wblocks), swb-free 0, w-q 0 w-tot 0 (0.0/s), defrag-q 0 defrag-tot 0 (0.0/s)
Nov 19 2014 02:55:37 GMT: INFO (paxos): (paxos.c::2212) Cluster Integrity Check: Detected succession list discrepancy between node bb90d4e7d0b5cbe and self bb9130011ac4202
Nov 19 2014 02:55:37 GMT: INFO (paxos): (paxos.c::2259) CLUSTER INTEGRITY FAULT. [Phase 1 of 2] To fix, issue this command across all nodes: dun:nodes=bb90d4e7d0b5cbe
Nov 19 2014 02:55:37 GMT: INFO (paxos): (paxos.c::2382) as_paxos_retransmit_check: principal bb9130011ac4202 retransmitting sync messages to nodes that have not responded yet ...
Nov 19 2014 02:55:37 GMT: INFO (paxos): (paxos.c::1402) sending sync message to bb90d4e7d0b5cbe
Nov 19 2014 02:55:37 GMT: INFO (paxos): (paxos.c::1411) SUCCESSION [1.0]: bb9130011ac4202 bb90d4e7d0b5cbe
Nov 19
Would it be possible for you to post your config file?
Also, do you have one container per host server? Meaning, do you have a different access-address for each of the containers?
Also verify that the two container servers can communicate over port 3002.
You could try telnet from one host to the other to verify connectivity.
Also, were you able to use the tip command?
asinfo -h Node_IP_ADDR -v 'tip:host=Docker_host_ADDR;port=3002'
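The telnet check suggested above can also be scripted. This is only a minimal sketch: the helper name `can_connect` and the 3-second timeout are assumptions for illustration, and it only shows that a TCP connection to the heartbeat port can be opened, not that the nodes actually form a cluster.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, running `can_connect("10.10.99.129", 3002)` on 10.10.99.128, and the reverse on the other host, would check both directions.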
The config on 10.10.99.128 is:
network {
service {
address any
port 3000
access-address 10.10.99.128
}
heartbeat {
# mesh is used for environments that do not support multicast
mode mesh
port 3002
mesh-address 10.10.99.129 3002
mesh-port 3002
interval 150
timeout 20
}
......
}
The config on 10.10.99.129 is the same, except that access-address is set to 10.10.99.129.
I am running one container on each host server, and I am using the Docker "host" network mode, so the IP is just the Docker host machine's IP. Telnet showed that connectivity on port 3002 is OK.
Hi Vincent,
The issue may be with your mesh config.
For Aerospike server version 3.3.17, try using:
mesh-address 10.10.99.129
instead of
mesh-address 10.10.99.129 3002
When using mesh-address, you do not need to specify the port (3002).
Please see the following link for info on heartbeat configuration: http://www.aerospike.com/docs/operations/configure/network/heartbeat/
In versions later than Aerospike 3.3.19, a new setting was added to support multiple seed servers: mesh-seed-address-port 192.168.1.101 3002
best,
Lucien
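The mesh-address advice above can be turned into a quick check of a config file. This is a rough sketch, not an official validator: the function name `check_mesh_address` is made up for illustration, and it only encodes the rule stated in this reply for 3.3.17 (an address with no port after it).

```python
from typing import List


def check_mesh_address(config_text: str) -> List[str]:
    """Return warnings for mesh-address lines that carry extra tokens."""
    warnings = []
    for raw in config_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        tokens = line.split()
        # Expect exactly: mesh-address <ip>
        if tokens and tokens[0] == "mesh-address" and len(tokens) != 2:
            warnings.append("expected 'mesh-address <ip>', got: %r" % line)
    return warnings
```

Called on the heartbeat stanza posted earlier, it would flag `mesh-address 10.10.99.129 3002` and accept `mesh-address 10.10.99.129`.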
Hi Vincent,
I was able to bring up an Aerospike 3.3.17 mesh cluster on EC2 by running the following:
sudo docker run --net=host -tid -v /home/ubuntu/aerospike-server.docker:/opt/aerospike/etc --name aerospike -p 3000:3000 -p 3001:3001 -p 3002:3002 -p 3003:3003 aerospike/aerospike-server --config-file /opt/aerospike/etc/aerospike.conf
My host IP address was 172.31.12.80, and the second node's IP address was 172.31.15.184.
Here is the config file I used:
ubuntu@ip-172-31-12-80:~/aerospike-server.docker$ cat aerospike.conf
# Aerospike database configuration file.
# This stanza must come first.
service {
user root
group root
paxos-single-replica-limit 1 # Number of nodes where the replica count is automatically reduced to 1.
pidfile /var/run/aerospike/asd.pid
service-threads 4
transaction-queues 4
transaction-threads-per-queue 4
proto-fd-max 15000
}
logging {
# Log file must be an absolute path.
file /var/log/aerospike/aerospike.log {
context any info
}
# Send log messages to stdout
console {
context any info
}
}
network {
service {
address any
port 3000
# Uncomment the following to set the `access-address` parameter to the
# IP address of the Docker host. This will the allow the server to correctly
# publish the address which applications and other nodes in the cluster to
# use when addressing this node.
access-address 172.31.12.80
}
heartbeat {
# mesh is used for environments that do not support multicast
mode mesh
port 3002
# use asinfo -v 'tip:host=<ADDR>;port=3002' to inform cluster of
# other mesh nodes
mesh-address 172.31.15.184
mesh-port 3002
interval 150
timeout 10
}
fabric {
port 3001
}
info {
port 3003
}
}
namespace test {
replication-factor 2
memory-size 1G
default-ttl 5d # 30 days, use 0 to never expire/evict.
# storage-engine memory
# To use file storage backing, comment out the line above and use the
# following lines instead.
storage-engine device {
file /opt/aerospike/data/test.dat
filesize 4G
data-in-memory true # Store data in memory in addition to file.
}
}
And the ifconfig output of the host server:
ubuntu@ip-172-31-12-80:~/aerospike-server.docker$ ifconfig
docker0 Link encap:Ethernet HWaddr 56:84:7a:fe:97:99
inet addr:172.17.42.1 Bcast:0.0.0.0 Mask:255.255.0.0
inet6 addr: fe80::5484:7aff:fefe:9799/64 Scope:Link
UP BROADCAST MULTICAST MTU:9001 Metric:1
RX packets:151962 errors:0 dropped:0 overruns:0 frame:0
TX packets:127201 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:33088783 (33.0 MB) TX bytes:54180808 (54.1 MB)
eth0 Link encap:Ethernet HWaddr 06:1f:17:90:a2:1e
inet addr:172.31.12.80 Bcast:172.31.15.255 Mask:255.255.240.0
inet6 addr: fe80::41f:17ff:fe90:a21e/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:9001 Metric:1
RX packets:405626 errors:0 dropped:0 overruns:0 frame:0
TX packets:213043 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:349242349 (349.2 MB) TX bytes:41477184 (41.4 MB)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:232 errors:0 dropped:0 overruns:0 frame:0
TX packets:232 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:13903 (13.9 KB) TX bytes:13903 (13.9 KB)
Running docker with --net=host tells Docker to skip placing the container inside a separate network stack. This allows you to use the host IP address as the access-address.
The cluster setup can be verified from the command line by installing aerospike-tools on one of your hosts (assuming an Ubuntu host):
wget 'http://www.aerospike.com/download/server/3.3.22/artifact/ubuntu12'
tar xvf ubuntu12
cd aerospike-server-community-3.3.22-ubuntu12.04/
sudo ./asinstall
sudo asmonitor -e info
We have a fix that will soon be released for the access-address restriction on NAT IP addresses, the issue affecting version 3.3.21. More info to come soon.
Hope this helps.
best, Lucien
Still no luck. The mesh-address config is OK, and I can see the connection established on port 3002 using tcpdump. My OS is CentOS 7 and the Docker version is 1.3.0. Part of the tcpdump output is below:
tcpdump -i any port 3002
00:19:51.336825 IP 10.10.99.128.58248 > 10.10.99.129.exlm-agent: Flags [P.], seq 71191:71530, ack 71529, win 331, options [nop,nop,TS val 760997486 ecr 771455723], length 339
00:19:51.337056 IP 10.10.99.129.exlm-agent > 10.10.99.128.58248: Flags [.], ack 71530, win 330, options [nop,nop,TS val 771455774 ecr 760997486], length 0
00:19:51.437356 IP 10.10.99.129.exlm-agent > 10.10.99.128.58248: Flags [P.], seq 71529:71868, ack 71530, win 330, options [nop,nop,TS val 771455874 ecr 760997486], length 339
00:19:51.437469 IP 10.10.99.128.58248 > 10.10.99.129.exlm-agent: Flags [.], ack 71868, win 331, options [nop,nop,TS val 760997587 ecr 771455874], length 0
00:19:51.487630 IP 10.10.99.128.58248 > 10.10.99.129.exlm-agent: Flags [P.], seq 71530:71869, ack 71868, win 331, options [nop,nop,TS val 760997637 ecr 771455874], length 339
00:19:51.487794 IP 10.10.99.129.exlm-agent > 10.10.99.128.58248: Flags [.], ack 71869, win 330, options [nop,nop,TS val 771455925 ecr 760997637], length 0
00:19:51.588036 IP 10.10.99.129.exlm-agent > 10.10.99.128.58248: Flags [P.], seq 71868:72207, ack 71869, win 330, options [nop,nop,TS val 771456025 ecr 760997637], length 339
00:19:51.588147 IP 10.10.99.128.58248 > 10.10.99.129.exlm-agent: Flags [.], ack 72207, win 331, options [nop,nop,TS val 760997738 ecr 771456025], length 0
00:19:51.638300 IP 10.10.99.128.58248 > 10.10.99.129.exlm-agent: Flags [P.], seq 71869:72208, ack 72207, win 331, options [nop,nop,TS val 760997788 ecr 771456025], length 339
00:19:51.638459 IP 10.10.99.129.exlm-agent > 10.10.99.128.58248: Flags [.], ack 72208, win 330, options [nop,nop,TS val 771456075 ecr 760997788], length 0
00:19:51.738566 IP 10.10.99.129.exlm-agent > 10.10.99.128.58248: Flags [P.], seq 72207:72546, ack 72208, win 330, options [nop,nop,TS val 771456175 ecr 760997788], length 339
00:19:51.738666 IP 10.10.99.128.58248 > 10.10.99.129.exlm-agent: Flags [.], ack 72546, win 331, options [nop,nop,TS val 760997888 ecr 771456175], length 0
00:19:51.788827 IP 10.10.99.128.58248 > 10.10.99.129.exlm-agent: Flags [P.], seq 72208:72547, ack 72546, win 331, options [nop,nop,TS val 760997938 ecr 771456175], length 339
00:19:51.788988 IP 10.10.99.129.exlm-agent > 10.10.99.128.58248: Flags [.], ack 72547, win 330, options [nop,nop,TS val 771456226 ecr 760997938], length 0
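One way to read a capture like this is to measure the gaps between data-carrying packets (Flags [P.]) from each sender. The sketch below assumes the tcpdump line format shown above; the function name `push_intervals_ms` is hypothetical. In this capture the packets from 10.10.99.128 are roughly 150 ms apart, which would be consistent with `interval 150` in the heartbeat stanza if this traffic is the Aerospike heartbeat.

```python
import re
from collections import defaultdict
from datetime import datetime

# Matches lines such as:
# 00:19:51.336825 IP 10.10.99.128.58248 > 10.10.99.129.exlm-agent: Flags [P.], ...
LINE_RE = re.compile(r"^(\d\d:\d\d:\d\d\.\d+) IP (\S+) > (\S+): Flags \[P\.\]")


def push_intervals_ms(lines):
    """Map each sender to the gaps (in ms) between its data-carrying packets."""
    times = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%H:%M:%S.%f")
            times[match.group(2)].append(stamp)
    # Pair each timestamp with the next one from the same sender.
    return {
        sender: [(b - a).total_seconds() * 1000 for a, b in zip(ts, ts[1:])]
        for sender, ts in times.items()
    }
```

Pure ACK lines (Flags [.]) are ignored, so only packets with a payload contribute to the intervals.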
Could you send us your exact command for running Docker, and describe your environment's network? How many network hops are there between these two hosts?
Is this running on AWS or are you using bare metal?
Also please double check the IP address of the container by running:
sudo docker inspect -f '{{ .NetworkSettings.IPAddress }}' CONTAINER_NAME
The error you sent would be a sign of the IP address changing within the container, which should not be the case when using --net=host.
--Lucien
I am running on my own server machines, and they are in the same subnet (which means only 1 hop).
[root@server-128 ~]# traceroute 10.10.99.129
traceroute to 10.10.99.129 (10.10.99.129), 30 hops max, 60 byte packets
1 10.10.99.238 (10.10.99.238) 0.276 ms 2.687 ms 2.763 ms
2 10.10.99.129 (10.10.99.129) 0.226 ms 0.129 ms 0.120 ms
docker inspect showed:
"HostConfig": {
"Binds": null,
"CapAdd": null,
"CapDrop": null,
"ContainerIDFile": "",
"Devices": null,
"Dns": null,
"DnsSearch": null,
"ExtraHosts": null,
"Links": null,
"LxcConf": null,
"NetworkMode": "host",
"PortBindings": null,
"Privileged": true,
"PublishAllPorts": false,
"RestartPolicy": {
"MaximumRetryCount": 3,
"Name": "on-failure"
},
"VolumesFrom": null
},
...
"NetworkSettings": {
"Bridge": "",
"Gateway": "",
"IPAddress": "",
"IPPrefixLen": 0,
"MacAddress": "",
"PortMapping": null,
"Ports": null
},
Since the network mode is host, I think it is normal that the network settings are empty.
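The relevant fields of the `docker inspect` output can be pulled out with a few lines of Python. This is a sketch under assumptions: the helper name `network_summary` is made up, and it expects the JSON printed by `docker inspect CONTAINER_NAME` (an array with one object per container) saved or piped in as text.

```python
import json


def network_summary(inspect_json: str) -> dict:
    """Summarise the network-related fields of `docker inspect` output."""
    data = json.loads(inspect_json)
    # `docker inspect` prints a JSON array; accept a bare object as well.
    container = data[0] if isinstance(data, list) else data
    mode = container.get("HostConfig", {}).get("NetworkMode", "")
    ip = container.get("NetworkSettings", {}).get("IPAddress", "")
    return {
        "network_mode": mode,
        "container_ip": ip,
        "uses_host_network": mode == "host",
    }
```

With the output shown above, this would report `network_mode` as `host` and an empty `container_ip`.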
And the startup log on 10.10.99.128:
Nov 25 2014 10:59:02 GMT: INFO (as): (as.c::357) <><><><><><><><><><> Aerospike Community Edition build 3.3.17 <><><><><><><><><><>
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) # Aerospike database configuration file.
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817)
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) # This stanza must come first.
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) service {
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) user root
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) group root
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) paxos-single-replica-limit 1 # Number of nodes where the replica count is automatically reduced to 1.
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) pidfile /var/run/aerospike/asd.pid
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) service-threads 4
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) transaction-queues 4
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) transaction-threads-per-queue 4
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) proto-fd-max 15000
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) }
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817)
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) logging {
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817)
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) # Log file must be an absolute path.
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) file /var/log/aerospike/aerospike.log {
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) context any info
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) }
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817)
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) # Send log messages to stdout
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) console {
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) context any info
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) }
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) }
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817)
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) network {
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) service {
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) address any
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) port 3000
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817)
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) # Uncomment the following to set the parameter to the
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) # IP address of the Docker host. This will the allow the server to correctly
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) # publish the address which applications and other nodes in the cluster to
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) # use when addressing this node.
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) access-address 10.10.99.128
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) }
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817)
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) heartbeat {
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817)
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) # mesh is used for environments that do not support multicast
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) mode mesh
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) port 3002
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817)
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) # use asinfo -v 'tip:host=<ADDR>;port=3002' to inform cluster of
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) # other mesh nodes
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) mesh-address 10.10.99.129
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) mesh-port 3002
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817)
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) interval 150
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) timeout 20
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) }
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817)
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) fabric {
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) port 3001
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) }
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817)
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) info {
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) port 3003
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) }
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) }
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817)
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) namespace test {
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) replication-factor 2
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) memory-size 1G
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) default-ttl 5d # 30 days, use 0 to never expire/evict.
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817)
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) # storage-engine memory
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817)
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) # To use file storage backing, comment out the line above and use the
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) # following lines instead.
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) storage-engine device {
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) file /opt/aerospike/data/test.dat
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) filesize 4G
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) data-in-memory true # Store data in memory in addition to file.
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) }
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2817) }
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2835) system file descriptor limit: 1048576, proto-fd-max: 15000
Nov 25 2014 10:59:02 GMT: INFO (cf:misc): (id.c::119) Node ip: 10.10.99.128
Nov 25 2014 10:59:02 GMT: INFO (cf:misc): (id.c::265) Heartbeat address for mesh: 10.10.99.128
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2891) Rack Aware mode not enabled
Nov 25 2014 10:59:02 GMT: INFO (config): (cfg.c::2894) Node id bb900000000008e
Nov 25 2014 10:59:02 GMT: INFO (namespace): (namespace_cold.c::101) ns test beginning COLD start
Nov 25 2014 10:59:02 GMT: INFO (drv_ssd): (drv_ssd.c::3732) Opened file /opt/aerospike/data/test.dat bytes 4294967296
Nov 25 2014 10:59:02 GMT: INFO (drv_ssd): (drv_ssd.c::1008) number of wblocks in allocator: 4096 wblock 1048576
Nov 25 2014 10:59:02 GMT: INFO (drv_ssd): (drv_ssd.c::3368) namespace test: found all 1 devices fresh, initializing to random 734689668767154745
Nov 25 2014 10:59:02 GMT: INFO (drv_ssd): (drv_ssd.c::973) ns test loading free & defrag queues
Nov 25 2014 10:59:02 GMT: INFO (drv_ssd): (drv_ssd.c::907) /opt/aerospike/data/test.dat init defrag profile: 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
Nov 25 2014 10:59:02 GMT: INFO (drv_ssd): (drv_ssd.c::997) /opt/aerospike/data/test.dat init wblock free-q 4095, defrag-q 0
Nov 25 2014 10:59:02 GMT: INFO (drv_ssd): (drv_ssd.c::2423) ns test starting device maintenance threads
Nov 25 2014 10:59:02 GMT: INFO (drv_ssd): (drv_ssd.c::1700) ns test starting write worker threads
Nov 25 2014 10:59:02 GMT: INFO (drv_ssd): (drv_ssd.c::824) ns test starting defrag threads
Nov 25 2014 10:59:02 GMT: INFO (as): (as.c::395) initializing services...
Nov 25 2014 10:59:02 GMT: INFO (tsvc): (thr_tsvc.c::998) shared queues: 4 queues with 4 threads each
Nov 25 2014 10:59:02 GMT: INFO (hb): (hb.c::2350) heartbeat socket initialization
Nov 25 2014 10:59:02 GMT: INFO (hb): (hb.c::2364) initializing mesh heartbeat socket : 10.10.99.128:3002
Nov 25 2014 10:59:02 GMT: INFO (info): (thr_info.c::4893) static external network definition
Nov 25 2014 10:59:02 GMT: INFO (paxos): (paxos.c::2981) partitions from storage: total 4096 found 0 lost(set) 0 lost(unset) 4096
Nov 25 2014 10:59:02 GMT: INFO (config): (cluster_config.c::406) Rack Aware is disabled.
Nov 25 2014 10:59:02 GMT: INFO (partition): (cluster_config.c::368) Rack Aware is disabled.
Nov 25 2014 10:59:02 GMT: INFO (partition): (partition.c::2831) CLUSTER SIZE = 1
Nov 25 2014 10:59:02 GMT: INFO (partition): (partition.c::2870) Global state is well formed
Nov 25 2014 10:59:02 GMT: INFO (paxos): (partition.c::2524) setting replication factors: cluster size 1, paxos single replica limit 1
Nov 25 2014 10:59:02 GMT: INFO (paxos): (partition.c::2531) {test} replication factor is 1
Nov 25 2014 10:59:02 GMT: INFO (paxos): (partition.c::3780) global partition state: total 4096 lost 4096 unique 0 duplicate 0
Nov 25 2014 10:59:02 GMT: INFO (paxos): (partition.c::3781) partition state after fixing lost partitions (master): total 4096 lost 0 unique 4096 duplicate 0
Nov 25 2014 10:59:02 GMT: INFO (paxos): (partition.c::3782) 0 new partition version tree paths generated
Nov 25 2014 10:59:02 GMT: INFO (partition): (partition.c::364) ALLOW MIGRATIONS
Nov 25 2014 10:59:02 GMT: INFO (paxos): (paxos.c::2986) Paxos service ignited: bb900000000008e
Nov 25 2014 10:59:03 GMT: INFO (scan): (thr_tscan.c::1866) started 32 threads
Nov 25 2014 10:59:03 GMT: INFO (batch): (thr_batch.c::334) Initialize 4 batch worker threads.
Nov 25 2014 10:59:03 GMT: INFO (drv_ssd): (drv_ssd.c::4083) {test} floor set at 41 wblocks per device
Nov 25 2014 10:59:08 GMT: INFO (paxos): (paxos.c::3048) paxos supervisor thread started
Nov 25 2014 10:59:08 GMT: INFO (hb): (hb.c::1882) connecting to remote heartbeat service at 10.10.99.129:3002
Nov 25 2014 10:59:08 GMT: INFO (nsup): (thr_nsup.c::1203) namespace supervisor started
Nov 25 2014 10:59:08 GMT: INFO (ldt): (thr_nsup.c::1160) LDT supervisor started
Nov 25 2014 10:59:08 GMT: INFO (demarshal): (thr_demarshal.c::221) Saved original JEMalloc arena #7 for thr_demarshal()
Nov 25 2014 10:59:08 GMT: INFO (demarshal): (thr_demarshal.c::249) Service started: socket 3000
Nov 25 2014 10:59:09 GMT: INFO (demarshal): (thr_demarshal.c::221) Saved original JEMalloc arena #8 for thr_demarshal()
Nov 25 2014 10:59:09 GMT: INFO (demarshal): (thr_demarshal.c::221) Saved original JEMalloc arena #9 for thr_demarshal()
Nov 25 2014 10:59:09 GMT: INFO (demarshal): (thr_demarshal.c::221) Saved original JEMalloc arena #10 for thr_demarshal()
Nov 25 2014 10:59:10 GMT: INFO (hb): (hb.c::1088) mesh_list_service_fn: initiated connection to mesh host at 10.10.99.129:3002 socket 63 from 10.10.99.129:3002
Nov 25 2014 10:59:10 GMT: INFO (demarshal): (thr_demarshal.c::718) Waiting to spawn demarshal threads ...
Nov 25 2014 10:59:10 GMT: INFO (demarshal): (thr_demarshal.c::721) Started 4 Demarshal Threads
Nov 25 2014 10:59:10 GMT: INFO (as): (as.c::433) service ready: soon there will be cake!
Nov 25 2014 10:59:20 GMT: INFO (info): (thr_info.c::4440) system memory: free 64904576kb ( 98 percent free )
Nov 25 2014 10:59:20 GMT: INFO (info): (thr_info.c::4447) migrates in progress ( 0 , 0 ) ::: ClusterSize 1 ::: objects 0
Nov 25 2014 10:59:20 GMT: INFO (info): (thr_info.c::4455) rec refs 0 ::: rec locks 0 ::: trees 0 ::: wr reqs 0 ::: mig tx 0 ::: mig rx 0
Nov 25 2014 10:59:20 GMT: INFO (info): (thr_info.c::4461) replica errs :: null 0 non-null 0 ::: sync copy errs :: node 0 :: master 0
Nov 25 2014 10:59:20 GMT: INFO (info): (thr_info.c::4471) trans_in_progress: wr 0 prox 0 wait 0 ::: q 0 ::: bq 0 ::: iq 0 ::: dq 0 : fds - proto (0, 0, 0) : hb (1, 1, 0) : fab (16, 16, 0)
Nov 25 2014 10:59:20 GMT: INFO (info): (thr_info.c::4473) heartbeat_received: self 67 : foreign 0
Nov 25 2014 10:59:20 GMT: INFO (info): (thr_info.c::4474) heartbeat_stats: bt 0 bf 0 nt 0 ni 0 nn 0 nnir 0 nal 0 sf1 0 sf2 0 sf3 0 sf4 0 sf5 0 sf6 0 mrf 0 eh 0 efd 0 efa 0 um 0
Nov 25 2014 10:59:20 GMT: INFO (info): (thr_info.c::4487) tree_counts: nsup 0 scan 0 batch 0 dup 0 wprocess 0 migrx 0 migtx 0 ssdr 0 ssdw 0 rw 0
Nov 25 2014 10:59:20 GMT: INFO (info): (thr_info.c::4503) namespace test: disk inuse: 0 memory inuse: 0 (bytes) sindex memory inuse: 0 (bytes) avail pct 99
Nov 25 2014 10:59:20 GMT: INFO (info): (thr_info.c::4528) partitions: actual 4096 sync 0 desync 0 zombie 0 wait 0 absent 0
Nov 25 2014 10:59:20 GMT: INFO (info): (hist.c::136) histogram dump: reads (0 total)
Nov 25 2014 10:59:20 GMT: INFO (info): (hist.c::136) histogram dump: writes_master (0 total)
Nov 25 2014 10:59:20 GMT: INFO (info): (hist.c::136) histogram dump: proxy (0 total)
Nov 25 2014 10:59:20 GMT: INFO (info): (hist.c::136) histogram dump: writes_reply (0 total)
Nov 25 2014 10:59:20 GMT: INFO (info): (hist.c::136) histogram dump: udf (0 total)
Nov 25 2014 10:59:20 GMT: INFO (info): (hist.c::136) histogram dump: query (0 total)
Nov 25 2014 10:59:20 GMT: INFO (info): (hist.c::136) histogram dump: query_rec_count (0 total)
Nov 25 2014 10:59:22 GMT: INFO (drv_ssd): (drv_ssd.c::2392) device /opt/aerospike/data/test.dat: used 0, contig-free 4095M (4095 wblocks), swb-free 0, w-q 0 w-tot 0 (0.0/s), defrag-q 0 defrag-tot 0 (0.0/s)
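Startup logs like the one above include a `Node id` line, and it can be useful to compare that value across hosts. The sketch below is only an illustration: the function names are made up, and the working assumption is that nodes in one cluster are expected to report distinct IDs. It takes a mapping from host to log text and returns any ID reported by more than one host.

```python
import re
from collections import defaultdict

# Matches the startup line "... (cfg.c::2894) Node id bb900000000008e"
NODE_ID_RE = re.compile(r"\bNode id ([0-9a-fA-F]+)")


def find_node_ids(logs_by_host):
    """Return {node_id: [hosts]} from a mapping of host -> startup log text."""
    hosts_by_id = defaultdict(list)
    for host, text in logs_by_host.items():
        match = NODE_ID_RE.search(text)
        if match:
            hosts_by_id[match.group(1)].append(host)
    return dict(hosts_by_id)


def duplicated_ids(logs_by_host):
    """Return only the node IDs that more than one host reports."""
    return {
        node_id: hosts
        for node_id, hosts in find_node_ids(logs_by_host).items()
        if len(hosts) > 1
    }
```

In the two startup logs in this post, both hosts print `Node id bb900000000008e`, so that may be worth a closer look.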
The startup log on 10.10.99.129:
Nov 25 2014 10:58:46 GMT: INFO (as): (as.c::357) <><><><><><><><><><> Aerospike Community Edition build 3.3.17 <><><><><><><><><><>
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) # Aerospike database configuration file.
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817)
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) # This stanza must come first.
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) service {
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) user root
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) group root
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) paxos-single-replica-limit 1 # Number of nodes where the replica count is automatically reduced to 1.
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) pidfile /var/run/aerospike/asd.pid
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) service-threads 4
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) transaction-queues 4
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) transaction-threads-per-queue 4
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) proto-fd-max 15000
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) }
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817)
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) logging {
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817)
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) # Log file must be an absolute path.
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) file /var/log/aerospike/aerospike.log {
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) context any info
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) }
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817)
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) # Send log messages to stdout
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) console {
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) context any info
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) }
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) }
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817)
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) network {
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) service {
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) address any
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) port 3000
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817)
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) # Uncomment the following to set the parameter to the
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) # IP address of the Docker host. This will the allow the server to correctly
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) # publish the address which applications and other nodes in the cluster to
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) # use when addressing this node.
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) access-address 10.10.99.129
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) }
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817)
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) heartbeat {
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817)
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) # mesh is used for environments that do not support multicast
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) mode mesh
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) port 3002
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817)
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) # use asinfo -v 'tip:host=<ADDR>;port=3002' to inform cluster of
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) # other mesh nodes
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) mesh-address 10.10.99.129
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) mesh-port 3002
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817)
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) interval 150
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) timeout 20
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) }
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817)
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) fabric {
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) port 3001
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) }
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817)
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) info {
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) port 3003
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) }
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) }
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817)
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) namespace test {
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) replication-factor 2
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) memory-size 1G
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) default-ttl 5d # 30 days, use 0 to never expire/evict.
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817)
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) # storage-engine memory
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817)
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) # To use file storage backing, comment out the line above and use the
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) # following lines instead.
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) storage-engine device {
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) file /opt/aerospike/data/test.dat
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) filesize 4G
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) data-in-memory true # Store data in memory in addition to file.
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) }
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2817) }
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2835) system file descriptor limit: 1048576, proto-fd-max: 15000
Nov 25 2014 10:58:46 GMT: INFO (cf:misc): (id.c::119) Node ip: 10.10.99.129
Nov 25 2014 10:58:46 GMT: INFO (cf:misc): (id.c::265) Heartbeat address for mesh: 10.10.99.129
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2891) Rack Aware mode not enabled
Nov 25 2014 10:58:46 GMT: INFO (config): (cfg.c::2894) Node id bb900000000008e
Nov 25 2014 10:58:46 GMT: INFO (namespace): (namespace_cold.c::101) ns test beginning COLD start
Nov 25 2014 10:58:46 GMT: INFO (drv_ssd): (drv_ssd.c::3732) Opened file /opt/aerospike/data/test.dat bytes 4294967296
Nov 25 2014 10:58:46 GMT: INFO (drv_ssd): (drv_ssd.c::1008) number of wblocks in allocator: 4096 wblock 1048576
Nov 25 2014 10:58:46 GMT: INFO (drv_ssd): (drv_ssd.c::3368) namespace test: found all 1 devices fresh, initializing to random 14471008469813103845
Nov 25 2014 10:58:46 GMT: INFO (drv_ssd): (drv_ssd.c::973) ns test loading free & defrag queues
Nov 25 2014 10:58:46 GMT: INFO (drv_ssd): (drv_ssd.c::907) /opt/aerospike/data/test.dat init defrag profile: 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
Nov 25 2014 10:58:46 GMT: INFO (drv_ssd): (drv_ssd.c::997) /opt/aerospike/data/test.dat init wblock free-q 4095, defrag-q 0
Nov 25 2014 10:58:46 GMT: INFO (drv_ssd): (drv_ssd.c::2423) ns test starting device maintenance threads
Nov 25 2014 10:58:46 GMT: INFO (drv_ssd): (drv_ssd.c::1700) ns test starting write worker threads
Nov 25 2014 10:58:46 GMT: INFO (drv_ssd): (drv_ssd.c::824) ns test starting defrag threads
Nov 25 2014 10:58:46 GMT: INFO (as): (as.c::395) initializing services...
Nov 25 2014 10:58:46 GMT: INFO (tsvc): (thr_tsvc.c::998) shared queues: 4 queues with 4 threads each
Nov 25 2014 10:58:46 GMT: INFO (hb): (hb.c::2350) heartbeat socket initialization
Nov 25 2014 10:58:46 GMT: INFO (hb): (hb.c::2364) initializing mesh heartbeat socket : 10.10.99.129:3002
Nov 25 2014 10:58:46 GMT: INFO (info): (thr_info.c::4893) static external network definition
Nov 25 2014 10:58:46 GMT: INFO (paxos): (paxos.c::2981) partitions from storage: total 4096 found 0 lost(set) 0 lost(unset) 4096
Nov 25 2014 10:58:46 GMT: INFO (config): (cluster_config.c::406) Rack Aware is disabled.
Nov 25 2014 10:58:46 GMT: INFO (partition): (cluster_config.c::368) Rack Aware is disabled.
Nov 25 2014 10:58:46 GMT: INFO (partition): (partition.c::2831) CLUSTER SIZE = 1
Nov 25 2014 10:58:46 GMT: INFO (partition): (partition.c::2870) Global state is well formed
Nov 25 2014 10:58:46 GMT: INFO (paxos): (partition.c::2524) setting replication factors: cluster size 1, paxos single replica limit 1
Nov 25 2014 10:58:46 GMT: INFO (paxos): (partition.c::2531) {test} replication factor is 1
Nov 25 2014 10:58:46 GMT: INFO (paxos): (partition.c::3780) global partition state: total 4096 lost 4096 unique 0 duplicate 0
Nov 25 2014 10:58:46 GMT: INFO (paxos): (partition.c::3781) partition state after fixing lost partitions (master): total 4096 lost 0 unique 4096 duplicate 0
Nov 25 2014 10:58:46 GMT: INFO (paxos): (partition.c::3782) 0 new partition version tree paths generated
Nov 25 2014 10:58:46 GMT: INFO (partition): (partition.c::364) ALLOW MIGRATIONS
Nov 25 2014 10:58:46 GMT: INFO (paxos): (paxos.c::2986) Paxos service ignited: bb900000000008e
Nov 25 2014 10:58:47 GMT: INFO (scan): (thr_tscan.c::1866) started 32 threads
Nov 25 2014 10:58:47 GMT: INFO (batch): (thr_batch.c::334) Initialize 4 batch worker threads.
Nov 25 2014 10:58:47 GMT: INFO (drv_ssd): (drv_ssd.c::4083) {test} floor set at 41 wblocks per device
Nov 25 2014 10:58:52 GMT: INFO (paxos): (paxos.c::3048) paxos supervisor thread started
Nov 25 2014 10:58:52 GMT: INFO (hb): (hb.c::1882) connecting to remote heartbeat service at 10.10.99.129:3002
Nov 25 2014 10:58:52 GMT: INFO (nsup): (thr_nsup.c::1203) namespace supervisor started
Nov 25 2014 10:58:52 GMT: INFO (ldt): (thr_nsup.c::1160) LDT supervisor started
Nov 25 2014 10:58:52 GMT: INFO (demarshal): (thr_demarshal.c::221) Saved original JEMalloc arena #7 for thr_demarshal()
Nov 25 2014 10:58:52 GMT: INFO (demarshal): (thr_demarshal.c::249) Service started: socket 3000
Nov 25 2014 10:58:53 GMT: INFO (demarshal): (thr_demarshal.c::221) Saved original JEMalloc arena #8 for thr_demarshal()
Nov 25 2014 10:58:53 GMT: INFO (demarshal): (thr_demarshal.c::221) Saved original JEMalloc arena #9 for thr_demarshal()
Nov 25 2014 10:58:53 GMT: INFO (demarshal): (thr_demarshal.c::221) Saved original JEMalloc arena #10 for thr_demarshal()
Nov 25 2014 10:58:54 GMT: INFO (demarshal): (thr_demarshal.c::718) Waiting to spawn demarshal threads ...
Nov 25 2014 10:58:54 GMT: INFO (demarshal): (thr_demarshal.c::721) Started 4 Demarshal Threads
Nov 25 2014 10:58:54 GMT: INFO (as): (as.c::433) service ready: soon there will be cake!
Nov 25 2014 10:59:04 GMT: INFO (info): (thr_info.c::4440) system memory: free 43860280kb ( 66 percent free )
Nov 25 2014 10:59:04 GMT: INFO (info): (thr_info.c::4447) migrates in progress ( 0 , 0 ) ::: ClusterSize 1 ::: objects 0
Nov 25 2014 10:59:04 GMT: INFO (info): (thr_info.c::4455) rec refs 0 ::: rec locks 0 ::: trees 0 ::: wr reqs 0 ::: mig tx 0 ::: mig rx 0
Nov 25 2014 10:59:04 GMT: INFO (info): (thr_info.c::4461) replica errs :: null 0 non-null 0 ::: sync copy errs :: node 0 :: master 0
Nov 25 2014 10:59:04 GMT: INFO (info): (thr_info.c::4471) trans_in_progress: wr 0 prox 0 wait 0 ::: q 0 ::: bq 0 ::: iq 0 ::: dq 0 : fds - proto (0, 0, 0) : hb (1, 1, 0) : fab (16, 16, 0)
Nov 25 2014 10:59:04 GMT: INFO (info): (thr_info.c::4473) heartbeat_received: self 53 : foreign 0
Nov 25 2014 10:59:04 GMT: INFO (info): (thr_info.c::4474) heartbeat_stats: bt 0 bf 0 nt 0 ni 0 nn 0 nnir 0 nal 0 sf1 0 sf2 0 sf3 0 sf4 0 sf5 0 sf6 0 mrf 0 eh 0 efd 0 efa 0 um 0
Nov 25 2014 10:59:04 GMT: INFO (info): (thr_info.c::4487) tree_counts: nsup 0 scan 0 batch 0 dup 0 wprocess 0 migrx 0 migtx 0 ssdr 0 ssdw 0 rw 0
Nov 25 2014 10:59:04 GMT: INFO (info): (thr_info.c::4503) namespace test: disk inuse: 0 memory inuse: 0 (bytes) sindex memory inuse: 0 (bytes) avail pct 99
Nov 25 2014 10:59:04 GMT: INFO (info): (thr_info.c::4528) partitions: actual 4096 sync 0 desync 0 zombie 0 wait 0 absent 0
Nov 25 2014 10:59:04 GMT: INFO (info): (hist.c::136) histogram dump: reads (0 total)
Nov 25 2014 10:59:04 GMT: INFO (info): (hist.c::136) histogram dump: writes_master (0 total)
Nov 25 2014 10:59:04 GMT: INFO (info): (hist.c::136) histogram dump: proxy (0 total)
Nov 25 2014 10:59:04 GMT: INFO (info): (hist.c::136) histogram dump: writes_reply (0 total)
Nov 25 2014 10:59:04 GMT: INFO (info): (hist.c::136) histogram dump: udf (0 total)
Nov 25 2014 10:59:04 GMT: INFO (info): (hist.c::136) histogram dump: query (0 total)
Nov 25 2014 10:59:04 GMT: INFO (info): (hist.c::136) histogram dump: query_rec_count (0 total)
Nov 25 2014 10:59:06 GMT: INFO (drv_ssd): (drv_ssd.c::2392) device /opt/aerospike/data/test.dat: used 0, contig-free 4095M (4095 wblocks), swb-free 0, w-q 0 w-tot 0 (0.0/s), defrag-q 0 defrag-tot 0 (0.0/s)
Nov 25 2014 10:59:14 GMT: INFO (info): (thr_info.c::4440) system memory: free 43858352kb ( 66 percent free )
Nov 25 2014 10:59:14 GMT: INFO (info): (thr_info.c::4447) migrates in progress ( 0 , 0 ) ::: ClusterSize 1 ::: objects 0
Nov 25 2014 10:59:14 GMT: INFO (info): (thr_info.c::4455) rec refs 0 ::: rec locks 0 ::: trees 0 ::: wr reqs 0 ::: mig tx 0 ::: mig rx 0
Nov 25 2014 10:59:14 GMT: INFO (info): (thr_info.c::4461) replica errs :: null 0 non-null 0 ::: sync copy errs :: node 0 :: master 0
Nov 25 2014 10:59:14 GMT: INFO (info): (thr_info.c::4471) trans_in_progress: wr 0 prox 0 wait 0 ::: q 0 ::: bq 0 ::: iq 0 ::: dq 0 : fds - proto (0, 0, 0) : hb (1, 1, 0) : fab (16, 16, 0)
Nov 25 2014 10:59:14 GMT: INFO (info): (thr_info.c::4473) heartbeat_received: self 120 : foreign 0
Nov 25 2014 10:59:14 GMT: INFO (info): (thr_info.c::4474) heartbeat_stats: bt 0 bf 0 nt 0 ni 0 nn 0 nnir 0 nal 0 sf1 0 sf2 0 sf3 0 sf4 0 sf5 0 sf6 0 mrf 0 eh 0 efd 0 efa 0 um 0
Nov 25 2014 10:59:14 GMT: INFO (info): (thr_info.c::4487) tree_counts: nsup 0 scan 0 batch 0 dup 0 wprocess 0 migrx 0 migtx 0 ssdr 0 ssdw 0 rw 0
Nov 25 2014 10:59:14 GMT: INFO (info): (thr_info.c::4503) namespace test: disk inuse: 0 memory inuse: 0 (bytes) sindex memory inuse: 0 (bytes) avail pct 99
Nov 25 2014 10:59:14 GMT: INFO (info): (thr_info.c::4528) partitions: actual 4096 sync 0 desync 0 zombie 0 wait 0 absent 0
Nov 25 2014 10:59:14 GMT: INFO (info): (hist.c::136) histogram dump: reads (0 total)
Nov 25 2014 10:59:14 GMT: INFO (info): (hist.c::136) histogram dump: writes_master (0 total)
Nov 25 2014 10:59:14 GMT: INFO (info): (hist.c::136) histogram dump: proxy (0 total)
Nov 25 2014 10:59:14 GMT: INFO (info): (hist.c::136) histogram dump: writes_reply (0 total)
Nov 25 2014 10:59:14 GMT: INFO (info): (hist.c::136) histogram dump: udf (0 total)
Nov 25 2014 10:59:14 GMT: INFO (info): (hist.c::136) histogram dump: query (0 total)
Nov 25 2014 10:59:14 GMT: INFO (info): (hist.c::136) histogram dump: query_rec_count (0 total)
Nov 25 2014 10:59:24 GMT: INFO (info): (thr_info.c::4440) system memory: free 43852036kb ( 66 percent free )
Nov 25 2014 10:59:24 GMT: INFO (info): (thr_info.c::4447) migrates in progress ( 0 , 0 ) ::: ClusterSize 1 ::: objects 0
Nov 25 2014 10:59:24 GMT: INFO (info): (thr_info.c::4455) rec refs 0 ::: rec locks 0 ::: trees 0 ::: wr reqs 0 ::: mig tx 0 ::: mig rx 0
Nov 25 2014 10:59:24 GMT: INFO (info): (thr_info.c::4461) replica errs :: null 0 non-null 0 ::: sync copy errs :: node 0 :: master 0
Nov 25 2014 10:59:24 GMT: INFO (info): (thr_info.c::4471) trans_in_progress: wr 0 prox 0 wait 0 ::: q 0 ::: bq 0 ::: iq 0 ::: dq 0 : fds - proto (0, 0, 0) : hb (1, 1, 0) : fab (16, 16, 0)
Nov 25 2014 10:59:24 GMT: INFO (info): (thr_info.c::4473) heartbeat_received: self 186 : foreign 0
Nov 25 2014 10:59:24 GMT: INFO (info): (thr_info.c::4474) heartbeat_stats: bt 0 bf 0 nt 0 ni 0 nn 0 nnir 0 nal 0 sf1 0 sf2 0 sf3 0 sf4 0 sf5 0 sf6 0 mrf 0 eh 0 efd 0 efa 0 um 0
Nov 25 2014 10:59:24 GMT: INFO (info): (thr_info.c::4487) tree_counts: nsup 0 scan 0 batch 0 dup 0 wprocess 0 migrx 0 migtx 0 ssdr 0 ssdw 0 rw 0
Nov 25 2014 10:59:24 GMT: INFO (info): (thr_info.c::4503) namespace test: disk inuse: 0 memory inuse: 0 (bytes) sindex memory inuse: 0 (bytes) avail pct 99
Nov 25 2014 10:59:24 GMT: INFO (info): (thr_info.c::4528) partitions: actual 4096 sync 0 desync 0 zombie 0 wait 0 absent 0
By the way, I was able to set up an Elasticsearch cluster in Docker on the same machine. Elasticsearch is using unicast, which I think should work the same way as Aerospike.
Something still seems to be preventing communication between the containers.
Nov 25 2014 10:59:24 GMT: INFO (info): (thr_info.c::4473) heartbeat_received: self 186 : foreign 0
The heartbeat_received line shows 0 foreign heartbeats.
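For anyone who wants to check this across a whole log rather than by eye, here is a minimal sketch that pulls the self/foreign counters out of the periodic status lines. The function names are made up for illustration, and the regular expression assumes the exact `heartbeat_received: self N : foreign M` wording seen in the log above.

```python
import re

# Matches the periodic status line, e.g.
# "... (thr_info.c::4473) heartbeat_received: self 186 : foreign 0"
HB_RE = re.compile(r"heartbeat_received:\s*self\s+(\d+)\s*:\s*foreign\s+(\d+)")


def heartbeat_counts(log_text):
    """Return a list of (self_count, foreign_count) tuples, one per status line."""
    return [(int(s), int(f)) for s, f in HB_RE.findall(log_text)]


def sees_peers(log_text):
    """True if any status line reports at least one foreign heartbeat."""
    return any(foreign > 0 for _, foreign in heartbeat_counts(log_text))


# Three status lines copied from the log posted above.
sample = """\
Nov 25 2014 10:59:04 GMT: INFO (info): (thr_info.c::4473) heartbeat_received: self 53 : foreign 0
Nov 25 2014 10:59:14 GMT: INFO (info): (thr_info.c::4473) heartbeat_received: self 120 : foreign 0
Nov 25 2014 10:59:24 GMT: INFO (info): (thr_info.c::4473) heartbeat_received: self 186 : foreign 0
"""

print(heartbeat_counts(sample))  # [(53, 0), (120, 0), (186, 0)]
print(sees_peers(sample))        # False
```

A self counter that keeps growing while foreign stays at 0 matches the symptom here: the node only hears itself.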
Is any firewall enabled on these hosts? Or are there routing rules affecting the containers?
Please see the section on firewalld: https://docs.docker.com/installation/centos/
Could you also provide your docker run command line? From an Aerospike config point of view, things look OK. My only suggestion would be to use a ring configuration by setting mesh-address on 10.10.99.129 to 10.10.99.128,
and also to try running the "tip" command to inform one node of the other:
asinfo -v 'tip:host=10.10.99.128;port=3002' -h 10.10.99.129
I have successfully tested the config on EC2, Ubuntu, and CentOS 7 with the latest Docker.
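Before digging into firewall rules, it can help to confirm that the heartbeat port is reachable at the TCP level from each host. The following is only a rough sketch using Python's standard `socket` module; `can_connect` and `report` are hypothetical helper names, and port 3002 is taken from the heartbeat lines in the log above.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def report(peers, port=3002):
    """Map each peer address to whether its heartbeat port accepted a connection."""
    return {host: can_connect(host, port) for host in peers}


# Example call (run it from each host in turn):
# print(report(["10.10.99.128", "10.10.99.129"]))
```

A successful connect only shows that something is listening and the path is open. It does not prove that the nodes accept each other's heartbeats.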
The firewall has been disabled on both machines, and the Docker daemon was restarted after firewalld was stopped.
[root@server-128 ~]# systemctl status firewalld
firewalld.service - firewalld - dynamic firewall daemon
Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled)
Active: inactive (dead)
The iptables --list output on 128 is:
[root@server-128 ~]# iptables --list
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain FORWARD (policy ACCEPT)
target prot opt source destination
ACCEPT tcp -- anywhere 172.17.0.2 tcp dpt:webcache
ACCEPT tcp -- anywhere 172.17.0.2 tcp dpt:webcache
ACCEPT all -- anywhere anywhere ctstate RELATED,ESTABLISHED
ACCEPT all -- anywhere anywhere
ACCEPT all -- anywhere anywhere
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
On 129 it is:
[root@Server-129 ~]# iptables --list
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain FORWARD (policy ACCEPT)
target prot opt source destination
ACCEPT tcp -- anywhere 172.17.0.3 tcp dpt:webcache
ACCEPT tcp -- anywhere 172.17.0.2 tcp dpt:28888
ACCEPT all -- anywhere anywhere ctstate RELATED,ESTABLISHED
ACCEPT all -- anywhere anywhere
ACCEPT all -- anywhere anywhere
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
The ifconfig output on 128 is:
docker0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 172.17.42.1 netmask 255.255.0.0 broadcast 0.0.0.0
inet6 fe80::5484:7aff:fefe:9799 prefixlen 64 scopeid 0x20<link>
ether 56:84:7a:fe:97:99 txqueuelen 0 (Ethernet)
RX packets 3445242 bytes 49508719758 (46.1 GiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 2309301 bytes 178943189 (170.6 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
em1: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
ether 90:b1:1c:15:99:26 txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
em2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 10.10.99.128 netmask 255.0.0.0 broadcast 10.255.255.255
inet6 fe80::92b1:1cff:fe15:9927 prefixlen 64 scopeid 0x20<link>
ether 90:b1:1c:15:99:27 txqueuelen 1000 (Ethernet)
RX packets 14029861 bytes 4491284566 (4.1 GiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 11214358 bytes 9265648327 (8.6 GiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 0 (Local Loopback)
RX packets 2989101 bytes 49678355442 (46.2 GiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 2989101 bytes 49678355442 (46.2 GiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
vethf7910cc: flags=67<UP,BROADCAST,RUNNING> mtu 1500
inet6 fe80::ecf1:6ff:fe96:6808 prefixlen 64 scopeid 0x20<link>
ether ee:f1:06:96:68:08 txqueuelen 1000 (Ethernet)
RX packets 3737 bytes 44649641 (42.5 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 2390 bytes 179217 (175.0 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
On 129 it is:
docker0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
inet 172.17.42.1 netmask 255.255.0.0 broadcast 0.0.0.0
inet6 fe80::5484:7aff:fefe:9799 prefixlen 64 scopeid 0x20<link>
ether 56:84:7a:fe:97:99 txqueuelen 0 (Ethernet)
RX packets 3311409 bytes 45980969525 (42.8 GiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 2271112 bytes 205991240 (196.4 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
em1: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
ether 90:b1:1c:13:1c:ae txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
em2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 10.10.99.129 netmask 255.0.0.0 broadcast 10.255.255.255
inet6 fe80::92b1:1cff:fe13:1caf prefixlen 64 scopeid 0x20<link>
ether 90:b1:1c:13:1c:af txqueuelen 1000 (Ethernet)
RX packets 19057394 bytes 10493478930 (9.7 GiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 12508815 bytes 9065484964 (8.4 GiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 0 (Local Loopback)
RX packets 3964215 bytes 46304781725 (43.1 GiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 3964215 bytes 46304781725 (43.1 GiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
The docker run command is:
docker run -tid --privileged --net=host -e HOST_IP=10.10.99.129 10.10.99.131:5000/dockercloud/aerospike-server
I tried the tip command, but it did not seem to help.
From the logs, it does seem that the heartbeat has a problem. However, I cannot figure out why this happens.
Could you provide your Dockerfile and your aerospike.conf file so I can try to replicate this? Also, could you try running it without the "-e" environment variables to rule out any bugs there?
Thanks
Lucien
Vincent,
One more change: on CentOS 7 you may need to specify the interface name. Please add it to your network service stanza:
service {
address any
port 3000
network-interface-name em2
access-address 10.10.99.128
}
--Lucien
Oh, it works like a charm after adding
network-interface-name em2
to the configuration. But why does this matter? Could the server detect the interface automatically by comparing against the access-address?
Anyway, thank you very much for the great help!
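As an aside on the "compare with the access-address" idea, here is a minimal sketch of how the interface name could be looked up from ifconfig output like the one posted earlier. This is not how the Aerospike server selects interfaces; `interface_for_address` is a hypothetical helper, and the sample text is a trimmed copy of the output from 10.10.99.128.

```python
import re


def interface_for_address(ifconfig_text, address):
    """Return the interface whose IPv4 'inet' address equals address, or None."""
    current = None
    for line in ifconfig_text.splitlines():
        # Interface header lines look like "em2: flags=4163<...> mtu 1500".
        header = re.match(r"^(\S+): flags=", line)
        if header:
            current = header.group(1)
            continue
        # IPv4 lines look like "inet 10.10.99.128 netmask ..."; "inet6" lines do not match.
        inet = re.match(r"^\s*inet (\d+\.\d+\.\d+\.\d+)\b", line)
        if inet and inet.group(1) == address:
            return current
    return None


# Trimmed copy of the ifconfig output from 10.10.99.128 shown above.
sample = """\
docker0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 172.17.42.1 netmask 255.255.0.0 broadcast 0.0.0.0
em1: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
ether 90:b1:1c:15:99:26 txqueuelen 1000 (Ethernet)
em2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 10.10.99.128 netmask 255.0.0.0 broadcast 10.255.255.255
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
"""

print(interface_for_address(sample, "10.10.99.128"))  # em2
```

On this host the lookup returns em2, which is the value that was set for network-interface-name.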
Happy to be of service. This is a bug that I'm going to follow up on.
best, Lucien
OK, got it. I'd like to run the test again once it is fixed.
We'll update once fixed.
best, Lucien
To allow access from outside, and to allow servers running in Docker on different hosts to communicate with each other, I set the access-address to the Docker host IP. However, the server failed to start. The config and logs are below:
The failed log is:
Nov 18 2014 08:38:09 GMT: INFO (as): (as.c::408) initializing services...
Nov 18 2014 08:38:09 GMT: INFO (drv_ssd): (drv_ssd.c::1666) ns test starting write worker threads
Nov 18 2014 08:38:09 GMT: INFO (tsvc): (thr_tsvc.c::998) shared queues: 4 queues with 4 threads each
Nov 18 2014 08:38:09 GMT: INFO (drv_ssd): (drv_ssd.c::811) ns test starting defrag threads
Nov 18 2014 08:38:09 GMT: INFO (hb): (hb.c::2434) heartbeat socket initialization
Nov 18 2014 08:38:09 GMT: INFO (hb): (hb.c::2448) initializing mesh heartbeat socket : 0.0.0.0:3002
Nov 18 2014 08:38:09 GMT: INFO (info): (thr_info.c::4940) static external network definition
Nov 18 2014 08:38:09 GMT: CRITICAL (info): (thr_info.c:info_interfaces_static_fn:4954) external address:10.10.99.128 is not matching with any of service addresses:172.17.0.7:3000
Nov 18 2014 08:38:09 GMT: WARNING (as): (signal.c::70) SIGABRT received, aborting Aerospike Community Edition build 3.3.21
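The CRITICAL line is the interesting one: the configured access-address (10.10.99.128) is compared against the address the service bound to inside the container (172.17.0.7:3000), and they differ. Below is a small sketch for extracting both sides from such a line. `address_mismatch` is a hypothetical helper, and the comma-separated handling of multiple service addresses is an assumption, since the log here shows only one.

```python
import re

# Pattern copied from the wording of the CRITICAL line in the log above.
CRIT_RE = re.compile(
    r"external address:(\S+) is not matching with any of service addresses:(\S+)"
)


def address_mismatch(log_line):
    """Return (external_address, [service_ips]) or None if the line does not match."""
    match = CRIT_RE.search(log_line)
    if not match:
        return None
    external = match.group(1)
    # Assumption: several service addresses would be comma-separated "ip:port" pairs.
    services = [item.rsplit(":", 1)[0] for item in match.group(2).split(",") if item]
    return external, services


line = (
    "Nov 18 2014 08:38:09 GMT: CRITICAL (info): "
    "(thr_info.c:info_interfaces_static_fn:4954) external address:10.10.99.128 "
    "is not matching with any of service addresses:172.17.0.7:3000"
)

print(address_mismatch(line))  # ('10.10.99.128', ['172.17.0.7'])
```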