tidwall / summitdb

In-memory NoSQL database with ACID transactions, Raft consensus, and Redis API

Can't join cluster: "peer already known" #23

Open didasy opened 6 years ago

didasy commented 6 years ago

I tried SummitDB in Docker Swarm. I first created a master service:

docker service create --name summitdb-master --network redis didasy/summitdb

Then I created a slave service:

docker service create --name summitdb-slave --network redis didasy/summitdb -join summitdb-master:7481

The slave service won't go live, and when I check the log in the container, it says:

1:M 16 Dec 13:50:08.292 * SummitDB 0.4.0
1:N 16 Dec 13:50:08.303 * Node at :7481 [Follower] entering Follower state (Leader: "")
1:N 16 Dec 13:50:08.305 # failed to join node at summitdb-master:7481: peer already known

And this is the log from the master:

1:M 16 Dec 13:40:30.375 * SummitDB 0.4.0
1:N 16 Dec 13:40:30.379 * Enable single node
1:N 16 Dec 13:40:30.385 * Node at :7481 [Follower] entering Follower state (Leader: "")
1:N 16 Dec 13:40:31.860 # Heartbeat timeout from "" reached, starting election
1:N 16 Dec 13:40:31.860 * Node at :7481 [Candidate] entering Candidate state
1:N 16 Dec 13:40:31.863 * Election won. Tally: 1
1:N 16 Dec 13:40:31.863 * Node at :7481 [Leader] entering Leader state
1:N 16 Dec 13:41:06.715 * Received add peer request from :7481
1:N 16 Dec 13:41:12.989 * Received add peer request from :7481
1:N 16 Dec 13:41:18.775 * Received add peer request from :7481
1:N 16 Dec 13:41:24.787 * Received add peer request from :7481
1:N 16 Dec 13:41:30.614 * Received add peer request from :7481
1:N 16 Dec 13:41:36.400 * Received add peer request from :7481
1:N 16 Dec 13:41:42.387 * Received add peer request from :7481
1:N 16 Dec 13:41:48.273 * Received add peer request from :7481
1:N 16 Dec 13:41:54.142 * Received add peer request from :7481
1:N 16 Dec 13:41:59.856 * Received add peer request from :7481
1:N 16 Dec 13:42:06.035 * Received add peer request from :7481
1:N 16 Dec 13:42:12.315 * Received add peer request from :7481
1:N 16 Dec 13:42:18.084 * Received add peer request from :7481
1:N 16 Dec 13:42:23.928 * Received add peer request from :7481
1:N 16 Dec 13:42:30.097 * Received add peer request from :7481
1:N 16 Dec 13:42:36.040 * Received add peer request from :7481
1:N 16 Dec 13:42:42.056 * Received add peer request from :7481
1:N 16 Dec 13:42:48.084 * Received add peer request from :7481
1:N 16 Dec 13:42:53.991 * Received add peer request from :7481
1:N 16 Dec 13:43:00.307 * Received add peer request from :7481
1:N 16 Dec 13:43:06.486 * Received add peer request from :7481
1:N 16 Dec 13:43:12.698 * Received add peer request from :7481
1:N 16 Dec 13:43:18.652 * Received add peer request from :7481
1:N 16 Dec 13:50:02.489 * Received add peer request from :7481
1:N 16 Dec 13:50:08.305 * Received add peer request from :7481
1:N 16 Dec 13:50:13.995 * Received add peer request from :7481
1:N 16 Dec 13:50:19.776 * Received add peer request from :7481
1:N 16 Dec 13:50:26.151 * Received add peer request from :7481
1:N 16 Dec 13:50:32.171 * Received add peer request from :7481
1:N 16 Dec 13:50:38.239 * Received add peer request from :7481
1:N 16 Dec 13:50:44.169 * Received add peer request from :7481
1:N 16 Dec 13:50:50.178 * Received add peer request from :7481
1:N 16 Dec 13:50:56.440 * Received add peer request from :7481
1:N 16 Dec 13:51:02.211 * Received add peer request from :7481
1:N 16 Dec 13:51:08.136 * Received add peer request from :7481
1:N 16 Dec 13:51:14.125 * Received add peer request from :7481
1:N 16 Dec 13:51:20.501 * Received add peer request from :7481
1:N 16 Dec 13:51:26.751 * Received add peer request from :7481
1:N 16 Dec 13:51:32.841 * Received add peer request from :7481
1:N 16 Dec 13:51:39.070 * Received add peer request from :7481
1:N 16 Dec 13:51:45.378 * Received add peer request from :7481
1:N 16 Dec 13:51:51.512 * Received add peer request from :7481
1:N 16 Dec 13:51:57.270 * Received add peer request from :7481
1:N 16 Dec 13:52:03.060 * Received add peer request from :7481
1:N 16 Dec 13:52:08.764 * Received add peer request from :7481
1:N 16 Dec 13:52:14.560 * Received add peer request from :7481
1:N 16 Dec 13:52:20.328 * Received add peer request from :7481
1:N 16 Dec 13:52:26.403 * Received add peer request from :7481
1:N 16 Dec 13:52:32.108 * Received add peer request from :7481
1:N 16 Dec 13:52:38.341 * Received add peer request from :7481
1:N 16 Dec 13:52:44.404 * Received add peer request from :7481
1:N 16 Dec 13:52:50.498 * Received add peer request from :7481
1:N 16 Dec 13:52:56.272 * Received add peer request from :7481
1:N 16 Dec 13:53:02.023 * Received add peer request from :7481
1:N 16 Dec 13:53:08.048 * Received add peer request from :7481
1:N 16 Dec 13:53:13.750 * Received add peer request from :7481
1:N 16 Dec 13:53:19.464 * Received add peer request from :7481
1:N 16 Dec 13:53:25.150 * Received add peer request from :7481
1:N 16 Dec 13:53:30.980 * Received add peer request from :7481
1:N 16 Dec 13:53:36.856 * Received add peer request from :7481
1:N 16 Dec 13:53:42.812 * Received add peer request from :7481
1:N 16 Dec 13:53:48.662 * Received add peer request from :7481
1:N 16 Dec 13:53:54.494 * Received add peer request from :7481
1:N 16 Dec 13:54:00.571 * Received add peer request from :7481
1:N 16 Dec 13:54:06.400 * Received add peer request from :7481
1:N 16 Dec 13:54:12.089 * Received add peer request from :7481

komuw commented 6 years ago

I'm also facing a similar error. I was using this docker-compose file: https://github.com/komuW/kshaka/blob/master/docker-compose.yml

Another common error is "leadership lost while committing log".

didasy commented 5 years ago

Now I understand the error.

By default, SummitDB uses "localhost:7481" if you do not supply the -host and -port flags. This address is passed to finn.Open, which uses net.ResolveTCPAddr to look up localhost on the server. In a Docker container, this returns ":7481".

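That explains why the slaves report "peer already known". Every node advertises itself as ":7481", so the master receives an add-peer request from ":7481", which matches its own address (compare "Node at :7481" and "Received add peer request from :7481" in the master log).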
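To illustrate the point, here is a minimal Go sketch of what resolving an address with an empty host looks like. It only uses the standard library's net.ResolveTCPAddr; the helper name advertised and the example IP 10.0.0.5 are made up for illustration, and this is not SummitDB's or finn's actual code.

```go
package main

import (
	"fmt"
	"net"
)

// advertised (hypothetical helper) resolves addr as a TCP address and
// returns its string form, i.e. what a node could end up reporting as
// its own address.
func advertised(addr string) (string, error) {
	tcp, err := net.ResolveTCPAddr("tcp", addr)
	if err != nil {
		return "", err
	}
	return tcp.String(), nil
}

func main() {
	// ":7481" has no host part, so the resolved address carries no IP.
	// "10.0.0.5:7481" is an assumed container IP, used only as a contrast.
	for _, addr := range []string{":7481", "10.0.0.5:7481"} {
		s, err := advertised(addr)
		if err != nil {
			fmt.Println("resolve error:", err)
			continue
		}
		fmt.Printf("%q resolves to %q\n", addr, s)
	}
}
```

An address without a host stays ":7481" after resolution, so two nodes that both advertise it cannot be told apart by address alone. An address with an explicit IP keeps that IP.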

I am still working out how to get around this.

One possible solution is to use ifconfig eth0 | grep "inet addr:" | cut -d : -f 2 | cut -d " " -f 1 to get the container's IP address, pass it as the -host argument, and set this up in the Dockerfile.