goraft / raft

UNMAINTAINED: A Go implementation of the Raft distributed consensus protocol.
MIT License

Add 'Raft in Practice' to README. #124

Closed benbjohnson closed 11 years ago

benbjohnson commented 11 years ago

Per an e-mail conversation with @pvo, I added a section to the documentation called Raft in Practice.

@pvo Can you tell me if this helps to answer your question? Is there anything I'm missing?

@ongardie @kellabyte @xiangli-cmu @philips Can you guys give me a technical review? It's not very long.

Here's the pretty printed version.

philips commented 11 years ago

@benbjohnson One style thing I have started doing in markdown files is one sentence per line. It works with every editor and makes reviewing diffs in git and on GitHub easier. Just a suggestion; no strong feelings either way.

philips commented 11 years ago

lgtm, super helpful.

benbjohnson commented 11 years ago

@philips I prefer that too, actually. I started using Zen Mode for editing right on GitHub and it's been nice, but it produces one long line. Also, I read that GFM adds line breaks for multiple lines, but I guess .md is just straight markdown (ref).

kellabyte commented 11 years ago

"we've found the maximum effective cluster size to be around 9 nodes. We typically suggest a 5 node cluster for performance reasons though."

Might be worthwhile giving some context as to what the setup configuration was. Are these 9 and 5 node clusters on the same LAN or across WAN? What was the ping like between them?

I know there are a lot of variables at play, but I think it's good to give some idea of the environment the suggestion is coming from.


benbjohnson commented 11 years ago

@kellabyte Good point. The overhead is more related to message processing time and not necessarily latency. I'll make it more descriptive.

philips commented 11 years ago

@benbjohnson Another thing to consider is even vs odd number of nodes. e.g. https://github.com/coreos/etcd/issues/149#issuecomment-23603009
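The even-vs-odd point comes down to majority-quorum arithmetic: adding one node to an odd-sized cluster raises the quorum size without raising the number of failures it can survive. A minimal sketch in Go (not part of this repo, just an illustration):

```go
package main

import "fmt"

// quorum is the majority needed to commit an entry or win an election.
func quorum(n int) int { return n/2 + 1 }

// faultTolerance is how many nodes can fail while a majority remains.
func faultTolerance(n int) int { return n - quorum(n) }

func main() {
	for n := 3; n <= 6; n++ {
		fmt.Printf("%d nodes: quorum %d, tolerates %d failures\n",
			n, quorum(n), faultTolerance(n))
	}
	// 3 and 4 nodes both tolerate 1 failure; 5 and 6 both tolerate 2.
	// The extra even node adds quorum cost without added tolerance.
}
```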

ongardie commented 11 years ago

Looks pretty good, but in cluster size, I don't think heartbeat chattiness is the biggest concern. Even if your heartbeats were 500 bytes, you had 101 servers, and your heartbeats went out every 50ms, that's still less than 1% of a gigabit link out from the leader.
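That back-of-the-envelope figure checks out; a quick sketch using the numbers from the comment above (500-byte heartbeats, 101 servers, 50ms interval are the stated assumptions):

```go
package main

import "fmt"

func main() {
	const (
		heartbeatBytes     = 500.0       // assumed heartbeat size
		followers          = 100.0       // 101 servers minus the leader
		heartbeatsPerSec   = 1000.0 / 50 // one heartbeat every 50ms
		gigabitBytesPerSec = 1e9 / 8     // 1 Gbit link in bytes/sec
	)
	outbound := heartbeatBytes * followers * heartbeatsPerSec
	fmt.Printf("leader sends %.0f bytes/s = %.2f%% of a gigabit link\n",
		outbound, 100*outbound/gigabitBytesPerSec)
	// prints: leader sends 1000000 bytes/s = 0.80% of a gigabit link
}
```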

I think the main reason to make a cluster larger is to tolerate more server failures before a human has to get involved (your Concurrent Node Failures). For this reason, I can't imagine needing a cluster bigger than 9 servers, allowing 4 of them to fail independently before any are replaced.

I think the main reasons to keep a cluster small are (a) cost, (b) latency/bandwidth for new entries coming into the leader, since the leader has to replicate each one out to a majority of the cluster, and (c) if you really do run a ton of servers, it's more likely that candidates will interfere with each other during elections, so you might need to increase your election timeouts (baseline and range).

pvo commented 11 years ago

Thanks all. Super helpful.

benbjohnson commented 11 years ago

@ongardie Good point. I updated the README to be mainly around node failure tolerance. I wanted to keep it simple so I left out cost, latency, & election conflicts. Let me know what you think.

ongardie commented 11 years ago

shipit

xiang90 commented 11 years ago

@ongardie For the cluster size, in my reply I was comparing 8 and 9 rather than 5 and 9. Also, with a lot of nodes, we would probably need to cache the serialization result, which we haven't done yet.