raintank / worldping-api

Worldping Backend Service

Add cluster leader election #6

Open woodsaj opened 8 years ago

woodsaj commented 8 years ago

Issue by woodsaj Friday Jun 19, 2015 at 20:22 GMT Originally opened as https://github.com/raintank/grafana/issues/228


There are a few features within the code base that should only run on one node at a time. This requires the nodes to coordinate that role amongst themselves.

Raft seems to be the new hotness when it comes to these things, so we should use that. CoreOS's etcd package has an implementation of Raft. https://godoc.org/github.com/coreos/etcd/raft
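For illustration, here is a minimal sketch of what leader election on top of etcd could look like. It uses etcd's client-side election API (`go.etcd.io/etcd/client/v3/concurrency`) rather than embedding the raw raft package linked above; the endpoint, key prefix and node ID are placeholder assumptions, not worldping configuration.

```go
// A minimal leader-election sketch on top of etcd, assuming an etcd cluster
// reachable at 127.0.0.1:2379. Key prefix and node ID are illustrative.
package main

import (
	"context"
	"log"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
	"go.etcd.io/etcd/client/v3/concurrency"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"127.0.0.1:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	// The session keeps a lease alive; if this node dies, the lease expires
	// and another node can win the election.
	sess, err := concurrency.NewSession(cli, concurrency.WithTTL(10))
	if err != nil {
		log.Fatal(err)
	}
	defer sess.Close()

	election := concurrency.NewElection(sess, "/worldping/leader")

	// Campaign blocks until this node becomes leader.
	if err := election.Campaign(context.Background(), "node-1"); err != nil {
		log.Fatal(err)
	}
	log.Println("elected leader; starting leader-only work")

	// Run leader-only work until the session (and therefore leadership) is lost.
	<-sess.Done()
	log.Println("lost leadership; stopping leader-only work")
}
```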

woodsaj commented 8 years ago

Comment by Dieterbe Tuesday Jun 23, 2015 at 00:06 GMT


1) Mind sharing a little bit about what those features are?

2) Can we get away with transactions on the database?

3) For alerting, I noticed you mentioned somewhere running only one job producer, but I thought we decided we actually wanted to run multiple alert job producers for HA: if jobs get consistently routed (by key), the consumers will drop jobs they've already processed anyway. This is a fairly simplistic method of HA (see the dedup sketch below). If you're thinking of running only one producer, and it dies and restarts somewhere else, then we also need to keep track of the last timestamp at which jobs were scheduled; if it takes several seconds to restart a producer, the new producer should also process the missed ticks from those last few seconds. (I actually like this approach, it seems more efficient, but it also requires more operations/automation. Perhaps we should postpone this improvement until we're at a point where multiple producers bring too much overhead?)
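To make the consumer-side dedup from point 3 concrete, here is a hypothetical sketch; the types and names are made up for the example and are not taken from the worldping code.

```go
// Hypothetical sketch of consumer-side dedup when multiple producers emit the
// same alert jobs: remember the last executed tick per job key and drop
// anything at or before it.
package main

import "sync"

type alertJob struct {
	Key string // e.g. check ID; producers route consistently on this
	Ts  int64  // scheduled tick (unix seconds)
}

type deduper struct {
	mu      sync.Mutex
	lastRun map[string]int64 // job key -> last executed tick
}

func newDeduper() *deduper {
	return &deduper{lastRun: make(map[string]int64)}
}

// shouldRun reports whether this job is new work; duplicates from other
// producers arrive with the same (Key, Ts) and are dropped.
func (d *deduper) shouldRun(j alertJob) bool {
	d.mu.Lock()
	defer d.mu.Unlock()
	if j.Ts <= d.lastRun[j.Key] {
		return false
	}
	d.lastRun[j.Key] = j.Ts
	return true
}
```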

woodsaj commented 8 years ago

Comment by woodsaj Tuesday Jun 23, 2015 at 15:10 GMT


This is a long-term goal to meet future scalability needs.

1) The alerting scheduler, and also collector session management.

2) Yes, and that is likely what will be deployed first, but it does not scale, so long term we need a better solution (see the lease sketch at the end of this comment).

3) Also true, but as with 2, it does not scale. If we are running 10 instances of Grafana, we don't want all 10 pushing the same messages into the queue.
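To make point 2 concrete, here is a rough sketch of a database-backed leader lease built on a single-row atomic UPDATE. The `leader_lease` table, node ID and MySQL-style placeholders are assumptions for illustration only, not part of the worldping schema.

```go
// Sketch of the "transactions on the database" approach: a leader lease held
// in one row. Assumed schema (illustrative only):
//   CREATE TABLE leader_lease (id INT PRIMARY KEY, holder VARCHAR(64), expires_at DATETIME);
//   INSERT INTO leader_lease VALUES (1, '', NOW());
package main

import (
	"database/sql"
	"time"
)

// tryAcquireLease atomically takes or renews the lease. Once a node holds an
// unexpired lease, no other node's UPDATE can match, so callers poll this
// periodically and only do leader-only work while it returns true.
func tryAcquireLease(db *sql.DB, nodeID string, ttl time.Duration) (bool, error) {
	now := time.Now().UTC()
	res, err := db.Exec(
		`UPDATE leader_lease
		    SET holder = ?, expires_at = ?
		  WHERE id = 1
		    AND (holder = ? OR expires_at < ?)`,
		nodeID, now.Add(ttl), nodeID, now,
	)
	if err != nil {
		return false, err
	}
	n, err := res.RowsAffected()
	if err != nil {
		return false, err
	}
	return n == 1, nil
}
```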

woodsaj commented 8 years ago

Comment by Dieterbe Tuesday Jun 23, 2015 at 19:59 GMT


> This is a long-term goal to meet future scalability needs.

I believe @nopzor1200 described Raft leader election as a high-prio item, a must-have before we can launch.

I agree with your reasoning @woodsaj, but we should make sure we're on the same page regarding the urgency and timeline of this. Also, to make this viable I will need to make the alerting scheduler stateful, keeping track of the last successfully processed timestamp (perhaps this could go into the raft log, into etcd, or into the database; will we have an HA transactional database?).
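As a rough illustration of that stateful scheduler, here is a hypothetical checkpoint-and-replay sketch. The `loadCheckpoint`/`saveCheckpoint`/`schedule` helpers stand in for whatever store and queue end up being used; none of these names exist in the codebase.

```go
// runScheduler is a hypothetical sketch of a stateful alert-job producer: it
// checkpoints the last scheduled tick so a newly elected leader can replay
// the ticks missed while the previous leader was down.
package main

import "time"

func runScheduler(
	interval time.Duration,
	loadCheckpoint func() (time.Time, error),
	saveCheckpoint func(time.Time) error,
	schedule func(time.Time),
) error {
	last, err := loadCheckpoint()
	if err != nil {
		return err
	}

	// Catch up on ticks missed between the old producer dying and this one starting.
	for t := last.Add(interval); !t.After(time.Now()); t = t.Add(interval) {
		schedule(t)
		last = t
	}
	if err := saveCheckpoint(last); err != nil {
		return err
	}

	// Normal operation: schedule each new tick, then checkpoint it.
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for now := range ticker.C {
		tick := now.Truncate(interval)
		if !tick.After(last) {
			continue
		}
		schedule(tick)
		last = tick
		if err := saveCheckpoint(last); err != nil {
			return err
		}
	}
	return nil
}
```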

woodsaj commented 8 years ago

Comment by Dieterbe Wednesday Jul 08, 2015 at 02:30 GMT


Just saw a Docker talk at a pre-GopherCon party about libkv, which provides a nice abstraction for leader election (it supports etcd, Consul and ZooKeeper).
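For comparison with the etcd sketch above, here is what this could look like with libkv, treating a TTL-backed distributed lock as the election. The backend choice, key name, node ID and TTL are illustrative assumptions.

```go
// Leader election sketch using docker/libkv: whoever holds the lock is the
// leader. Backed here by etcd; Consul or ZooKeeper would work the same way.
package main

import (
	"log"
	"time"

	"github.com/docker/libkv"
	"github.com/docker/libkv/store"
	"github.com/docker/libkv/store/etcd"
)

func main() {
	etcd.Register() // register the etcd backend with libkv

	kv, err := libkv.NewStore(store.ETCD, []string{"127.0.0.1:2379"},
		&store.Config{ConnectionTimeout: 10 * time.Second})
	if err != nil {
		log.Fatal(err)
	}

	lock, err := kv.NewLock("worldping/leader", &store.LockOptions{
		Value: []byte("node-1"),
		TTL:   15 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}

	// Lock blocks until this node holds the key; the returned channel closes
	// if the lock is later lost (e.g. the TTL expires).
	lost, err := lock.Lock(nil)
	if err != nil {
		log.Fatal(err)
	}
	log.Println("acquired leadership")

	<-lost
	log.Println("lost leadership")
}
```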

woodsaj commented 8 years ago

Comment by nopzor1200 Saturday Jul 18, 2015 at 05:23 GMT


I originally misunderstood whether this was a high-prio or low-prio item. @woodsaj, can you confirm that this (Raft or the like) is not something we need to worry about for now?

woodsaj commented 8 years ago

Comment by woodsaj Sunday Jul 19, 2015 at 13:24 GMT


This is low prio.

woodsaj commented 8 years ago

Comment by Dieterbe Friday Jul 31, 2015 at 15:54 GMT


(Interestingly, this ticket was in "to do" in Codetree. When I moved it to backlog, it removed the backlog milestone; I guess that's because it doesn't use milestones for the backlog.)