Closed woodsaj closed 8 years ago
Comment by Dieterbe Tuesday Jun 23, 2015 at 00:06 GMT
1) Mind sharing a little bit about what those features are?
2) Can we get away with transactions on the database?
3) For alerting, I noticed you mentioned somewhere running only 1 job producer, but I thought we decided we actually wanted to run multiple alert job producers for HA: if jobs get consistently routed (by key), the consumers will drop jobs they've already processed anyway. This is a fairly simplistic method of HA.
If you're thinking of running only 1 producer, and it dies and restarts somewhere else, then we also need to keep track of the last timestamp at which jobs were scheduled. If it takes several seconds to restart a producer, the new producer should also process the ticks missed during those seconds. (I actually like this approach; it seems more efficient, but it also requires more operations/automation. Perhaps we should postpone this improvement until we're at a point where multiple producers bring too much overhead?)
Issue by woodsaj Friday Jun 19, 2015 at 20:22 GMT Originally opened as https://github.com/raintank/grafana/issues/228
There are a few features within the code base that should only be run from one node at a time. This requires having the nodes co-ordinate this role amongst themselves.
Raft seems to be the new hotness when it comes to these things, so we should use that. CoreOS's etcd package has an implementation of Raft: https://godoc.org/github.com/coreos/etcd/raft
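As a rough illustration of the "only run from one node at a time" requirement, the sketch below gates a task on leadership. It deliberately does not use the etcd raft package: `LeaderChecker`, `staticLeader`, and `runIfLeader` are hypothetical names, and a real implementation would back `IsLeader` with the node's Raft state.

```go
package main

import "fmt"

// LeaderChecker is a hypothetical interface. A Raft-backed implementation
// would report whether this node currently believes it is the leader.
type LeaderChecker interface {
	IsLeader() bool
}

// staticLeader is a stand-in used only for this sketch.
type staticLeader bool

func (s staticLeader) IsLeader() bool { return bool(s) }

// runIfLeader runs task only when lc reports leadership.
// It returns true if the task was run.
func runIfLeader(lc LeaderChecker, task func()) bool {
	if !lc.IsLeader() {
		return false
	}
	task()
	return true
}

func main() {
	// Three simulated nodes, exactly one of which is the leader.
	nodes := []staticLeader{true, false, false}

	runs := 0
	for _, n := range nodes {
		runIfLeader(n, func() { runs++ })
	}
	fmt.Println("task runs:", runs)
}
```

A deposed Raft leader can briefly still believe it is the leader, so a gate like this narrows the window for duplicate work rather than eliminating it. Features that must never run twice would still need an idempotency check or a database transaction.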