mradamczyk closed 10 years ago
AFAIK all data has to go through Raft anyway, so you won't be saving much on that.
Just put it behind HAProxy (we use the "leastconn" policy with persistent connections).
@XANi: What do you mean by going through Raft? Does all data go to the Raft leader? If so, why? I would expect that at most the timestamp + name go through Raft, to choose (by consensus) the shard space where the data should be written.
Could anyone write a post, an article, or a section of the documentation on how InfluxDB uses the Raft algorithm?
@XANi: So load balancing with HAProxy helped you handle a large volume of writes?
@mradamczyk I might've misread https://github.com/influxdb/influxdb/pull/689 :)
It helped me in 0.7 because 0.7 was stalling on WAL writes: above a certain write rate, InfluxDB switched to writing to the WAL only. It was badly tuned for non-SSDs, so even while the disks were 80% idle it still decided LevelDB was lagging too much. I haven't tested whether it is necessary in 0.8.
What it does do is give clients a single IP to access (a keepalived VIP shared between the machines) and automatically send requests only to working nodes (it stops sending to nodes that are down, if you configure your health check correctly).
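For readers unfamiliar with the idea, here is a rough Python sketch of what a least-connections policy combined with health checks amounts to. It is not HAProxy's implementation, only an illustration; the `Backend` class, the node names, and `pick_leastconn` are all made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str             # hypothetical node label, not a real host
    active: int = 0       # connections currently open to this node
    healthy: bool = True  # result of the most recent health check


def pick_leastconn(backends):
    """Return the healthy backend with the fewest active connections."""
    candidates = [b for b in backends if b.healthy]
    if not candidates:
        raise RuntimeError("no healthy backends available")
    # min() returns the first minimum, so ties go to the earliest-listed node.
    return min(candidates, key=lambda b: b.active)


nodes = [
    Backend("influx-a", active=3),
    Backend("influx-b", active=1),
    Backend("influx-c", active=0, healthy=False),  # down, so never chosen
]
chosen = pick_leastconn(nodes)
chosen.active += 1  # the new connection now counts against that node
```

Here `influx-c` has the fewest connections but is skipped because its health check failed, so the request goes to `influx-b`.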
We figured out that one node cannot handle all our writes. We are thinking about writing to all nodes in a cycle (round-robin), but it would be even better to write points with a particular namespace to the node where the appropriate shard space exists, just to avoid resending data inside the cluster.
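A minimal sketch of the client-side round-robin idea, assuming the client already knows the node addresses. The `RoundRobinWriter` class, the addresses, and the `send` callback are hypothetical placeholders; the callback stands in for whatever actually performs the write to a node.

```python
from itertools import cycle


class RoundRobinWriter:
    """Rotate write batches across a fixed list of nodes."""

    def __init__(self, nodes, send):
        if not nodes:
            raise ValueError("at least one node is required")
        self._nodes = cycle(nodes)  # endless iterator over the node list
        self._send = send           # hypothetical callable: (node, batch) -> None

    def write(self, batch):
        node = next(self._nodes)
        self._send(node, batch)
        return node


# Demo with a fake sender that only records what would have been sent.
sent = []
writer = RoundRobinWriter(
    ["node1:8086", "node2:8086", "node3:8086"],  # placeholder addresses
    lambda node, batch: sent.append((node, batch)),
)
order = [writer.write([f"point-{i}"]) for i in range(4)]
```

The fourth batch wraps around to the first node again. This spreads load evenly but ignores where the shard space lives, so the cluster may still forward data internally.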
Is it possible to ask InfluxDB which node given data should be sent to? Then we could cache this information and reduce some network traffic.
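To make the caching idea concrete: nothing in this thread confirms that InfluxDB exposes such a lookup, so the sketch below takes it as a purely hypothetical `lookup` callable and only shows the client-side cache around it. Keying by series name alone is also an assumption; if placement depends on the timestamp as well, the key would need to include it.

```python
class ShardNodeCache:
    """Remember which node a series maps to, to avoid repeated lookups."""

    def __init__(self, lookup):
        self._lookup = lookup  # hypothetical callable: series name -> node address
        self._cache = {}

    def node_for(self, series):
        # Only call the (assumed) lookup on a cache miss.
        if series not in self._cache:
            self._cache[series] = self._lookup(series)
        return self._cache[series]

    def invalidate(self, series=None):
        """Drop one entry, or everything, e.g. after the cluster layout changes."""
        if series is None:
            self._cache.clear()
        else:
            self._cache.pop(series, None)


# Demo with a fake lookup that records how often it is called.
calls = []

def fake_lookup(series):
    calls.append(series)
    return "node1:8086" if series.startswith("cpu") else "node2:8086"

cache = ShardNodeCache(fake_lookup)
first = cache.node_for("cpu.load")
second = cache.node_for("cpu.load")  # served from the cache, no second lookup
other = cache.node_for("mem.used")
```

The cache would have to be invalidated whenever shards move or nodes change, otherwise writes go to a stale target.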