Open ecsumed opened 5 years ago
There is #19 as well. I don't think this is what you are hitting, but might be useful for context and patches. I used to spend a lot of time with Graphite, less so now. (I usually help people migrate off of it instead!)
The algorithm that rebalance uses is (or at least was meant to be) a direct, bug-for-bug port of what whisper-fill does. That algorithm drops data points in certain boundary cases, and the Go version I wrote duplicates that behavior. I believe this is what you are hitting with this example. You might be interested in fill_test.go, which has some tests and demonstrations of exactly this. Run it with go test in the fill/ directory.
In a failover scenario, where metrics meant for a specific node end up on multiple other nodes, rebalance tends to lose metrics, seemingly at random. This only happens when the metric ends up on multiple other nodes rather than on a single server.
daemon commands:
Now suppose node 2 goes down for 5 minutes. Here's the data for a specific metric:
Here's another example:
rebalance command:
bucky rebalance -f
bucky version: 0.4.1
Expected results: Points should not be lost
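To make the expectation concrete, here is a minimal sketch in Go of a gap-filling merge across several source nodes. It is not bucky's or whisper-fill's actual code: the Point type and the mergeFill function are hypothetical names made up for illustration, and the sketch ignores Whisper archives and retention entirely. It only shows the property being asked for, namely that every timestamp present on any node survives the merge.

```go
package main

import "sort"

// Point is a single (timestamp, value) sample.
// Hypothetical type for illustration; not part of bucky.
type Point struct {
	Timestamp int64
	Value     float64
}

// mergeFill returns the points in dst plus every point from the given
// sources whose timestamp is not already present. Points already in dst
// are never overwritten, and when two sources share a timestamp the
// earlier source wins. The result is sorted by timestamp.
// Hypothetical helper; an assumption about the desired behavior only.
func mergeFill(dst []Point, sources ...[]Point) []Point {
	seen := make(map[int64]bool, len(dst))
	merged := append([]Point(nil), dst...) // copy so dst is left untouched
	for _, p := range dst {
		seen[p.Timestamp] = true
	}
	for _, src := range sources {
		for _, p := range src {
			if !seen[p.Timestamp] {
				seen[p.Timestamp] = true
				merged = append(merged, p)
			}
		}
	}
	sort.Slice(merged, func(i, j int) bool {
		return merged[i].Timestamp < merged[j].Timestamp
	})
	return merged
}
```

In the failover case described above, dst would be the data on the node that owns the metric, and the sources would be the copies that landed on the other nodes while it was down. Calling mergeFill(dst, nodeA, nodeB) with made-up slices should return the union of all three, with no timestamp dropped.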