Open sfc-gh-akrishnan opened 1 year ago
I don't think we'll be making it configurable for now. Maybe once #1957 is merged we could consider it, but I'm not sure it's worth exposing those knobs yet. Do you have a use case for configuring this?
When I wrote that test, it was a bit flaky so I increased the delta for us to revisit these numbers in the future. If you want to mess around with the values / distribution / tests go for it :) I found that these values have resulted in a pretty even distribution in my clusters.
I don't have a strong use case. I wanted to experiment with the configuration to see what makes the most sense in our cluster.
On the second item, I actually see quite a bit of variance. With 214 targets to scrape and 3 collectors, the expected average load is ~71 targets per collector. With a 10% load imbalance allowed, I expected each collector's count to fall in the range [64, 78]. But what I observed was different:
{'otel-collector-1': 77, 'otel-collector-2': 80, 'otel-collector-0': 57}
So I started reading the test case, and found that it allows for 50% variance rather than 10%. I'd like to understand what I'm missing. With more collectors, I see the variance shrink. Do you have any insights from experience to share here? Please also let me know if there is a flaw in my understanding.
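For anyone who wants to reproduce the arithmetic in the comment above, here is a minimal sketch. It assumes the 10% allowance applies symmetrically around the mean, which is how the range [64, 78] was derived; the variable names are only illustrative.

```python
# Observed targets per collector, copied from the comment above.
observed = {"otel-collector-1": 77, "otel-collector-2": 80, "otel-collector-0": 57}

tolerance = 0.10  # assumed symmetric 10% allowance around the mean

total = sum(observed.values())
mean = total / len(observed)
low, high = mean * (1 - tolerance), mean * (1 + tolerance)

# Relative deviation of each collector from the mean load.
deviation = {name: (count - mean) / mean for name, count in observed.items()}

# Collectors whose load falls outside the expected range.
outside = {name: count for name, count in observed.items() if not low <= count <= high}

print(f"total={total}, mean={mean:.1f}, expected range=[{low:.1f}, {high:.1f}]")
for name, dev in deviation.items():
    print(f"{name}: {observed[name]} targets ({dev:+.1%})")
print("outside range:", outside)
```

With these numbers, otel-collector-2 sits about 12% above the mean and otel-collector-0 about 20% below it, so two of the three collectors fall outside the range.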
Honestly, I'm not sure. We're using a library to manage the distribution, and I would probably check there for more information.
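One possible explanation, offered as a guess rather than a confirmed answer: as far as I can tell from reading github.com/buraksezer/consistent, the load factor bounds how many partitions each member owns, not how many keys land on each member. If that reading is right, target counts can still vary with how the target names hash into partitions. The sketch below is a simplified model and not the library's algorithm: the partition count, the round-robin placement, the MD5 hash, and the `target-N` names are all assumptions chosen for illustration.

```python
import hashlib
import math
from collections import Counter

PARTITIONS = 1061  # hypothetical partition count, for illustration only
COLLECTORS = ["otel-collector-0", "otel-collector-1", "otel-collector-2"]
TARGETS = 214

# Step 1: spread partitions over collectors as evenly as possible.
# Round-robin stands in for the library's bounded-load partition placement.
owner = {p: COLLECTORS[p % len(COLLECTORS)] for p in range(PARTITIONS)}
partitions_per_collector = Counter(owner.values())


def partition_of(key: str) -> int:
    """Hash a key to a partition ID (MD5 is an arbitrary choice here)."""
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % PARTITIONS


# Step 2: hash each (made-up) target name to a partition, then to its owner.
targets_per_collector = Counter(
    owner[partition_of(f"target-{i}")] for i in range(TARGETS)
)

# Step 3: what uniform random placement would predict for one collector.
mean = TARGETS / len(COLLECTORS)
p = 1 / len(COLLECTORS)
sigma = math.sqrt(TARGETS * p * (1 - p))  # binomial standard deviation

print("partitions:", dict(partitions_per_collector))
print("targets:   ", dict(targets_per_collector))
print(f"mean={mean:.1f}, sigma={sigma:.1f} ({sigma / mean:.1%} of mean)")
```

Under this model the partitions are almost perfectly balanced, yet the standard deviation of targets per collector is about 6.9, or roughly 10% of the mean. A 10% deviation would then be an ordinary outcome rather than a worst case, which might be why the test uses a wider delta.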
The consistent_hashing.go module in https://github.com/open-telemetry/opentelemetry-operator/blob/main/cmd/otel-allocator/allocation/consistent_hashing.go is using the consistent hash implementation from github.com/buraksezer/consistent.
The load value is initialized to 1.1.
Q1) Are there any plans to make these values configurable in the near future? Or is there a knob already available?
Q2) Following up: as I understand it from reading blog posts, with the Load value set to 1.1 the most loaded member should, in the worst case, deviate from the average load by only 10%. If that is the case, the test case
TestRelativelyEvenDistribution
at https://github.com/open-telemetry/opentelemetry-operator/blob/main/cmd/otel-allocator/allocation/consistent_hashing_test.go checks whether the load is within 50% of the average, not 10%. What am I missing? I appreciate the maintainers' patience in helping out with an answer : )
Thanks in advance