Closed mwtzzz-zz closed 7 years ago
all three metrics you sent me, end up on the same hash position (4379), and more annoyingly, that hash position is occupied by two of your servers, f and g. Now you mention k and i, so that's slightly odd, but it could very well be that carbon-c-relay is choosing the last matching entry, whereas the bucky implementation picks the first. This situation is exactly https://github.com/graphite-project/carbon/commit/024f9e67ca47619438951c59154c0dec0b0518c7, which I chose NOT to implement in carbon-c-relay because that would make the input order of servers define the outcome of the ring positions.
Likely reason is that carbon-c-relay nowadays uses a binary-search, which means it approaches the duplicates from the right, instead of the left as the original bisect-left did.
Python implements this:
if lo < 0:
raise ValueError('lo must be non-negative')
if hi is None:
hi = len(a)
while lo < hi:
mid = (lo+hi)//2
if a[mid] < x: lo = mid+1
else: hi = mid
return lo
IoW if carbon-c-relay uses a binary search, it does it wrong, because it should select the leftmost matching key.
Is there anything I should do on my end to correct this?
Thanks Fabian, is there anything I should do on my end to correct it?
On Mon, Oct 9, 2017 at 11:39 AM, Fabian Groffen notifications@github.com wrote:
Python implements this:
if lo < 0: raise ValueError('lo must be non-negative') if hi is None: hi = len(a) while lo < hi: mid = (lo+hi)//2 if a[mid] < x: lo = mid+1 else: hi = mid
return lo
IoW if carbon-c-relay uses a binary search, it does it wrong, because it should select the leftmost matching key.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/jjneely/buckytools/issues/17#issuecomment-335250705, or mute the thread https://github.com/notifications/unsubscribe-auth/AGfAIdHvtqSZDOyOGfCkMc4F8NgCvgDsks5sqmhdgaJpZM4OEzjc .
Michael Martinez http://www.michael--martinez.com
It seems to me both bucky and carbon-c-relay implement bisectLeft wrongly. The focus for me is now to align at least bucky and carbon-c-relay (which they are not at the moment). I'm thinking about how to align the algorithm in carbon-c-relay with the bucky implementation (which equals the old carbon-c-relay implementation iirc) as I suspect this will result in minimal (perhaps zero) changes in routing at least for some of my boundary tests. Yesterday I got stuck in trying to understand why my algorithm is so extremely complex compared to the simplicity the python folks came up with. Their implementation is a downright disaster because a lot of metrics will change destination. Sidenote, this change isn't that bad if we separate carbon_ch from fnv1a. carbon-c-relay is "authorative" for fnv1a_ch since it introduced it. For carbon_ch it seems we're just doing it wrong. But I'd love someone to tell me right or wrong here.
Thanks for looking at it. On my end, currently there's about 800 metrics that bucky has misplaced. That's 800 out of 150 million, so a very low percentage. As you mentioned, if any changes are made to relay it would be great if they result in minimal changes in routing.
Now the only thing necessary is that bucky ensures that servers are sorted/ordered in the same way carbon-c-relay does, then the output must be the same.
@grobian do you want me to test that commit?
Yes please, it should result in those 800 metrics being sent to the nodes bucky wants them to be.
Is it going to preserve the locations of the other metrics?
If you want to be sure, try and build a list of metric names (can get it off disk with find, replace all / with . and strip .wsp) then run it through carbon-c-relay -t and ensure all of them return the host you grabbed the conf from. Put differently, if you do this for your current running version and the latest HEAD, you should find running diff -u old new is rather small, where the new points to the box, and old points to another.
The developers at carbon-c-relay mentioned that I could use this to rebalance a fnv1a_ch hashring. But when i run buckyd, I get the following message:
[root@ec2-xxx radar123 bin]$ setuidgid uuu ./buckyd -node ec2-xxx.compute-1.amazonaws.com -hash fnv1a_ch 2017/06/25 22:08:54 Invalide hash type. Supported types: [carbon jump_fnv1a]
Does buckytools support this type of hash? If not, do you know of how I can rebalance my cluster upon adding a new cache host?