jjneely / buckytools

Go implementation of useful tools for dealing with Graphite's Whisper DBs and Carbon hashing
Other
87 stars 21 forks source link

Can I use buckytools to rebalance a fnv1a_ch cluster? #17

Closed mwtzzz-zz closed 7 years ago

mwtzzz-zz commented 7 years ago

The developers at carbon-c-relay mentioned that I could use this to rebalance a fnv1a_ch hashring. But when i run buckyd, I get the following message: [root@ec2-xxx radar123 bin]$ setuidgid uuu ./buckyd -node ec2-xxx.compute-1.amazonaws.com -hash fnv1a_ch 2017/06/25 22:08:54 Invalide hash type. Supported types: [carbon jump_fnv1a]

Does buckytools support this type of hash? If not, do you know of how I can rebalance my cluster upon adding a new cache host?

grobian commented 6 years ago

all three metrics you sent me, end up on the same hash position (4379), and more annoyingly, that hash position is occupied by two of your servers, f and g. Now you mention k and i, so that's slightly odd, but it could very well be that carbon-c-relay is choosing the last matching entry, whereas the bucky implementation picks the first. This situation is exactly https://github.com/graphite-project/carbon/commit/024f9e67ca47619438951c59154c0dec0b0518c7, which I chose NOT to implement in carbon-c-relay because that would make the input order of servers define the outcome of the ring positions.

grobian commented 6 years ago

Likely reason is that carbon-c-relay nowadays uses a binary-search, which means it approaches the duplicates from the right, instead of the left as the original bisect-left did.

grobian commented 6 years ago

Python implements this:

    if lo < 0:
        raise ValueError('lo must be non-negative')
    if hi is None:
        hi = len(a)
    while lo < hi:
        mid = (lo+hi)//2
        if a[mid] < x: lo = mid+1
        else: hi = mid
    return lo

IoW if carbon-c-relay uses a binary search, it does it wrong, because it should select the leftmost matching key.

mwtzzz-zz commented 6 years ago

Is there anything I should do on my end to correct this?

mwtzzz-zz commented 6 years ago

Thanks Fabian, is there anything I should do on my end to correct it?

On Mon, Oct 9, 2017 at 11:39 AM, Fabian Groffen notifications@github.com wrote:

Python implements this:

if lo < 0:
    raise ValueError('lo must be non-negative')
if hi is None:
    hi = len(a)
while lo < hi:
    mid = (lo+hi)//2
    if a[mid] < x: lo = mid+1
    else: hi = mid

return lo

IoW if carbon-c-relay uses a binary search, it does it wrong, because it should select the leftmost matching key.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/jjneely/buckytools/issues/17#issuecomment-335250705, or mute the thread https://github.com/notifications/unsubscribe-auth/AGfAIdHvtqSZDOyOGfCkMc4F8NgCvgDsks5sqmhdgaJpZM4OEzjc .

--

Michael Martinez http://www.michael--martinez.com

grobian commented 6 years ago

It seems to me both bucky and carbon-c-relay implement bisectLeft wrongly. The focus for me is now to align at least bucky and carbon-c-relay (which they are not at the moment). I'm thinking about how to align the algorithm in carbon-c-relay with the bucky implementation (which equals the old carbon-c-relay implementation iirc) as I suspect this will result in minimal (perhaps zero) changes in routing at least for some of my boundary tests. Yesterday I got stuck in trying to understand why my algorithm is so extremely complex compared to the simplicity the python folks came up with. Their implementation is a downright disaster because a lot of metrics will change destination. Sidenote, this change isn't that bad if we separate carbon_ch from fnv1a. carbon-c-relay is "authorative" for fnv1a_ch since it introduced it. For carbon_ch it seems we're just doing it wrong. But I'd love someone to tell me right or wrong here.

mwtzzz-zz commented 6 years ago

Thanks for looking at it. On my end, currently there's about 800 metrics that bucky has misplaced. That's 800 out of 150 million, so a very low percentage. As you mentioned, if any changes are made to relay it would be great if they result in minimal changes in routing.

grobian commented 6 years ago

Now the only thing necessary is that bucky ensures that servers are sorted/ordered in the same way carbon-c-relay does, then the output must be the same.

mwtzzz-zz commented 6 years ago

@grobian do you want me to test that commit?

grobian commented 6 years ago

Yes please, it should result in those 800 metrics being sent to the nodes bucky wants them to be.

mwtzzz-zz commented 6 years ago

Is it going to preserve the locations of the other metrics?

grobian commented 6 years ago

If you want to be sure, try and build a list of metric names (can get it off disk with find, replace all / with . and strip .wsp) then run it through carbon-c-relay -t and ensure all of them return the host you grabbed the conf from. Put differently, if you do this for your current running version and the latest HEAD, you should find running diff -u old new is rather small, where the new points to the box, and old points to another.