buraksezer / consistent

Consistent hashing with bounded loads in Golang
https://research.googleblog.com/2017/04/consistent-hashing-with-bounded-loads.html
MIT License
693 stars 69 forks source link

A corner case will cause panic. #13

Open ShiKaiWi opened 4 years ago

ShiKaiWi commented 4 years ago

This demo will panic:

package main

import (
    "github.com/buraksezer/consistent"
    "github.com/cespare/xxhash"
)

type hasher struct{}

type myMember string

func (m myMember) String() string {
    return string(m)
}

func (h hasher) Sum64(data []byte) uint64 {
    // you should use a proper hash function for uniformity.
    return xxhash.Sum64(data)
}

func main() {
    cfg := consistent.Config{
        PartitionCount:    1,
        ReplicationFactor: 1,
        Load:              1,
        Hasher:            hasher{},
    }
    c := consistent.New(nil, cfg)

    node1 := myMember("node1")
    c.Add(node1)
}

Here is the error message:

panic: not enough room to distribute partitions

goroutine 1 [running]:
github.com/buraksezer/consistent.(*Consistent).distributeWithLoad(0x450060, 0x0, 0x0, 0x43e2e0, 0x43e2c0, 0x34c96acd)
    /tmp/gopath552882815/pkg/mod/github.com/buraksezer/consistent@v0.0.0-20191006190839-693edf70fd72/consistent.go:165 +0x3a0
github.com/buraksezer/consistent.(*Consistent).distributePartitions(0x450060, 0x1604d0)
    /tmp/gopath552882815/pkg/mod/github.com/buraksezer/consistent@v0.0.0-20191006190839-693edf70fd72/consistent.go:196 +0xc0
github.com/buraksezer/consistent.(*Consistent).Add(0x450060, 0x1604d0, 0x40c150, 0x2b5f)
    /tmp/gopath552882815/pkg/mod/github.com/buraksezer/consistent@v0.0.0-20191006190839-693edf70fd72/consistent.go:227 +0x180
main.main()
    /tmp/sandbox195891616/prog.go:30 +0xe0

So this is expected behavior or a bug?

ShiKaiWi commented 4 years ago

It seems this condition is wrong(https://github.com/buraksezer/consistent/blob/master/consistent.go#L163):

if count >= len(c.sortedSet) {

Is it reanable to remove equal?:

if count > len(c.sortedSet) {
buraksezer commented 4 years ago

Hello,

Your configuration is invalid. So this is the expected behavior. You should use something like that:

    // Create a new consistent instance
    cfg := consistent.Config{
        PartitionCount:    7,
        ReplicationFactor: 20,
        Load:              1.25,
        Hasher:            hasher{},
    }

Configuration section in README file explains these variables https://github.com/buraksezer/consistent#configuration

If you explain your use case, I may help further for the configuration.

Thanks!

ShiKaiWi commented 4 years ago

I have checked the definition of variables in consistent.Config but I still can't understand why the case is wrong:

cfg := consistent.Config{
    PartitionCount:    1,
    ReplicationFactor: 1,
    Load:              1,
    Hasher:            hasher{},
}
c := consistent.New(nil, cfg)

node1 := myMember("node1")
c.Add(node1)

One node with one replication for one partition seems reasonable.

sriramch commented 4 years ago

@buraksezer is it possible for you to fix what @ShiKaiWi is reporting? i'm running into a similar issue as well. this patch seems reasonable and that's what i did to keep going.

we may want to run (unit/functional) tests where we try to squeeze a small number of partitions on a single node and the logic here does seem incorrect. the count that checks to see if we have exhausted all nodes should be incremented only after we check to see if the partition cannot be accommodated - not before. either the count should be incremented at the end of the loop or the patch that @ShiKaiWi provides should be done.

if you wish, i can create a pr as well. please let us know.

buraksezer commented 3 years ago

@sriramch firstly, I'm so sorry for the late response. The reported configuration by @ShiKaiWi was invalid. Why do you want to use a consistent hash function if you have only one partition? ReplicationFactor and Load parameters should be bigger than 1. Furthermore, these parameters should be chosen wisely.

If you provide a sample configuration, I may understand what the actual problem is. I'm OK with the changes, if it fixes a real problem.

sriramch commented 3 years ago

@buraksezer thanks for responding!

as i mentioned earlier, we may want to run unit/functional tests with a smaller number of partitions on one or few nodes with just 1 replica to simulate some failure scenarios. for instance, the following would panic.

here the PartitionCount and the Load are indeed > 1.

    numNodes := 2 // 1 or 2
    numReplicas := 1
    members := []consistent.Member{}

    for i := 0; i < numNodes; i++ {
        member := Member(fmt.Sprintf("node%d.olricmq", i))
        members = append(members, member)
    }

    cfg := consistent.Config{
        PartitionCount:    10, // partition range for modulo
        ReplicationFactor: numReplicas,  // how many times each member is replicated
        Load:              1.1,
        Hasher:            murmur3_hasher{},
    }

    c := consistent.New(members, cfg)

my earlier comment explains the cause and a potential fix. why can't it be accommodated, and why is it invalid to have a non trivial number of partitions on one or a couple of nodes with one replica? why is this incorrect conceptually?

shouldn't the count (that tracks if we have exhausted all nodes) be incremented only after checking to see if a partition cannot be accommodated on the node as opposed to before as it is currently done?

danielkurniadi commented 3 years ago

@ShiKaiWi @sriramch I think @buraksezer is right.

  1. If you want a bounded load you cannot put whatsoever edge cases that breaks the rule math.Ceil(c * m / n). On the other hand though I recommend we can pass an error instead of panicking. We can also check in the constructor if the parameters is valid.
  2. On the other hand, I'm also unsure why the count is incremented before. This means only n-1 nodes is examined before exhaustion.
  3. But does fixing (2) also fixes the panic in (1)? There are plenty of other parameter configuration that won't fix the panic regardless we fix the counter or not because the underlying parameters just doesn't make sense for load balancing.

Meanwhile for @buraksezer I have a mathematical question on the parameter settings. Is it possible to derive mathematically that a given the values for parameters ReplicaFactor, PartitionCount, and Load would be valid or invalid?

In addition to that, what would be a good parameter set for production environment? (good as in, just valid. performant wise I can definitely benchmark test). Let's say on average, I will have 5.6 members. Variance 1.2 memberSquare.

sriramch commented 3 years ago

@iqDF - thanks for your comments.

If you want a bounded load you cannot put whatsoever edge cases that breaks the rule math.Ceil(c * m / n)

that is an implementation detail. as i asked earlier, conceptually why is distributing a non trivial number of partitions over one or a couple of nodes with one replica incorrect (with a load factor > 1)?

But does fixing (2) also fixes the panic in (1)?

it should and i have mentioned it here as well last line in the 2nd paragraph

There are plenty of other parameter configuration that won't fix the panic regardless we fix the counter or not because the underlying parameters just doesn't make sense for load balancing.

can you please provide some examples? i was under the impression that we should be able to accommodate p partitions on n nodes with r replicas (for values of p/n/r that conceptually makes sense) by adjusting the load factor

purplefox commented 3 weeks ago

I am hitting this same issue, config is:

cfg := consistent.Config{
        PartitionCount:    100, 
        ReplicationFactor: 1
        Load:              1.25,
        Hasher:            hasher{}
    }

I add a single member, and get the same panic as above.

The panic occurs because replicationFactor is 1, so, after a single member is added there is only one entry in c.sortedSet.

This means count is always equal to len(sortedSet):

count++
if count >= len(c.sortedSet) {
    // User needs to decrease partition count, increase member count or increase load factor.
    panic("not enough room to distribute partitions")
}

And it always panics. Note that here the load has not been exceeded - load = 100, maximum load = 125.

What am I doing wrong here?

buraksezer commented 3 weeks ago

Hey, @purplefox ReplicationFactor should be bigger than 1. Why do you want to set 1 to ReplicationFactor? Setting a ReplicationFactor bigger than 1 is an essential factor to distribute load evenly between members.

I think we should add a check that validates the configuration.

purplefox commented 3 weeks ago

I am creating a non replicated, but distributed cache. There are multiple members in the cluster and a particular key should live on exactly one member. I am using consistent hashing to determine which member the key lives on. A key only lives on one member so it is not replicated, so replication factor is 1. This seems like a very common use case for consistent hashing (e.g. redis and many other things)

buraksezer commented 3 weeks ago

ReplicationFactor is just an implementation detail of this package. Replicating data is your project's implementation detail. You can use the default ReplicationFactor value and LocateKey method to find a host for the key.

You should know that this package implements a modified version of this algorithm for Olric.

purplefox commented 3 weeks ago

Thanks. If replication factor does not mean what I thought it meant, what does it represent?

buraksezer commented 3 weeks ago

It's related to the consistent hashing technique: https://ably.com/blog/implementing-efficient-consistent-hashing

purplefox commented 3 weeks ago

I think you're referring to a technique in consistent hashing where a server node is added multiple times on the ring (not just once) in order to make distribution more uniform. I am familiar with this technique but never heard it called "replication factor" before. IMO using the term "replication factor" is somewhat confusing as that term already. has a well known meaning in distributed systems. Thanks for the quick reply.