bsm / redislock

Simplified distributed locking implementation using Redis

Two questions about the implementation #30

Closed kiyonlin closed 3 years ago

kiyonlin commented 3 years ago

Hey, thanks for the awesome work! I have two questions about the current implementation, so I'm opening this issue.

1. Race condition

I found this in the official website's section Distributed locks with Redis:

Superficially this works well, but there is a problem: this is a single point of failure in our architecture. What happens if the Redis master goes down? Well, let’s add a slave! And use it if the master is unavailable. This is unfortunately not viable. By doing so we can’t implement our safety property of mutual exclusion, because Redis replication is asynchronous.

There is an obvious race condition with this model:

  • Client A acquires the lock in the master.
  • The master crashes before the write to the key is transmitted to the slave.
  • The slave gets promoted to master.
  • Client B acquires the lock to the same resource A already holds a lock for. SAFETY VIOLATION!

What do you think?

2. Auto refresh

Assuming we have a code snippet like:

package main

import (
    "context"
    "fmt"
    "log"
    "time"

    "github.com/bsm/redislock"
    "github.com/go-redis/redis/v8"
)

func main() {
    // Connect to a local Redis (address is illustrative).
    client := redis.NewClient(&redis.Options{Addr: "localhost:6379"})
    defer client.Close()

    locker := redislock.New(client)

    ctx := context.Background()

    lock, err := locker.Obtain(ctx, "my-key", 100*time.Millisecond, nil)
    if err == redislock.ErrNotObtained {
        fmt.Println("Could not obtain lock!")
        return
    } else if err != nil {
        log.Fatalln(err)
    }
    defer lock.Release(ctx)

    // Try to do business logic here; if it exceeds the expiration time,
    // other locker(s) can obtain the lock before it's released.
}

We may run into problems when the business logic takes longer than the expiration time.

My solution to this issue is to add a watchdog, as Redisson does:

// If the user passes AutoRefresh as the ttl param, we should set a default
// ttl for this lock and automatically refresh it every ttl/3.
const AutoRefresh = time.Duration(0)

// watch extends the lock with a new TTL automatically.
func (l *Lock) watch() {
    // Implementation detail
}

But yeah, this can be implemented by users themselves.
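For illustration, here is a rough sketch of what such a watchdog could look like as a user-side wrapper, built on the library's public Lock.Refresh. The watchdogLock type, its ttl field, and the stop channel are hypothetical additions for this sketch; they are not part of redislock's actual Lock type:

// watchdogLock is a hypothetical wrapper; redislock's Lock does not
// store its TTL or expose a stop channel.
type watchdogLock struct {
    *redislock.Lock
    ttl  time.Duration
    stop chan struct{}
}

// watch extends the lock with a new TTL every ttl/3 until stop is
// closed or a refresh fails.
func (l *watchdogLock) watch(ctx context.Context) {
    ticker := time.NewTicker(l.ttl / 3)
    defer ticker.Stop()

    for {
        select {
        case <-l.stop:
            return
        case <-ticker.C:
            if err := l.Refresh(ctx, l.ttl, nil); err != nil {
                // refresh failed (e.g. the lock was lost); give up
                return
            }
        }
    }
}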

Thanks! Kiyon.

dim commented 3 years ago

Hi:

Re 1: This implementation follows https://redis.io/topics/distlock#correct-implementation-with-a-single-instance
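For context, the pattern described there acquires the lock with SET key value NX PX ttl using a random token, and releases it with a Lua script that deletes the key only if the token still matches. A minimal sketch of that pattern using go-redis directly (the key and token values are illustrative, and this is not redislock's exact internal code):

// acquire: set the key only if it does not already exist, with an expiry
token := "random-unique-value" // in practice, a cryptographically random string
ok, err := client.SetNX(ctx, "my-key", token, 100*time.Millisecond).Result()
if err != nil {
    log.Fatalln(err)
}
if !ok {
    fmt.Println("Could not obtain lock!")
    return
}

// release: atomically delete the key only if it still holds our token
var release = redis.NewScript(`
if redis.call("GET", KEYS[1]) == ARGV[1] then
    return redis.call("DEL", KEYS[1])
else
    return 0
end`)
if err := release.Run(ctx, client, []string{"my-key"}, token).Err(); err != nil {
    log.Fatalln(err)
}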

Re 2: The problem with auto-refresh is that errors may occur, and they would happen in the background. This library is designed to be relatively low-level, and I don't want to make assumptions about how users would like to handle those refresh errors. That's why I implemented a simple https://pkg.go.dev/github.com/bsm/redislock#Lock.Refresh which allows you to write your very own wrapping logic and error handling. For example:

lock, err := locker.Obtain(ctx, "my-key", 100*time.Millisecond, nil)
// ...
defer lock.Release(ctx)

refreshCtx, cancel := context.WithCancel(ctx)
defer cancel()

go func() {
    ticker := time.NewTicker(20 * time.Millisecond)
    defer ticker.Stop()

    for {
        select {
        case <-refreshCtx.Done():
            return
        case <-ticker.C:
            if err := lock.Refresh(refreshCtx, 100*time.Millisecond, nil); err != nil {
                // handle refresh errors here: log, instrument, do whatever your app needs ...
            }
        }
    }
}()