madelson / DistributedLock

A .NET library for distributed synchronization
MIT License
1.75k stars · 182 forks

Adding a synchronization channel #142

Open gliljas opened 1 year ago

gliljas commented 1 year ago

When diagnosing #141 I compared our own distributed lock implementation with this one. One rather big difference is that we have a long "BusyWaitSleepTime", which can safely be long because the lock provider also communicates the acquiring and releasing of locks through a pub/sub channel; in our case that channel is Redis pub/sub. If the channel fails for some reason, the wait timeout will still be respected, but in many cases it never gets that far. The implementation is quite simple: a concurrent dictionary of reference-counted CancellationTokenSources, one per active lock key, with the token passed to the Task.Delay used for busy-waiting.
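For concreteness, here is a minimal sketch of the mechanism described above (the `ReleaseSignals` name and shape are invented for illustration, and a plain dictionary under a lock is used for brevity where our implementation uses a concurrent dictionary). A pub/sub notification for a key cancels that key's token, which wakes the `Task.Delay` early:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch: one reference-counted CancellationTokenSource per active
// lock key; a release notification cancels it so busy-waiters wake immediately.
public sealed class ReleaseSignals
{
    private sealed class Entry
    {
        public CancellationTokenSource Cts = new CancellationTokenSource();
        public int RefCount;
    }

    private readonly Dictionary<string, Entry> _entries = new();

    // Called when the pub/sub channel reports that 'key' was released.
    public void OnReleased(string key)
    {
        CancellationTokenSource? toCancel = null;
        lock (_entries)
        {
            if (_entries.TryGetValue(key, out var entry))
            {
                toCancel = entry.Cts;
                entry.Cts = new CancellationTokenSource(); // fresh token for future waiters
            }
        }
        toCancel?.Cancel();
    }

    // Sleeps up to 'sleepTime', but returns early if a release is signalled for 'key'.
    public async Task WaitAsync(string key, TimeSpan sleepTime)
    {
        Entry entry;
        CancellationToken token;
        lock (_entries)
        {
            if (!_entries.TryGetValue(key, out var existing))
            {
                existing = new Entry();
                _entries[key] = existing;
            }
            existing.RefCount++;
            entry = existing;
            token = existing.Cts.Token;
        }
        try
        {
            // The long busy-wait sleep, cancelled early on a release notification.
            await Task.Delay(sleepTime, token);
        }
        catch (OperationCanceledException)
        {
            // Woken early; the caller retries acquisition.
        }
        finally
        {
            lock (_entries)
            {
                if (--entry.RefCount == 0) { _entries.Remove(key); }
            }
        }
    }
}
```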

I would suggest adding such a mechanism to DistributedLock. It would be completely opt-in, and the implementation could be entirely agnostic to the provider chosen, just as in our case where Azure Blobs and Redis are combined.

madelson commented 1 year ago

This is an interesting request. I'm imagining that the caller would have to provide their own implementation of an interface like this:

interface IReleaseEvent
{
    IDisposable Subscribe(string name, Action callback); // callback fires whenever Publish(name) is called.
    Task PublishAsync(string name);
    void Publish(string name);
}

var @lock = new AzureBlobLeaseDistributedLock(..., options: o => o.ReleaseEvent(myReleaseEvent));
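As a sanity check on that shape, here is a hypothetical in-process implementation of the proposed interface (the `InMemoryReleaseEvent` name is invented); the same interface could later be backed by a real channel such as Redis pub/sub:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// The interface proposed above, repeated so this sketch is self-contained.
public interface IReleaseEvent
{
    IDisposable Subscribe(string name, Action callback);
    Task PublishAsync(string name);
    void Publish(string name);
}

// Hypothetical in-process implementation, useful for testing the wiring.
public sealed class InMemoryReleaseEvent : IReleaseEvent
{
    private readonly ConcurrentDictionary<string, ConcurrentDictionary<object, Action>> _subscribers = new();

    public IDisposable Subscribe(string name, Action callback)
    {
        var key = new object();
        _subscribers.GetOrAdd(name, _ => new ConcurrentDictionary<object, Action>())[key] = callback;
        return new Subscription(() =>
        {
            if (_subscribers.TryGetValue(name, out var subs)) { subs.TryRemove(key, out _); }
        });
    }

    public void Publish(string name)
    {
        if (_subscribers.TryGetValue(name, out var subs))
        {
            foreach (var callback in subs.Values) { callback(); }
        }
    }

    public Task PublishAsync(string name)
    {
        Publish(name);
        return Task.CompletedTask;
    }

    private sealed class Subscription : IDisposable
    {
        private Action? _onDispose;
        public Subscription(Action onDispose) => _onDispose = onDispose;
        public void Dispose() => Interlocked.Exchange(ref _onDispose, null)?.Invoke();
    }
}
```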

Is that what you were imagining? I think you can emulate this behavior with the library's existing API:

public static async ValueTask<IAsyncDisposable?> TryAcquireAsync(this IDistributedLock @lock, TimeSpan timeout, CancellationToken cancellationToken, IReleaseEvent @event)
{
    if (await @lock.TryAcquireAsync(TimeSpan.Zero, cancellationToken) is { } handle) { return MakeHandle(handle); }
    if (timeout == TimeSpan.Zero) { return null; }

    var stopwatch = Stopwatch.StartNew();
    while (timeout == Timeout.InfiniteTimeSpan || stopwatch.Elapsed <= timeout)
    {
        var releaseEventTask = new TaskCompletionSource<bool>(TaskCreationOptions.RunContinuationsAsynchronously);
        using var subscription = @event.Subscribe(@lock.Name, () => releaseEventTask.TrySetResult(true));
        // Wake on a release notification, or fall back to polling every 5 seconds.
        await await Task.WhenAny(releaseEventTask.Task, Task.Delay(TimeSpan.FromSeconds(5), cancellationToken));
        if (await @lock.TryAcquireAsync(TimeSpan.Zero, cancellationToken) is { } retryHandle) { return MakeHandle(retryHandle); }
    }

    return null; // timed out

    IAsyncDisposable MakeHandle(IAsyncDisposable acquired) => new HandleWrapper(@lock, acquired, @event);
}

private class HandleWrapper : IAsyncDisposable
{
    private readonly IDistributedLock _lock;
    private IAsyncDisposable? _handle;
    private readonly IReleaseEvent _event;

    public HandleWrapper(IDistributedLock @lock, IAsyncDisposable handle, IReleaseEvent @event)
    {
        this._lock = @lock;
        this._handle = handle;
        this._event = @event;
    }

    public async ValueTask DisposeAsync()
    {
        if (Interlocked.Exchange(ref this._handle, null) is { } handle)
        {
            await handle.DisposeAsync();
            // Notify any waiters that the lock was released.
            await this._event.PublishAsync(this._lock.Name);
        }
    }
}

Another question: since you are using Redis anyway, why not just use Redis for your distributed locking rather than mixing Redis and Azure? With Redis, we could potentially build pub-sub into the library directly without requiring the caller to wire it up.

gliljas commented 1 year ago

Is that what you were imagining?

That's almost verbatim what I was imagining. :)

I think you can emulate this behavior with the library's existing API:

Great idea! I'll give it a go.

Another question: since you are using Redis anyway, why not just use Redis for your distributed locking rather than mixing Redis and Azure?

We're evaluating that, but we have some reservations about the production-worthiness of our Redis setup for this purpose. The RedLock algorithm basically requires multiple masters for conflict-free redundancy, whereas our Redis setup is master-slave with Sentinel failover. But we could certainly add more nodes, so it may be the way forward.

With Redis, we could potentially build pub-sub into the library directly without requiring the caller to wire it up.

Indeed, and that may be the way to go; it would be a great way to test the feasibility, anyway. But providing the means to accomplish this for any provider where polling is the only choice could also be really valuable. E.g., for the Azure scenario, an IReleaseEvent based on Service Bus topics could make sense.

And BTW, I saw #38. We had that too, using named semaphores, and I believe it reduced the strain on the Azure blob account quite a lot. I'll add that as well when implementing your suggested wrapper around the locking mechanisms.
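For readers following along: the idea referenced above is to let processes on the same machine contend on a local lock first, so only the local winner calls out to the remote provider. A hedged sketch, with `LocalFirstLock` and its parameters invented for illustration and a per-key `SemaphoreSlim` standing in for a named semaphore (a real cross-process version would use a named `Semaphore`):

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch of the local-first pattern: contend locally, then remotely.
public static class LocalFirstLock
{
    private static readonly ConcurrentDictionary<string, SemaphoreSlim> Gates = new();

    public static async Task<IAsyncDisposable?> TryAcquireAsync(
        string name,
        Func<Task<IAsyncDisposable?>> acquireDistributedAsync, // e.g. the blob lease acquisition
        TimeSpan localTimeout)
    {
        var gate = Gates.GetOrAdd(name, _ => new SemaphoreSlim(1, 1));
        // Local losers never touch the remote store, reducing strain on the account.
        if (!await gate.WaitAsync(localTimeout)) { return null; }
        try
        {
            var handle = await acquireDistributedAsync();
            if (handle is null) { gate.Release(); return null; }
            return new CombinedHandle(gate, handle);
        }
        catch
        {
            gate.Release();
            throw;
        }
    }

    // Releases the remote handle first, then the local gate, exactly once.
    private sealed class CombinedHandle : IAsyncDisposable
    {
        private readonly SemaphoreSlim _gate;
        private IAsyncDisposable? _inner;

        public CombinedHandle(SemaphoreSlim gate, IAsyncDisposable inner)
        {
            _gate = gate;
            _inner = inner;
        }

        public async ValueTask DisposeAsync()
        {
            if (Interlocked.Exchange(ref _inner, null) is { } inner)
            {
                await inner.DisposeAsync();
                _gate.Release();
            }
        }
    }
}
```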