madelson / DistributedLock

A .NET library for distributed synchronization
MIT License
1.86k stars 192 forks

Add custom data to AzureBlobLeaseDistributedLock #161

Open rLindorfer opened 1 year ago

rLindorfer commented 1 year ago

Problem

Currently, it is not possible to add additional meta-information to a distributed lock, for example:

  - time stamp when lock got acquired
  - identity who acquired the lock

Proposed solution

It would be useful if it were possible to add additional data to the created blob, e.g. JSON uploaded as the blob content.

  1. add a new constructor parameter Stream content to store custom data
  2. instead of simply uploading an empty stream in the CreateIfNotExistsAsync method, use the given content

madelson commented 1 year ago

@rLindorfer thanks for your interest in the library.

A few thoughts/questions:

time stamp when lock got acquired

Note that we actually do attach a timestamp as blob metadata if we're the one who creates the blob. This is creation time rather than lock time, but if the blob did not exist previously then they are going to be very similar.

identity who acquired the lock

Do you mean the Azure concept of identity or something specific to your application? I wonder if the former is automatically available from Azure?

add new constructor parameter Stream content to store custom data

I feel like we'd want a string or byte[] rather than a Stream, since this would be reused across all acquire calls for the given lock instance. Could also be a Func<string>/Func<byte[]> to allow you to generate unique contents.
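The Func&lt;string&gt;/Func&lt;byte[]&gt; idea can be sketched as follows (Python stands in for the C# delegate here; the factory name and JSON payload shape are illustrative, not part of the library):

```python
import json
from datetime import datetime, timezone

def make_content_factory(user_email):
    """Zero-argument factory producing fresh lock contents on every call,
    mirroring the Func<byte[]> idea: one factory per lock instance,
    one unique payload per acquire."""
    def factory():
        return json.dumps({
            "owner": user_email,
            "acquired_at": datetime.now(timezone.utc).isoformat(),
        }).encode("utf-8")
    return factory

factory = make_content_factory("user@example.com")
payload = factory()  # fresh JSON bytes for this particular acquire call
```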

instead of simply uploading an empty stream in the CreateIfNotExistsAsync method, use the given content

Two points here:

  1. Note that we don't necessarily create a new blob on every acquire. The library supports locking on an existing blob (in that case, we won't delete it on release). I'm reluctant to overwrite the contents of an existing blob.
  2. I feel like it would be better to use metadata rather than content, both to avoid the overwrite case above and to avoid extra write API calls in the case of PageBlobs. Metadata also provides structure, which feels nicer for encoding this type of information than a byte stream.
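The metadata-vs-content distinction can be illustrated with a minimal in-memory model (Python for brevity; FakeBlob is a hypothetical stand-in, not the Azure SDK):

```python
class FakeBlob:
    """Toy model of a blob: a content payload plus a key/value metadata map."""
    def __init__(self, content: bytes):
        self.content = content
        self.metadata = {}

def acquire_with_metadata(blob: FakeBlob, owner: str) -> None:
    # Recording lock info as metadata leaves the blob's content untouched,
    # so locking on a pre-existing blob never overwrites user data.
    blob.metadata["lock_owner"] = owner

def release(blob: FakeBlob) -> None:
    blob.metadata.pop("lock_owner", None)

blob = FakeBlob(b"pre-existing user data")
acquire_with_metadata(blob, "worker-1")
assert blob.content == b"pre-existing user data"  # content untouched
release(blob)
```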

Thoughts?

madelson commented 1 year ago

@rLindorfer any thoughts on the above?

rLindorfer commented 1 year ago

Thank you for your answer, and sorry for the delayed reply; I got no GitHub notification about your response.

time stamp when lock got acquired

As you already mentioned, it's the creation time and not the lock time. If the blob is reused, the creation time and the lock time will differ.

identity who acquired the lock

I meant something specific to the application, e.g. the email of the end user, so that a message like the following could be shown: "User X already opened this window..."

add new constructor parameter Stream content to store custom data

string is okay for me, but I was thinking about a more generic use case where one would like to serialize JSON content directly to the stream instead of building a huge string in memory.

metadata vs. content

I'm not sure whether there is a size limit on the metadata, but yes, it would be sufficient for many cases.

madelson commented 1 year ago

Thanks @rLindorfer.

As you already mentioned, it's the creation time and not the lock time. If the blob is reused, the creation time and the lock time will differ.

In the blob reuse case, are you imagining that the metadata values would be added after acquisition and then removed after lock release?

rLindorfer commented 1 year ago

Yes, this would be sufficient.

mjaow commented 1 year ago

Any feedback on this? It looks like there is no way to achieve it with an Azure blob in one transaction, right? That is, you cannot atomically acquire the lease and set the owner who holds it. If you break it down into two transactions, acquire the lease and then set the owner identity (IP/name/email), there could be a consistency issue where followers read stale information. @madelson

madelson commented 1 year ago

@mjaow I think you are correct. Setting the metadata would just be "best effort" to support debugging and shouldn't be relied on as a synchronization mechanism. Arguably callers could just do this themselves. Is that what you're getting at?
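The staleness window can be shown with a toy two-step sequence (Python; all names are illustrative): the lease is granted in one step and the owner tag written in a separate step, so a reader in between sees stale or missing owner info.

```python
lease_holder = None    # step 1: who actually holds the lease
owner_metadata = None  # step 2: the human-readable tag, written separately

def acquire_lease(name):
    global lease_holder
    lease_holder = name       # transaction 1: lease granted

def set_owner_metadata(name):
    global owner_metadata
    owner_metadata = name     # transaction 2: metadata written

acquire_lease("worker-B")
observed = owner_metadata     # a reader here sees None, not "worker-B"
set_owner_metadata("worker-B")
```

This is why the metadata is only useful for debugging and display, never as the synchronization primitive itself: the lease is the source of truth.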

mjaow commented 1 year ago

Thanks for the confirmation. Another question: are there any throttling issues if we rely on blob storage to acquire/renew leases in a large cluster (where the number of clients exceeds 1000)?

madelson commented 1 year ago

@mjaow you'll just have to test it, unfortunately; I don't have personal experience with such a configuration.

mjaow commented 1 year ago

OK, thanks. From my test results (5k clients), no problems.