Azure / azure-service-bus-dotnet

☁️ .NET Standard client library for Azure Service Bus
https://azure.microsoft.com/services/service-bus

ScheduleMessageAsync when used with duplicate detection enabled #630

Closed stevehurcombe closed 5 years ago

stevehurcombe commented 5 years ago

Background

I'm trying to use a queue to schedule the sending of a notification email to a user at some point in the future. The messages get picked up by an Azure Function and processed using SendGrid.

The user may wish to cancel and reschedule the message (hence the SequenceNumber is important) and I don't want them to receive duplicate messages. Duplicate messages should be rare, but I want to handle them as they could become an issue at scale.

The messages are scheduled using a mobile device via an Azure Function hence the concern about connectivity.

The queue's ability to deduplicate messages is important, as it solves issues with distributed locking.

Actual Behavior

Given a queue with duplicate detection enabled (in this case with the time window set to 20 seconds), I received two sequence numbers. The second message is ignored; the first message survives.

I used a simple [Test] to verify this.

IQueueClient queueClient = new QueueClient(serviceBusConnectionString, "testqueue");

var message = new Message(Encoding.UTF8.GetBytes(JsonConvert.SerializeObject(sendEmailRequest)));
message.MessageId = "1"; // Prevent duplicate messages

// Return the sequence number for later cancellation if required
long SequenceNumber1 = await queueClient.ScheduleMessageAsync(message, sendEmailRequest.sendon);
// Thread.Sleep(25000); // Wait till we add the next one
long SequenceNumber2 = await queueClient.ScheduleMessageAsync(message, sendEmailRequest.sendon);

The Thread.Sleep() line is used when I do want to test queueing two messages for the same time slot in the future.

I need the SequenceNumber in order to be able to cancel the message later on. However, if the sequence number that I receive is actually the one for the attempt that failed (because the first response got lost in transit, for example), I will not be able to cancel the surviving message.

Expected Behavior

I would like the SequenceNumber of the surviving message to be returned, regardless of how many times I attempt to add a message (while within the de-duplication window).

OR

A way of knowing that the second attempt failed and that I therefore need to peek into the queue.

Actual Behavior

When a scheduled message reaches its delivery date a new message is created. In this scenario both scheduled messages have identical sendon dates and so their new messages will hit the queue at the same time. These new messages are not deduplicated and two remain in the queue for processing.

The first one has the state active and the second has the state scheduled.

This seems odd in any case: I would have expected two active messages (if deduplication wasn't happening) or a single one (if it was).

Expected Behavior

I would have expected the deduplication process to have deduplicated the second messages as well.

The full scenario I'm trying to mitigate is this:

SeanFeldman commented 5 years ago

Could you please share the exact test code you've used including the assertions? Thanks.

stevehurcombe commented 5 years ago

Hi Sean, that is all the code I have (other than setting the connection string). The assertions are manual observations of the queue. I'm using Azure Functions to pull the messages off the queue; I'm afraid I haven't done this using the SDK.

Do you know what the official functionality should be? The documentation doesn't seem to cover this scenario.

SeanFeldman commented 5 years ago

When messages are added to a queue (or scheduled) within the de-duplication window, they are de-duplicated: the first message stays and the duplicates (second and later messages) are dropped. That's the definition of de-duplication. If you enqueue/schedule messages further apart than the de-duplication window, they are not considered duplicates. As far as I can see, this works as designed.
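The window rule described above can be pictured with a toy in-memory model. This is only a sketch of the stated rule, not how the broker is implemented: the class `ToyDuplicateDetector` and its method `TryAccept` are hypothetical names, and measuring the window from the time the first copy was accepted is an assumption made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Toy model of the rule only; hypothetical names, not the Service Bus implementation.
public sealed class ToyDuplicateDetector
{
    // MessageId -> time the surviving copy was accepted (assumed start of the window).
    private readonly Dictionary<string, DateTime> _acceptedAtUtc =
        new Dictionary<string, DateTime>();
    private readonly TimeSpan _window;

    public ToyDuplicateDetector(TimeSpan window)
    {
        _window = window;
    }

    // Returns true when the message is kept, false when it is dropped as a duplicate.
    public bool TryAccept(string messageId, DateTime nowUtc)
    {
        if (_acceptedAtUtc.TryGetValue(messageId, out DateTime acceptedAtUtc)
            && nowUtc - acceptedAtUtc < _window)
        {
            // Same MessageId seen inside the window: the first copy survives.
            return false;
        }

        // First time seen, or the window has elapsed: treated as a new message.
        _acceptedAtUtc[messageId] = nowUtc;
        return true;
    }
}
```

With a 20-second window, a second `TryAccept("1", ...)` one second after the first returns false, while one made 25 seconds later returns true, which mirrors the test with and without the `Thread.Sleep(25000)` line.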

stevehurcombe commented 5 years ago

Surely the deduplication should happen when the messages enter the queue, at the second stage? There are 4 messages remember.

However there is still no way of knowing that de-duplication has occurred and therefore a de-duplicated message cannot be cancelled.

SeanFeldman commented 5 years ago

> There are 4 messages remember.

Sorry, I'm not sure what you mean by that and where the "4 messages" are coming from.

Thinking out loud: if you are scheduling a message for the future and need to hold on to its SequenceNumber to be able to cancel it later, I presume you already have some sort of storage associated with the unique ID assigned to the message. Next time you dispatch a message with that same ID, ASB will return you another SequenceNumber. Rather than storing it straight away, could you first check whether that unique ID already has a SequenceNumber associated with it? If yes, that means the 2nd message you've tried to dispatch was a duplicate.

Now you'll face two options. If the original record was stored less than one de-duplication window ago, the record contains the SequenceNumber you'll have to use for cancellation. Otherwise, you'd need to overwrite the SequenceNumber with the latest one received, as the message with the previous SequenceNumber has "gone through" and ASB obviously did not de-duplicate the 2nd message with the same ID.
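A minimal sketch of that bookkeeping, assuming an in-memory dictionary stands in for whatever storage the application already has. The types `ScheduledRecord` and `SequenceNumberStore` and the method `Record` are hypothetical names introduced for illustration; nothing here is part of the Service Bus client. The sketch also assumes the window can be approximated by the application's own clock at the time the record was stored.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical record the application keeps per message ID.
public sealed class ScheduledRecord
{
    public long SequenceNumber { get; set; }
    public DateTime StoredAtUtc { get; set; }
}

// Hypothetical store; a real application would use durable storage instead.
public sealed class SequenceNumberStore
{
    private readonly Dictionary<string, ScheduledRecord> _records =
        new Dictionary<string, ScheduledRecord>();
    private readonly TimeSpan _dedupWindow;

    public SequenceNumberStore(TimeSpan dedupWindow)
    {
        _dedupWindow = dedupWindow;
    }

    // Call with the sequence number returned by each schedule call.
    // Returns the sequence number that should be used for cancellation.
    public long Record(string messageId, long returnedSequenceNumber, DateTime nowUtc)
    {
        if (_records.TryGetValue(messageId, out ScheduledRecord existing)
            && nowUtc - existing.StoredAtUtc < _dedupWindow)
        {
            // Stored inside the window: the new send is assumed to be a dropped
            // duplicate, so the original sequence number stays authoritative.
            return existing.SequenceNumber;
        }

        // No record, or the record is older than the window: the new message
        // is assumed to have been accepted, so overwrite the stored value.
        _records[messageId] = new ScheduledRecord
        {
            SequenceNumber = returnedSequenceNumber,
            StoredAtUtc = nowUtc
        };
        return returnedSequenceNumber;
    }
}
```

Note that this only helps when the first sequence number actually reached the client and was stored. If the first response is lost in transit, there is no earlier record to fall back on.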

All this aside, if you truly want the broker to add this functionality for you, you'll have to raise it in the service repo. This is a client repo, and a client only implements what the broker is capable of. Broker repo: https://github.com/Azure/azure-service-bus/issues

stevehurcombe commented 5 years ago

Hi Sean, I think you're right, I'm in the wrong repo. Azure support sent me here!

There are 2 messages for every scheduled message. The first waits until the deadline and the second one is the one that the processor receives.

In the scenario I'm looking at, the first attempt at adding a message to the queue fails due to network issues, so the client doesn't know that the first message has been added. The client retries and a second message is attempted. It will fail due to the duplicate checks, but the client won't know. Additionally, the sequence number it is given is useless because it belongs to the rejected message, and any attempt to cancel the message will fail.

I will raise an issue on the service bus repo.

stevehurcombe commented 5 years ago

For reference, I've added two issues to the service bus repo:

ScheduleMessageAsync when used with duplicate detection enabled https://github.com/Azure/azure-service-bus/issues/265#issue-409752826

ScheduleMessageAsync when used with duplicate detection enabled, second scenario https://github.com/Azure/azure-service-bus/issues/266#issue-409755017