Closed brettsam closed 5 years ago
@binzywu is there a max lock renewal duration on the server by any chance?
I have a similar issue.
I have the lock duration on my topic set to 2 seconds and MaxAutoRenewDuration set to 4 seconds.
So I expect the message lock to be renewed when the message handler takes more than 2 seconds, and I can see this happening as expected.
However, every time it renews the lock, it sets a new lock token!
The message handler eventually finishes processing and calls CompleteAsync with the lock token it received initially; when this happens I get a MessageLockLostException.
I then investigated further with Service Bus Explorer, and it shows a different lock token than the one I initially received.
Is this how it is expected to behave? If so, how do I know the current lock token to mark the message as completed? And what is the fix for it?
@techiearun LockDuration of 2 seconds and MaxAutoRenewDuration of 4 seconds sounds a bit suspicious. How long is the message processing? Also, note that MaxAutoRenewDuration means that the maximum total time a message could be locked would be 4 seconds. And that's not guaranteed since auto-renewal is not a guaranteed operation.
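To make the arithmetic in the comment above concrete, here is a minimal Python sketch of that simplified model. It assumes, as the comment describes, that MaxAutoRenewDuration caps the total time a message can stay locked and that every renewal succeeds. The function name and parameters are hypothetical; this is not the client library's code.

```python
def lock_covers_processing(lock_duration_s, max_auto_renew_s, processing_s):
    """Simplified model of the description above (not the real client logic).

    The lock lasts lock_duration_s on its own; auto-renewal can stretch the
    total locked time up to max_auto_renew_s, but not beyond it.
    Assumes every renewal succeeds, which the thread notes is not guaranteed.
    """
    max_locked_s = max(lock_duration_s, max_auto_renew_s)
    return processing_s <= max_locked_s


# LockDuration = 2 s, MaxAutoRenewDuration = 4 s, as in the question above.
print(lock_covers_processing(2, 4, 3))  # handler that takes 3 s
print(lock_covers_processing(2, 4, 5))  # handler that takes 5 s
```

Under this model a 3-second handler stays within the 4-second window, while a 5-second handler does not, so the lock could expire before the message is completed.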
@SeanFeldman The message usually gets processed in under 2 seconds; about 4 out of 50 messages take longer than that, around 3 seconds. So I set MaxAutoRenewDuration to 4 seconds in the hope that the message lock would be renewed automatically for an additional 2 seconds if the message handler didn't finish processing in under 2 seconds.
Is this the wrong usage?
MaxLockDuration can be set up to 5 minutes. MaxAutoRenewDuration is designed to allow an option to extend processing beyond 5 minutes (though I would be careful about it).
I would set it up differently. I would set the MaxLockDuration to up to 30 seconds and not utilize MaxAutoRenewDuration altogether. That way, you're guaranteed not to lose the lock while processing is taking place. And if it's truly under 30 seconds processing, you won't see LockLostExceptions. And if you do, then investigate what's going on.
@brettsam Thanks for that. I do have another processor that is running with the following configuration:
Message Lock Duration: 5 minutes. MaxAutoRenewDuration: 301 seconds (based on the documentation that states "This value should be greater than the longest message lock duration; for example, the LockDuration Property").
The expected processing time of a message is 3-4 minutes under normal circumstances, but it can go over 5 minutes at times, so I set the configuration above. Even in this case I get a MessageLockLostException when the processing time of some messages exceeds 5 minutes. I investigated and found that the message lock is renewed with a different lock token, which causes the MessageLockLostException. To me, this behaviour seems to be a bug.
MaxAutoRenewDuration of 301 seconds? That's 5 minutes and 1 second of total time. Is that what you want? MaxAutoRenewDuration should be the maximum duration your message should be locked. So if processing can take more than 5 minutes, find the longest duration and use that plus a few seconds.
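The "longest duration plus a few seconds" rule above can be sketched as follows. This is only an illustration: the function name, the sample durations, and the 10-second default margin are hypothetical and do not come from the thread or from the client library.

```python
def suggest_max_auto_renew_s(observed_processing_s, margin_s=10):
    """Return the longest observed processing time plus a safety margin.

    observed_processing_s: processing times in seconds (hypothetical samples).
    margin_s: the "few seconds" of headroom; 10 is an arbitrary choice here.
    """
    if not observed_processing_s:
        raise ValueError("need at least one observed processing time")
    return max(observed_processing_s) + margin_s


# Hypothetical samples: mostly 3-4 minutes, occasionally just over 5 minutes.
durations_s = [180, 240, 310]
suggested = suggest_max_auto_renew_s(durations_s)
print(suggested)
print(301 >= max(durations_s))  # does a 301 s setting cover the slowest run?
```

With these sample numbers, a 301-second setting is shorter than the slowest run, which is the situation the comment above warns about.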
I investigated and found that the message lock is renewed with a different lock token, which causes the MessageLockLostException. To me, this behaviour seems to be a bug.
Could you link your repro to back this statement?
Back to the original post, what is the best way to deal with AutoComplete = true, MaxRenewDuration = 55 minutes, the handler taking longer than 20 minutes and getting a MessageLockLostException? I'm experiencing the same thing in an application that we're working on.
@Matt-Westphal does it happen for every message that takes 20+ minutes to process?
Personally, I'm in a mindset that processing of 20 minutes should be implemented in a manner where a message is locked for that time. But that's my personal view.
Hello all, does anyone know whether versions earlier than 3.40 have the same issue? Maybe older versions do not have it.
@SeanFeldman yes it was happening for every message. The process is creating multiple VMs in Azure and takes between 20-30 minutes to complete. If it fails we want the message to remain in the queue so that the process can try again. Usually 2nd attempt works.
The code that we had is the same as in issue 684. I then modified the code to use AutoComplete = true for the MessageHandler and set MaxAutoRenewDuration to 1 hour. The code was still throwing the MessageLockLostException, but then I found a couple of references to "Lock duration" within the properties of the queue. Once I changed that from 30 seconds to 5 minutes, the code completed the message successfully even when the process takes 20-30 minutes. It would be nice to set the lock duration to 20 minutes, but the UI isn't letting me enter anything above 5 minutes.
Hello All, this is an issue on the Service side and only happens for entities with 'EnablePartitioning' set to true. We are in the process of fixing this, and the fix should be deployed to production clusters in our next update. In the meantime, if possible for your scenario, a potential workaround would be to create an entity with 'EnablePartitioning' set to false and you should not see this issue with that. I will update this thread as I have more information about when the fix may be deployed.
I will update this thread as I have more information about when the fix may be deployed.
@vinaysurya thank you so much for clarifying the issue. As this is not an issue with the client, it would be great if the broker issue tracker, https://github.com/Azure/azure-service-bus/issues, contained an issue with symptoms and workarounds. If I may suggest a format, something along the lines of this issue: https://github.com/Particular/NServiceBus.Transport.AzureServiceBus/issues/51. Then this issue could be closed, pointing to the broker issue tracker item.
In the meantime, if possible for your scenario, a potential workaround would be to create an entity with 'EnablePartitioning' set to false and you should not see this issue with that.
Please note that deleting an entity and re-creating it is not a viable workaround for those who cannot shut down a production system for this change. Also, it negates the advice to keep entities partitioned to improve the HA story on the standard tier. A non-partitioned entity will eventually suffer from short outages when containers are moved or replaced, affecting a production system much more than systems with partitioned entities.
Broker issue: https://github.com/Azure/azure-service-bus/issues/276
Good suggestion, Sean. I created an issue, https://github.com/Azure/azure-service-bus/issues/276, for the Service.
Agreed that it may not be possible to re-create entities in every scenario; we'll try to release the fix as soon as we can. But I wanted to mention the workaround in case there were people who could use it.
I’m on the WebJobs/Functions team and we’ve got a customer that’s getting a MessageLockLostException when running a 20-minute function. I was able to repro with a console app (code below). Some things to call out:
Is there anything obviously wrong with this configuration below that would cause this?
I wired up an EventListener to “Microsoft-Azure-ServiceBus” as well to capture those logs. They’re here.