rebus-org / Rebus.AzureServiceBus

:bus: Azure Service Bus transport for Rebus
https://mookid.dk/category/rebus

Suddenly a lot of lock exceptions are heading my way #12

Closed: KennethJakobsen closed this issue 6 years ago

KennethJakobsen commented 6 years ago

Suddenly I'm experiencing a lot of lock exceptions like this. Do you know why this is?

Rebus.Workers.ThreadPoolBased.ThreadPoolWorker ERROR (Thread #16): An error occurred when attempting to complete the transaction context
Rebus.Exceptions.RebusApplicationException: The message lock for message with ID 1e9682b3-e6ab-4c69-b702-88da2d957de4 was lost - tried to complete after 4,0 s ---> Microsoft.ServiceBus.Messaging.MessageLockLostException: The lock supplied is invalid. Either the lock expired, or the message has already been removed from the queue. TrackingId:43249aa7-e8c5-4c3d-ac91-0007c7d98ca6_G24_B2, SystemTracker:umbksjdev:Queue:letsencrypt-input, Timestamp:4/30/2018 2:13:30 PM ---> System.ServiceModel.FaultException`1[System.ServiceModel.ExceptionDetail]: The lock supplied is invalid. Either the lock expired, or the message has already been removed from the queue. TrackingId:43249aa7-e8c5-4c3d-ac91-0007c7d98ca6_G24_B2, SystemTracker:umbksjdev:Queue:letsencrypt-input, Timestamp:4/30/2018 2:13:30 PM
   at Microsoft.ServiceBus.Messaging.Sbmp.DuplexRequestBindingElement.DuplexRequestSessionChannel.ThrowIfFaultMessage(Message wcfMessage)
   at Microsoft.ServiceBus.Messaging.Sbmp.DuplexRequestBindingElement.DuplexRequestSessionChannel.HandleMessageReceived(IAsyncResult result)

Kenneth

mookid8000 commented 6 years ago

This usually happens when your handler takes too long to execute.

By default, Rebus will take peek locks on Azure Service Bus messages for the maximum allowed time, which is 5 minutes.

If your handler takes longer than that to finish its work, it will get a MessageLockLostException when it tries to acknowledge (and thus delete) the message.

You should either a) take care to make message handling quicker, so it stays well within the 5-minute limit, or b) enable Rebus' automatic peek lock renewal (instructions can be found here), which makes Rebus renew the peek lock each time 80% of the lock time has elapsed.

Prefer (a) over (b) (because the peek lock renewal comes with a slight overhead), but sometimes that's not possible, and then (b) is ok 😉
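
For reference, option (b) is a one-liner on the transport configurer. A minimal sketch, assuming a Rebus.AzureServiceBus version that exposes AutomaticallyRenewPeekLock() – the activator, connection string, and queue name are placeholders:

    // Sketch: enable automatic peek lock renewal on the Azure Service Bus
    // transport - "connectionString" and "my-queue" are placeholder values
    using Rebus.Config;

    var bus = Configure.With(activator)
        .Transport(t => t
            .UseAzureServiceBus(connectionString, "my-queue")
            .AutomaticallyRenewPeekLock())
        .Start();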

mookid8000 commented 6 years ago

Now I see it says The message lock for message with ID 1e9682b3-e6ab-4c69-b702-88da2d957de4 was lost - tried to complete after 4,0 s in your exception message – has the default peek lock duration been changed on the queue?
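
If in doubt, the queue's lock duration can be read with the same Microsoft.ServiceBus SDK that appears in the stack trace above. A sketch (connectionString is a placeholder; Rebus creates its own queues with the maximum lock duration of 5 minutes):

    // Sketch: read the LockDuration of the queue - Rebus sets it to the
    // maximum of 5 minutes when it creates the queue itself
    using System;
    using Microsoft.ServiceBus;

    var namespaceManager = NamespaceManager.CreateFromConnectionString(connectionString);
    var queue = namespaceManager.GetQueue("letsencrypt-input");
    Console.WriteLine(queue.LockDuration); // expect 00:05:00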

KennethJakobsen commented 6 years ago

@mookid8000 No, I don't think so. That was my first guess as well, but it's not that. I am, however, chaining a lot of messages – that shouldn't be an issue, right? It seems that whenever a message takes longer than 3 seconds to process, I'm having issues.

Rebus.Pipeline.Send.SendOutgoingMessageStep DEBUG (Thread #52): Sending Concorde.LetsEncrypt.Contracts.Commands.ProcessNewHostnameCommand -> "letsencrypt-input"
Rebus.Pipeline.Receive.DispatchIncomingMessageStep DEBUG (Thread #52): Dispatching "Concorde.LetsEncrypt.Contracts.Commands.AddNewHostnameCommand, Concorde.LetsEncrypt.Contracts" "f2aad6c1-10f2-4e59-a0d8-5ec18949c785" to 1 handlers took 848 ms
Rebus.Pipeline.Send.SendOutgoingMessageStep DEBUG (Thread #61): Sending Concorde.LetsEncrypt.Contracts.Commands.InitiateAcmeChallengeCommand -> "letsencrypt-input"
Rebus.Pipeline.Receive.DispatchIncomingMessageStep DEBUG (Thread #61): Dispatching "Concorde.LetsEncrypt.Contracts.Commands.ProcessNewHostnameCommand, Concorde.LetsEncrypt.Contracts" "dfc9e790-0b91-4935-9821-700c7e2abbbe" to 1 handlers took 277 ms
Rebus.Pipeline.Send.SendOutgoingMessageStep DEBUG (Thread #52): Sending Concorde.LetsEncrypt.Contracts.Commands.PersistChallengeInfoCommand -> "letsencrypt-input"
Rebus.Pipeline.Receive.DispatchIncomingMessageStep DEBUG (Thread #52): Dispatching "Concorde.LetsEncrypt.Contracts.Commands.InitiateAcmeChallengeCommand, Concorde.LetsEncrypt.Contracts" "41ba644a-5be6-43c9-9fc2-ca296cfe3a16" to 1 handlers took 3249 ms
Rebus.Pipeline.Send.SendOutgoingMessageStep DEBUG (Thread #9): Sending Concorde.LetsEncrypt.Contracts.Commands.VerifyChallengeInfoCommand -> "letsencrypt-input"
Rebus.Pipeline.Receive.DispatchIncomingMessageStep DEBUG (Thread #9): Dispatching "Concorde.LetsEncrypt.Contracts.Commands.PersistChallengeInfoCommand, Concorde.LetsEncrypt.Contracts" "9f5bce81-ad50-4868-919d-18d229e546b0" to 1 handlers took 3210 ms
Rebus.Pipeline.Send.SendOutgoingMessageStep DEBUG (Thread #52): Sending Concorde.LetsEncrypt.Contracts.Commands.GeneratePfxCertificateCommand -> "letsencrypt-input"
Rebus.Pipeline.Receive.DispatchIncomingMessageStep DEBUG (Thread #52): Dispatching "Concorde.LetsEncrypt.Contracts.Commands.VerifyChallengeInfoCommand, Concorde.LetsEncrypt.Contracts" "4145071e-9175-4482-8a76-bc59d7cf03a5" to 1 handlers took 2455 ms
Rebus.Pipeline.Send.SendOutgoingMessageStep DEBUG (Thread #9): Sending Concorde.LetsEncrypt.Contracts.Commands.InstallCertificateCommand -> "letsencrypt-input"
Rebus.Pipeline.Receive.DispatchIncomingMessageStep DEBUG (Thread #9): Dispatching "Concorde.LetsEncrypt.Contracts.Commands.GeneratePfxCertificateCommand, Concorde.LetsEncrypt.Contracts" "c4ff70ce-3926-4739-9e4a-7f9d487e1cf1" to 1 handlers took 5019 ms
Rebus.Workers.ThreadPoolBased.ThreadPoolWorker ERROR (Thread #60): An error occurred when attempting to complete the transaction context
Rebus.Exceptions.RebusApplicationException: The message lock for message with ID c4ff70ce-3926-4739-9e4a-7f9d487e1cf1 was lost - tried to complete after 6,1 s ---> Microsoft.ServiceBus.Messaging.MessageLockLostException: The lock supplied is invalid. Either the lock expired, or the message has already been removed from the queue. TrackingId:e8155f97-6b20-4a25-9bf2-7ac9b937d139_G26_B2, SystemTracker:umbksjdev:Queue:letsencrypt-input, Timestamp:5/1/2018 7:13:10 AM ---> System.ServiceModel.FaultException`1[System.ServiceModel.ExceptionDetail]: The lock supplied is invalid. Either the lock expired, or the message has already been removed from the queue. TrackingId:e8155f97-6b20-4a25-9bf2-7ac9b937d139_G26_B2, SystemTracker:umbksjdev:Queue:letsencrypt-input, Timestamp:5/1/2018 7:13:10 AM
   at Microsoft.ServiceBus.Messaging.Sbmp.DuplexRequestBindingElement.DuplexRequestSessionChannel.ThrowIfFaultMessage(Message wcfMessage)
   at Microsoft.ServiceBus.Messaging.Sbmp.DuplexRequestBindingElement.DuplexRequestSessionChannel.HandleMessageReceived(IAsyncResult result)
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at Microsoft.ServiceBus.Common.AsyncResult.End[TAsyncResult](IAsyncResult result)
   at Microsoft.ServiceBus.Messaging.Sbmp.DuplexRequestBindingElement.DuplexRequestSessionChannel.EndRequest(IAsyncResult result)
   at Microsoft.ServiceBus.Messaging.Channels.ReconnectBindingElement.ReconnectChannelFactory`1.RequestSessionChannel.RequestAsyncResult.<>c.<GetAsyncSteps>b__9_3(RequestAsyncResult thisPtr, IAsyncResult r)
   at Microsoft.ServiceBus.Messaging.IteratorAsyncResult`1.StepCallback(IAsyncResult result)
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at Microsoft.ServiceBus.Common.AsyncResult.End[TAsyncResult](IAsyncResult result)
   at Microsoft.ServiceBus.Common.AsyncResult`1.End(IAsyncResult asyncResult)
   at Microsoft.ServiceBus.Messaging.Channels.ReconnectBindingElement.ReconnectChannelFactory`1.RequestSessionChannel.EndRequest(IAsyncResult result)
   at Microsoft.ServiceBus.Messaging.Sbmp.RedirectBindingElement.RedirectContainerChannelFactory`1.RedirectContainerSessionChannel.RequestAsyncResult.<>c__DisplayClass8_1.<GetAsyncSteps>b__4(RequestAsyncResult thisPtr, IAsyncResult r)
   at Microsoft.ServiceBus.Messaging.IteratorAsyncResult`1.StepCallback(IAsyncResult result)
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at Microsoft.ServiceBus.Common.AsyncResult.End[TAsyncResult](IAsyncResult result)
   at Microsoft.ServiceBus.Common.AsyncResult`1.End(IAsyncResult asyncResult)
   at Microsoft.ServiceBus.Messaging.Sbmp.RedirectBindingElement.RedirectContainerChannelFactory`1.RedirectContainerSessionChannel.EndRequest(IAsyncResult result)
   at Microsoft.ServiceBus.Messaging.Channels.ReconnectBindingElement.ReconnectChannelFactory`1.RequestSessionChannel.RequestAsyncResult.<>c.<GetAsyncSteps>b__9_3(RequestAsyncResult thisPtr, IAsyncResult r)
   at Microsoft.ServiceBus.Messaging.IteratorAsyncResult`1.StepCallback(IAsyncResult result)
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at Microsoft.ServiceBus.Common.AsyncResult.End[TAsyncResult](IAsyncResult result)
   at Microsoft.ServiceBus.Common.AsyncResult`1.End(IAsyncResult asyncResult)
   at Microsoft.ServiceBus.Messaging.Channels.ReconnectBindingElement.ReconnectChannelFactory`1.RequestSessionChannel.EndRequest(IAsyncResult result)
   at Microsoft.ServiceBus.Messaging.Sbmp.SbmpTransactionalAsyncResult`1.<>c.<GetAsyncSteps>b__18_3(TIteratorAsyncResult thisPtr, IAsyncResult a)
   at Microsoft.ServiceBus.Messaging.IteratorAsyncResult`1.StepCallback(IAsyncResult result)
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at Microsoft.ServiceBus.Common.AsyncResult.End[TAsyncResult](IAsyncResult result)
   at Microsoft.ServiceBus.Messaging.Sbmp.SbmpMessageReceiver.EndUpdateCommand(IAsyncResult result)
   --- End of inner exception stack trace ---
   at Microsoft.ServiceBus.Common.AsyncResult.End[TAsyncResult](IAsyncResult result)
   at Microsoft.ServiceBus.Messaging.MessageReceiver.RetryMessagesOperationAsyncResult.End(IAsyncResult r)
   at Microsoft.ServiceBus.Messaging.MessageReceiver.EndComplete(IAsyncResult result)
   at Microsoft.ServiceBus.Messaging.ReceiveContext.EndComplete(IAsyncResult result)
   at Microsoft.ServiceBus.Messaging.BrokeredMessage.EndComplete(IAsyncResult result)
   at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization)
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Rebus.AzureServiceBus.AzureServiceBusTransport.<>c__DisplayClass34_0.<<Receive>b__5>d.MoveNext()
   --- End of inner exception stack trace ---
   at Rebus.AzureServiceBus.AzureServiceBusTransport.<>c__DisplayClass34_0.<<Receive>b__5>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Rebus.Transport.TransactionContext.<Invoke>d__24.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Rebus.Transport.TransactionContext.<RaiseCompleted>d__22.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.ConfiguredTaskAwaitable.ConfiguredTaskAwaiter.GetResult()
   at Rebus.Transport.TransactionContext.<Complete>d__19.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.ConfiguredTaskAwaitable.ConfiguredTaskAwaiter.GetResult()
   at Rebus.Workers.ThreadPoolBased.ThreadPoolWorker.<ProcessMessage>d__18.MoveNext()
Rebus.Pipeline.Send.SendOutgoingMessageStep DEBUG (Thread #52): Sending Concorde.LetsEncrypt.Contracts.Commands.InstallCertificateCommand -> "letsencrypt-input"
Rebus.Pipeline.Receive.DispatchIncomingMessageStep DEBUG (Thread #52): Dispatching "Concorde.LetsEncrypt.Contracts.Commands.GeneratePfxCertificateCommand, Concorde.LetsEncrypt.Contracts" "c4ff70ce-3926-4739-9e4a-7f9d487e1cf1" to 1 handlers took 2467 ms
Rebus.Pipeline.Send.SendOutgoingMessageStep DEBUG (Thread #40): Sending Concorde.LetsEncrypt.Contracts.Commands.MapCertificateCommand -> "letsencrypt-input"
Rebus.Pipeline.Receive.DispatchIncomingMessageStep DEBUG (Thread #40): Dispatching "Concorde.LetsEncrypt.Contracts.Commands.InstallCertificateCommand, Concorde.LetsEncrypt.Contracts" "f3ffff53-ed99-4c86-9464-8147b8fcfcc1" to 1 handlers took 1838 ms
Rebus.Pipeline.Send.SendOutgoingMessageStep DEBUG (Thread #51): Sending Concorde.LetsEncrypt.Contracts.Commands.MapCertificateCommand -> "letsencrypt-input"
Rebus.Pipeline.Receive.DispatchIncomingMessageStep DEBUG (Thread #51): Dispatching "Concorde.LetsEncrypt.Contracts.Commands.InstallCertificateCommand, Concorde.LetsEncrypt.Contracts" "86b38504-aa7c-4290-b6f4-1be0e1bb5f8d" to 1 handlers took 1782 ms
Rebus.Pipeline.Send.SendOutgoingMessageStep DEBUG (Thread #9): Sending Concorde.LetsEncrypt.Contracts.Events.CertificateInstalled -> "subscription/concorde_letsencrypt_contracts_events_certificateinstalled__concorde_letsencrypt_contracts"
Rebus.Pipeline.Receive.DispatchIncomingMessageStep DEBUG (Thread #9): Dispatching "Concorde.LetsEncrypt.Contracts.Commands.MapCertificateCommand, Concorde.LetsEncrypt.Contracts" "bf4d0030-0d6c-45a4-8527-e6ddfd48868a" to 1 handlers took 1791 ms
Rebus.Pipeline.Send.SendOutgoingMessageStep DEBUG (Thread #51): Sending Concorde.LetsEncrypt.Contracts.Events.CertificateInstalled -> "subscription/concorde_letsencrypt_contracts_events_certificateinstalled__concorde_letsencrypt_contracts"
Rebus.Pipeline.Receive.DispatchIncomingMessageStep DEBUG (Thread #51): Dispatching "Concorde.LetsEncrypt.Contracts.Commands.MapCertificateCommand, Concorde.LetsEncrypt.Contracts" "472ee554-b0bb-4b77-a248-b811a40339ca" to 1 handlers took 1868 ms
mookid8000 commented 6 years ago

Damn, that's weird!

Azure Service Bus has had issues with stuff like this in the past, but it should have been resolved a long time ago.

Can I see the relevant Rebus Configure.With(...) bits?

KennethJakobsen commented 6 years ago

    _bus = Configure.With(new LightInjectContainerAdapter(_container))
        .Transport(configurer => configurer.UseAzureServiceBus(_configurationSettings.ServiceBusConnectionString, Constants.Queues.LetsEncryptInput))
        .Routing(configurer =>
        {
            configurer.TypeBased().MapAssemblyOf<AddNewHostnameCommand>(Constants.Queues.LetsEncryptInput);
        })
        .Options(configurer =>
        {
            configurer.SetNumberOfWorkers(1);
            configurer.SimpleRetryStrategy(secondLevelRetriesEnabled: false, maxDeliveryAttempts: 1);
            configurer.SetMaxParallelism(1);
        })
        .Start();

mookid8000 commented 6 years ago

Hmm, nothing suspicious about that... any chance you could try deleting the input queue and letting Rebus re-create it?
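
In case it helps, here's a sketch of how the queue could be deleted with the same SDK as above, so Rebus re-creates it with the settings it expects on the next startup (assuming queue creation hasn't been disabled in the transport options; connectionString is a placeholder):

    // Sketch: delete the input queue - Rebus will re-create it on startup
    using Microsoft.ServiceBus;

    var namespaceManager = NamespaceManager.CreateFromConnectionString(connectionString);
    if (namespaceManager.QueueExists("letsencrypt-input"))
    {
        namespaceManager.DeleteQueue("letsencrypt-input");
    }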

KennethJakobsen commented 6 years ago

@mookid8000 Apparently that solved my issue. I'll keep testing, as this service needs to be rock solid.

mookid8000 commented 6 years ago

Oh well... clearly a case of OMGWTFWHY?!?! 😫 but good to hear it solved your problem 🙂