jimduff-usds opened this issue 2 years ago
Occurred at 10:39:21 on 11/20 with this error immediately prior:
Singleton lock renewal failed for blob 'pdhprod-functionapp-candidate/Host.Functions.emailScheduleEngine.Listener' with error code 403: AuthorizationFailure. The last successful renewal completed at 2021-11-20T15:38:51.507Z (30251 milliseconds ago) with a duration of 6 milliseconds. The lease period was 60000 milliseconds.
And on 11/21 with this error immediately prior:
Singleton lock renewal failed for blob 'pdhprod-functionapp-candidate/Host.Functions.scheduled-remote-connection-check.Listener' with error code 403: AuthorizationFailure. The last successful renewal completed at 2021-11-21T09:59:53.055Z (30252 milliseconds ago) with a duration of 5 milliseconds. The lease period was 60000 milliseconds.
We continue to see this exception on a regular basis - slightly more than once per day. Not sure I see a pattern.
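For what it's worth, one way to look for a pattern is to pull the "last successful renewal" timestamps out of these log lines and bucket the failures by hour of day. A rough sketch, assuming the messages keep the exact format shown above (the sample lines here are abbreviated placeholders):

```python
import re
from collections import Counter
from datetime import datetime

# Hypothetical samples: paste real "Singleton lock renewal failed" lines here.
log_lines = [
    "... error code 403: AuthorizationFailure. The last successful renewal "
    "completed at 2021-11-20T15:38:51.507Z (30251 milliseconds ago) ...",
    "... error code 403: AuthorizationFailure. The last successful renewal "
    "completed at 2021-11-21T09:59:53.055Z (30252 milliseconds ago) ...",
]

# Extract the ISO-8601 timestamp of the last successful renewal.
ts_pattern = re.compile(r"completed at (\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+)Z")

hours = Counter()
for line in log_lines:
    m = ts_pattern.search(line)
    if m:
        ts = datetime.fromisoformat(m.group(1))
        hours[ts.hour] += 1  # bucket by UTC hour of day

for hour, count in sorted(hours.items()):
    print(f"{hour:02d}:00 UTC  {count} failure(s)")
```

If the overnight-glitchiness hunch is right, the counts should cluster in a few UTC hours rather than spreading evenly.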
This request is not authorized to perform this operation.
Azure.RequestFailedException:
at Azure.Storage.Blobs.BlobRestClient+<RenewLeaseAsync>d__42.MoveNext (Azure.Storage.Blobs, Version=12.9.0.0, Culture=neutral, PublicKeyToken=92742159e12e44c8)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.ConfiguredTaskAwaitable`1+ConfiguredTaskAwaiter.GetResult (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at Azure.Storage.Blobs.Specialized.BlobLeaseClient+<RenewInternal>d__29.MoveNext (Azure.Storage.Blobs, Version=12.9.0.0, Culture=neutral, PublicKeyToken=92742159e12e44c8)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.ConfiguredTaskAwaitable`1+ConfiguredTaskAwaiter.GetResult (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at Azure.Storage.Blobs.Specialized.BlobLeaseClient+<RenewAsync>d__28.MoveNext (Azure.Storage.Blobs, Version=12.9.0.0, Culture=neutral, PublicKeyToken=92742159e12e44c8)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at Microsoft.Azure.WebJobs.Script.BlobLeaseDistributedLockManager+SingletonLockHandle+<RenewAsync>d__17.MoveNext (Microsoft.Azure.WebJobs.Script, Version=3.3.0.0, Culture=neutral, PublicKeyToken=null: /src/azure-functions-host/src/WebJobs.Script/Host/DistributedLockManagers/BlobLeaseDistributedLockManager.cs:334)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at Microsoft.Azure.WebJobs.Host.SingletonManager+RenewLeaseCommand+<ExecuteAsync>d__4.MoveNext (Microsoft.Azure.WebJobs.Host, Version=3.0.31.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35: C:\projects\azure-webjobs-sdk-rqm4t\src\Microsoft.Azure.WebJobs.Host\Singleton\SingletonManager.cs:337)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at Microsoft.Azure.WebJobs.Host.Timers.TaskSeriesTimer+<RunAsync>d__14.MoveNext (Microsoft.Azure.WebJobs.Host, Version=3.0.31.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35: C:\projects\azure-webjobs-sdk-rqm4t\src\Microsoft.Azure.WebJobs.Host\Timers\TaskSeriesTimer.cs:147)
More examples:
Here's something interesting: Staging got the exact same StorageException as Prod at 10:28pm on 12/31. Prod exception was at 10:28:30 and Staging was at 10:28:43. Coincidence?
The timings of the other 'not authorized' exceptions on 12/31 and 1/1 did not line up that way.
@jimduff-usds it doesn't look like the Azure client libraries retry on 401/403 errors. Do you think that's something we could build into our code to deal with such hiccups?
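Not a verdict on whether it belongs in our code, but a minimal retry-with-backoff sketch of what that wrapper could look like. Everything here is a placeholder, not the real SDK surface: `TransientAuthError` stands in for the 403 `AuthorizationFailure` we see in the logs, and `flaky_renewal` simulates a call that fails twice before succeeding.

```python
import time

class TransientAuthError(Exception):
    """Placeholder for the 403 AuthorizationFailure seen in the logs."""

def call_with_retry(operation, attempts=3, base_delay=1.0):
    """Retry `operation` on transient auth failures with exponential backoff.

    Since the client libraries don't retry 401/403 on their own, this gives
    a one-off failure a couple more chances before surfacing the error.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except TransientAuthError:
            if attempt == attempts - 1:
                raise  # out of retries; let the caller see the failure
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...

# Usage sketch: a call that fails twice, then succeeds.
calls = {"n": 0}
def flaky_renewal():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientAuthError("403: AuthorizationFailure")
    return "renewed"

print(call_with_retry(flaky_renewal, base_delay=0.01))  # → renewed
```

One caveat: blind retries on 403 can mask a real permissions regression, so capping attempts low and logging every retry seems prudent.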
This exception occurred at 1:45am ET on Thursday, Nov 18: Azure.RequestFailedException: at Azure.Storage.Blobs.BlobRestClient+&lt;RenewLeaseAsync&gt;d__42.MoveNext
It does not appear to have originated in our code. I see no tasks or actions occurring around that time in the database, nor any data failures.
This does seem to fit with a pattern of glitchiness occurring in the middle of the night recently. Various forms of this same exception have occurred dozens of times in the last month, sometimes with Microsoft.WindowsAzure.Storage.StorageException and other times with Azure.RequestFailedException.
Regarding this specific occurrence: There was no activity in the Azure pdhprod-functionapp 'Activity Log' at that time. Immediately prior to the exception, this trace message appears in the logs:
Singleton lock renewal failed for blob 'pdhprod-functionapp-candidate/Host.Functions.scheduled-remote-connection-check.Listener' with error code 403: AuthorizationFailure. The last successful renewal completed at 2021-11-18T06:44:57.055Z (30204 milliseconds ago) with a duration of 8 milliseconds. The lease period was 60000 milliseconds.
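The intervals in these messages are at least internally consistent: the ~30-second gaps suggest the host renews at roughly half the 60000 ms lease period, so a single failed renewal still leaves about half the lease to retry before the lock would actually be lost. That's an inference from the logged numbers, not confirmed host behavior:

```python
lease_period_ms = 60_000        # from the log message
since_last_renewal_ms = 30_204  # from the 11/18 occurrence

# Margin left on the lease at the moment this renewal attempt failed.
margin_ms = lease_period_ms - since_last_renewal_ms
print(f"~{margin_ms / 1000:.1f}s of lease left before the lock would expire")
```

So an isolated 403 like this one shouldn't cost us the singleton lock unless the failures repeat back-to-back.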
From the Exception's call stack: