Open karun-verghese opened 2 years ago
Those look like they're coming from this line: https://github.com/dotnet/aspnetcore/blob/c6a270f7ddcc45e4ba6884ef72ffc28a54d76e51/src/Security/Authentication/JwtBearer/src/JwtBearerHandler.cs#L95
The CancellationToken passed in is HttpContext.RequestAborted which fires if the client disconnects mid operation. If you enable debug level server logs you should see other disconnect related logs around this. Do you have clients that are prone to disconnecting?
@Tratcher thanks for the reply, I am still trying to determine this at the time of typing this message. Thanks for the tip on the debug logs, I will come back if I can find anything. Do you know if there's a way I can simulate a client that disconnects in the middle of the api call?
You can make the API call with a client like HttpClient or Curl and then cancel it.
@Tratcher thanks, I did something similar, but I had to adjust the response timeout on my client to 20ms to cancel it and see this error. I am now able to reproduce the error. So am I right in understanding that this is expected behaviour right? The cancellation token is being invoked as the client has cancelled the request?
Can you point me to any articles/resources on how to better handle this scenario? Maybe I need to configure a custom delegate for the OnAuthenticationFailed event while registering my authentication scheme?
This is expected behavior, and the exception is mainly noise in the app since the client has already disconnected. I'll talk to @blowdart and we'll see what we can do to avoid that.
Less noise is good
Thanks for contacting us.
We're moving this issue to the .NET 7 Planning
milestone for future evaluation / consideration. We would like to keep this around to collect more feedback, which can help us with prioritizing this work. We will re-evaluate this issue, during our next planning meeting(s).
If we later determine, that the issue has no community involvement, or it's very rare and low-impact issue, we will close it - so that the team can focus on more important and high impact issues.
To learn more about what to expect next and how this issue will be handled you can read more about our triage process here.
Options: A) Don't pass the RequestAborted CT to GetConfigurationAsync. Config is cached so even if the client disconnects, completing this operation will still have value. We should make sure there's another timeout though. B) IdentityModel has some changes in flight to fall back to a last-known-good configuration so validation could proceed even if this call failed.
Thanks for contacting us.
We're moving this issue to the .NET 8 Planning
milestone for future evaluation / consideration. We would like to keep this around to collect more feedback, which can help us with prioritizing this work. We will re-evaluate this issue, during our next planning meeting(s).
If we later determine, that the issue has no community involvement, or it's very rare and low-impact issue, we will close it - so that the team can focus on more important and high impact issues.
To learn more about what to expect next and how this issue will be handled you can read more about our triage process here.
This piece of code is not locked properly:
public override void RequestRefresh()
{
DateTimeOffset now = DateTimeOffset.UtcNow;
if (_isFirstRefreshRequest)
{
_syncAfter = now;
_isFirstRefreshRequest = false;
}
else if (now >= DateTimeUtil.Add(_lastRefresh.UtcDateTime, RefreshInterval))
{
_syncAfter = now;
}
}
So if the condition (now >= DateTimeUtil.Add(_lastRefresh.UtcDateTime, RefreshInterval))
is satisfied => _syncAfter = now
This reset will allow awaiting threads that are trying to invoke the function GetConfigurationAsync
to call await _configRetriever.GetConfigurationAsync
.
@chungvodim that doesn't seem related to this issue? Please open a new issue in AzureAd.
I think I see how this could cause duplicate refreshes, but does that break anything?
Hi @Tratcher , we're facing this issue
at System.Threading.CancellationToken.ThrowOperationCanceledException()
at System.Threading.SemaphoreSlim.WaitUntilCountOrTimeoutAsync(TaskNode asyncWaiter, Int32 millisecondsTimeout, CancellationToken cancellationToken)
at Microsoft.IdentityModel.Protocols.ConfigurationManager`1.GetConfigurationAsync(CancellationToken cancel)
at Microsoft.AspNetCore.Authentication.JwtBearer.JwtBearerHandler.HandleAuthenticateAsync()
and
System.OperationCanceledException: The operation was canceled.
at void System.Threading.CancellationToken.ThrowOperationCanceledException()
at async Task<bool> System.Threading.SemaphoreSlim.WaitUntilCountOrTimeoutAsync(TaskNode asyncWaiter, int millisecondsTimeout, CancellationToken cancellationToken)
at async Task<T> Microsoft.IdentityModel.Protocols.ConfigurationManager<T>.GetConfigurationAsync(CancellationToken cancel)
at async Task<AuthenticateResult> Microsoft.AspNetCore.Authentication.JwtBearer.JwtBearerHandler.HandleAuthenticateAsync() x 2
at async Task<AuthenticateResult> Microsoft.AspNetCore.Authentication.AuthenticationHandler<TOptions>.AuthenticateAsync()
at async Task<AuthenticateResult> Microsoft.AspNetCore.Authentication.AuthenticationService.AuthenticateAsync(HttpContext context, string scheme)
at async Task Microsoft.AspNetCore.Authentication.AuthenticationMiddleware.Invoke(HttpContext context)
at async Task Microsoft.AspNetCore.ResponseCompression.ResponseCompressionMiddleware.Invoke(HttpContext context)
So you're experiencing the original issue which is normally caused by a canceled client request, but how does the thread safety affect that? Worst case you'd get multiple refreshes happening in parallel, they'd all still be subject to the cancellation issue.
If there are too many awaiting requests due to the lock await _refreshLock.WaitAsync(cancel).ConfigureAwait(false);
of the function GetConfigurationAsync => System.Threading.SemaphoreSlim.WaitUntilCountOrTimeoutAsync will throw OperationCanceledException.
If the _syncAfter
variable is not reset to now
by the function RequestRefresh so it doesn't matter at all. But if _syncAfter is continuously reset to now
(due to failed authorized requests) => Every threads that accesses the _refreshLock
will invoke await _configRetriever.GetConfigurationAsync
.
Assuming each external call takes 1 second => the 60th thread must wait 60 seconds.
RequestRefresh is supposed to update _syncAfter. Any request after that gets queued waiting for new config. That's by design, though there's some plans to allow requests to continue with the last known good config.
One place I see that could be improved is here:
https://github.com/AzureAD/azure-activedirectory-identitymodel-extensions-for-dotnet/blob/9e6f90a9a8032107af6f5c851e9d257f4c6c8fdc/src/Microsoft.IdentityModel.Protocols/Configuration/ConfigurationManager.cs#L136-L143
Right now it assumes the _refreshLock is acquired quickly and now
still current. However, if the refresh takes too long and/or there's lock contention, then now
could be outdated. `now should be updated inside the lock (line 136), and again after the refresh finishes (line 143). Once the refresh completes that should help all other requests waiting on the lock to quickly exit.
The only other thing that could be changed is to protect RequestRefresh with _refreshLock, but that risks blocking threads that don't really need to wait for the refresh to happen.
now should be updated inside the lock (line 136)
<= I absolutely agree with this great idea.The only other thing that could be changed is to protect RequestRefresh with _refreshLock, but that risks blocking threads that don't really need to wait for the refresh to happen
<= Thanks for your call outCan you open an issue in https://github.com/AzureAD/azure-activedirectory-identitymodel-extensions-for-dotnet/?
Can you open an issue in https://github.com/AzureAD/azure-activedirectory-identitymodel-extensions-for-dotnet/?
Sure.
We are facing the same issue as mentioned here. Having multiple endpoints that get called simultaneously, not able yet to get an intentional reproduction. However from deployed environment able to get these logs
EROR : Microsoft.AspNetCore.Authentication.JwtBearer.JwtBearerHandler
Exception occurred while processing message.
System.OperationCanceledException: The operation was canceled.
at System.Threading.CancellationToken.ThrowOperationCanceledException()
at System.Threading.SemaphoreSlim.WaitUntilCountOrTimeoutAsync(TaskNode asyncWaiter, Int32 millisecondsTimeout, CancellationToken cancellationToken)
at Microsoft.IdentityModel.Protocols.ConfigurationManager`1.GetConfigurationAsync(CancellationToken cancel)
at Microsoft.AspNetCore.Authentication.JwtBearer.JwtBearerHandler.HandleAuthenticateAsync()
EROR : Microsoft.AspNetCore.Authentication.JwtBearer.JwtBearerHandler
Exception occurred while processing message.
System.OperationCanceledException: The operation was canceled.
at System.Threading.CancellationToken.ThrowOperationCanceledException()
at System.Threading.SemaphoreSlim.WaitUntilCountOrTimeoutAsync(TaskNode asyncWaiter, Int32 millisecondsTimeout, CancellationToken cancellationToken)
at Microsoft.IdentityModel.Protocols.ConfigurationManager`1.GetConfigurationAsync(CancellationToken cancel)
at Microsoft.AspNetCore.Authentication.JwtBearer.JwtBearerHandler.HandleAuthenticateAsync()
EROR : Microsoft.AspNetCore.Server.Kestrel
Connection id "0HMOHKGT911S1", Request id "0HMOHKGT911S1:00000004": An unhandled exception was thrown by the application.
System.OperationCanceledException: The operation was canceled.
at System.Threading.CancellationToken.ThrowOperationCanceledException()
at System.Threading.SemaphoreSlim.WaitUntilCountOrTimeoutAsync(TaskNode asyncWaiter, Int32 millisecondsTimeout, CancellationToken cancellationToken)
at Microsoft.IdentityModel.Protocols.ConfigurationManager`1.GetConfigurationAsync(CancellationToken cancel)
at Microsoft.AspNetCore.Authentication.JwtBearer.JwtBearerHandler.HandleAuthenticateAsync()
at Microsoft.AspNetCore.Authentication.JwtBearer.JwtBearerHandler.HandleAuthenticateAsync()
at Microsoft.AspNetCore.Authentication.AuthenticationHandler`1.AuthenticateAsync()
at Microsoft.AspNetCore.Authentication.AuthenticationService.AuthenticateAsync(HttpContext context, String scheme)
at Microsoft.AspNetCore.Authentication.AuthenticationMiddleware.Invoke(HttpContext context)
at Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http.HttpProtocol.ProcessRequests[TContext](IHttpApplication`1 application)
EROR : Microsoft.AspNetCore.Server.Kestrel
Connection id "0HMOHKGT911SH", Request id "0HMOHKGT911SH:00000001": An unhandled exception was thrown by the application.
System.OperationCanceledException: The operation was canceled.
at System.Threading.CancellationToken.ThrowOperationCanceledException()
at System.Threading.SemaphoreSlim.WaitUntilCountOrTimeoutAsync(TaskNode asyncWaiter, Int32 millisecondsTimeout, CancellationToken cancellationToken)
at Microsoft.IdentityModel.Protocols.ConfigurationManager`1.GetConfigurationAsync(CancellationToken cancel)
at Microsoft.AspNetCore.Authentication.JwtBearer.JwtBearerHandler.HandleAuthenticateAsync()
at Microsoft.AspNetCore.Authentication.JwtBearer.JwtBearerHandler.HandleAuthenticateAsync()
at Microsoft.AspNetCore.Authentication.AuthenticationHandler`1.AuthenticateAsync()
at Microsoft.AspNetCore.Authentication.AuthenticationService.AuthenticateAsync(HttpContext context, String scheme)
at Microsoft.AspNetCore.Authentication.AuthenticationMiddleware.Invoke(HttpContext context)
at Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http.HttpProtocol.ProcessRequests[TContext](IHttpApplication`1 application)
EROR : Microsoft.AspNetCore.Authentication.JwtBearer.JwtBearerHandler
Exception occurred while processing message.
System.OperationCanceledException: The operation was canceled.
at System.Threading.CancellationToken.ThrowOperationCanceledException()
at System.Threading.SemaphoreSlim.WaitUntilCountOrTimeoutAsync(TaskNode asyncWaiter, Int32 millisecondsTimeout, CancellationToken cancellationToken)
at Microsoft.IdentityModel.Protocols.ConfigurationManager`1.GetConfigurationAsync(CancellationToken cancel)
at Microsoft.AspNetCore.Authentication.JwtBearer.JwtBearerHandler.HandleAuthenticateAsync()
EROR : Microsoft.AspNetCore.Server.Kestrel
Connection id "0HMOHKGT911SI", Request id "0HMOHKGT911SI:00000001": An unhandled exception was thrown by the application.
System.OperationCanceledException: The operation was canceled.
at System.Threading.CancellationToken.ThrowOperationCanceledException()
at System.Threading.SemaphoreSlim.WaitUntilCountOrTimeoutAsync(TaskNode asyncWaiter, Int32 millisecondsTimeout, CancellationToken cancellationToken)
at Microsoft.IdentityModel.Protocols.ConfigurationManager`1.GetConfigurationAsync(CancellationToken cancel)
at Microsoft.AspNetCore.Authentication.JwtBearer.JwtBearerHandler.HandleAuthenticateAsync()
at Microsoft.AspNetCore.Authentication.JwtBearer.JwtBearerHandler.HandleAuthenticateAsync()
at Microsoft.AspNetCore.Authentication.AuthenticationHandler`1.AuthenticateAsync()
at Microsoft.AspNetCore.Authentication.AuthenticationService.AuthenticateAsync(HttpContext context, String scheme)
at Microsoft.AspNetCore.Authentication.AuthenticationMiddleware.Invoke(HttpContext context)
at Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http.HttpProtocol.ProcessRequests[TContext](IHttpApplication`1 application)
As far as I understand from above thread, this seems to be noise (so can be ignored).
Is there a suggested way of catching these exceptions or filtering them out? As a work-around until this issue is gone, we would like to keep our logs clean of this noise where possible.
We also seem to be affected by this.
System.Threading.Tasks.TaskCanceledException:
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.ConfiguredTaskAwaitable+ConfiguredTaskAwaiter.GetResult (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at Microsoft.IdentityModel.Protocols.ConfigurationManager`1+<GetConfigurationAsync>d__14.MoveNext (Microsoft.IdentityModel.Protocols, Version=6.25.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at Microsoft.AspNetCore.Authentication.JwtBearer.JwtBearerHandler+<HandleAuthenticateAsync>d__6.MoveNext (Microsoft.AspNetCore.Authentication.JwtBearer, Version=5.0.12.0, Culture=neutral, PublicKeyToken=adb9793829ddae60)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at Microsoft.AspNetCore.Authentication.JwtBearer.JwtBearerHandler+<HandleAuthenticateAsync>d__6.MoveNext (Microsoft.AspNetCore.Authentication.JwtBearer, Version=5.0.12.0, Culture=neutral, PublicKeyToken=adb9793829ddae60)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at Microsoft.AspNetCore.Authentication.AuthenticationHandler`1+<AuthenticateAsync>d__48.MoveNext (Microsoft.AspNetCore.Authentication, Version=6.0.0.0, Culture=neutral, PublicKeyToken=adb9793829ddae60)
I understand that it's only noise, but it triggers an http server errors alert for us in Azure. The linked issue is closed and I am not sure why. Has the issue been mitigated or is there any known workaround?
In another issue (https://github.com/dotnet/aspnetcore/issues/43220) @Tratcher mentioned mitigation is to pre-init the configuration at startup. Can you explain how do you do that please?
I mean the comment here: https://github.com/dotnet/aspnetcore/issues/43220#issuecomment-1218553342
The mitigation is to pre-init and load the ConfigurationManager during startup.
@Tratcher could you help us please?
I mean the comment here: #43220 (comment)
The mitigation is to pre-init and load the ConfigurationManager during startup.
@Tratcher could you help us please?
Could you share your configuration?
We just use the defaults in Startup.cs
:
services.AddJwtBearerDefaultAuthentication(settings.IdentityServerAudience, Environment);
In the settings it has 3 things passed: "ValidAudience", "TokenIssuer" and "IdentityServerBaseUrl".
Is that what you were looking for @chungvodim ?
AddJwtBearerDefaultAuthentication
Something likes this
@chungvodim Ah, gotcha, that was an extension method we have there, nevermind.
Found the config part:
options.Authority = settings.IdentityServerBaseUrl;
options.RequireHttpsMetadata = environment.IsProduction();
options.TokenValidationParameters = new TokenValidationParameters
{
ValidateAudience = true,
ValidAudience = settings.ValidAudience,
ValidateIssuer = true,
ValidIssuer = settings.TokenIssuer,
RequireSignedTokens = true,
ClockSkew = TimeSpan.FromMinutes(1),
ValidateLifetime = true,
ValidateIssuerSigningKey = true
};
I think we use the defaults mostly for this particular app. What should we do?
Should we initialize ConfigurationManager
in a way you have in your screenshot?
GetConfigurationAsync
If not provided, then one will be created using the MetadataAddress and Backchannel properties. So you should explicitly initialize ConfigurationManager
.
Basically, this issue could not be fixed thoroughly, you can only mitigate it by increasing RefreshInterval
and AutomaticRefreshInterval
to an appropriate number (I'm setting 5 minutes and 12 minutes accordingly).
Creating the ConfigurationManager at startup only helps with the cancellation issue unless you actually call it while starting the server. That way it's pulling from the local cache per-request until a refresh is needed.
var configManager = new ConfigurationManager....;
await configManager.GetConfigurationAsync(); // Wait for the first config download
options.ConfigurationManager = configManager;
We are experiencing this issue in production (Azure) as well on at least 2 different Dot Net 6.0 Web API's, but its not noise. 3th party developers are now contacting us, because our API's can now take up to 60 seconds to respond. I have clear evidence (profiler trace from Application Insights) pointing towards this issue but it is not on startup but randomly happens when running. We have multiple API's running and this issue is now affecting us and 3th party users of our API's.
RefreshInterval
and AutomaticRefreshInterval
mentioned by @chungvodim aswell?Here's a workaround I use and the issue has practically been resolved. Place it before app.UseAuthentication(); in your Program file and no exception is going to bother you anymore :)
class ExpectedOperationCanceledExceptionSupressionMiddleware
{
private readonly RequestDelegate next;
private readonly ILogger<ExpectedOperationCanceledExceptionSupressionMiddleware> logger;
public ExpectedOperationCanceledExceptionSupressionMiddleware(RequestDelegate next, ILogger<ExpectedOperationCanceledExceptionSupressionMiddleware> logger)
{
this.next = next;
this.logger = logger;
}
public async Task Invoke(HttpContext context)
{
try
{
await this.next(context);
}
catch (OperationCanceledException e) when (IsExpectedDueToAnUnresolvedBugInIdentityModel(e, context))
{
this.logger.LogWarning(
"Compensating for a bug in IdentityModel. Request to '{RequestPath}' has already been aborted, so nothing to worry about.",
context.Request.Path);
}
}
/// <summary>
/// https://github.com/dotnet/aspnetcore/issues/38467
/// </summary>
private static bool IsExpectedDueToAnUnresolvedBugInIdentityModel(OperationCanceledException e, HttpContext context) =>
e.CancellationToken == context.RequestAborted &&
e.StackTrace != null &&
e.StackTrace.Contains("GetConfigurationAsync");
}
Here's a workaround I use and the issue has practically been resolved. Place it before app.UseAuthentication(); in your Program file and no exception is going to bother you anymore :)
class ExpectedOperationCanceledExceptionSupressionMiddleware { private readonly RequestDelegate next; private readonly ILogger<ExpectedOperationCanceledExceptionSupressionMiddleware> logger; public ExpectedOperationCanceledExceptionSupressionMiddleware(RequestDelegate next, ILogger<ExpectedOperationCanceledExceptionSupressionMiddleware> logger) { this.next = next; this.logger = logger; } public async Task Invoke(HttpContext context) { try { await this.next(context); } catch (OperationCanceledException e) when (IsExpectedDueToAnUnresolvedBugInIdentityModel(e, context)) { this.logger.LogWarning( "Compensating for a bug in IdentityModel. Request to '{RequestPath}' has already been aborted, so nothing to worry about.", context.Request.Path); } } /// <summary> /// https://github.com/dotnet/aspnetcore/issues/38467 /// </summary> private static bool IsExpectedDueToAnUnresolvedBugInIdentityModel(OperationCanceledException e, HttpContext context) => e.CancellationToken == context.RequestAborted && e.StackTrace != null && e.StackTrace.Contains("GetConfigurationAsync"); }
Thx for sharing this, but we actually had the delays in our code execution. It is not just delayed ghost logs, but an actual issue.
Dear all,
We have a .NET Core Web API service which makes use of the Authentication middleware readily available with .NET 5.0. In our service we have registered 4 different jwt bearer authentication schemes on startup of the service, (why? - that's because we are in the process of migrating from two older identity providers to two newer ones. For example, we are migrating from Azure AD V1 auth endpoints to the Microsoft Identity Platform (V2)).
In a number of our deployments we see TaskCanceled exceptions with the below stack trace, when certain api's are called. The api's are all protected by an [Authorize] attribute which makes use of the aforementioned auth schemes that are registered at startup
Details of the service: SDK version: .NET 5.0.401 Base docker image: mcr.microsoft.com/dotnet/aspnet:5.0.10
We always seem to have 2 of these exceptions thrown simultaeneously:
And the second exception stack trace:
The troublesome thing is that I can't seem to reproduce this error intentionally. We set up suitable load tests to test the affected api's, but we don't seem to be able to reproduce the failures as part of the load tests, but the errors keep happening in the same environments. As far as I can tell we have configured the middleware correctly as for the majority of our calls we are able to authenticate and authorize correctly (we have automated tests testing various scenarios).
But in certain sporadic cases, we get these exceptions.