Open guysteel opened 4 years ago
I am experiencing a very similar error. The setup is similar to how @guysteel has set it up above. On retries, I do not see a dependency call being logged in app insights. I know the retries are happening because I see the info logs that I am writing within the retry policy handler. The packages/SDK I am using are: .Net Core SDK 3.1.402 Microsoft.ApplicationInsights.AspNetCore (2.14.0) Microsoft.Extensions.Http.Polly (3.1.8)
The retry calls will likely not be captured if the retries are transparent to the caller. ApplicationInsights collects httpcalls by subscribing to the DiagnosticSource events emitted by a DiagnosticHandler in the HttpClient. If the retry is happening without going through the full pipeline of handlers (esp DiagnosticHandler), then the retries will not be captured by ApplicationInsights.
Don't think I can suggest a workaround at this stage, as the DiagnosticHandler is part of the HttpClient itself, so you may/may not have the ability to make sure it is called when using the Polly package to do the retry.
@cijothomas I tried the same scenario but with an older version of our service. Basically, I checked out an older branch of my service and ran the same scenario. In app insights I can see the following: | timestamp [UTC] | itemType | resultCode | |9/24/2020, 8:26:48.803 AM | dependency | 500 | | 9/24/2020, 8:26:32.737 AM | dependency | 500 | | 9/24/2020, 8:26:24.641 AM | dependency | 500 | | 9/24/2020, 8:26:20.550 AM | dependency | 500 | | 9/24/2020, 8:26:18.425 AM | dependency | 500|
I know these are my retry calls, as there are no other services running in this environment at the moment, and I can see from the timestamp that the calls are due to the exponential backoff algorithm I have configured for Polly. The packages I am using in this version are :
.Net Core SDK 2.2.0 Microsoft.ApplicationInsights.AspNetCore (2.7.1) Microsoft.Extensions.Http.Polly (2.2.0)
So it appears that this behaviour was available earlier but has since changed for some reason.
My app insights config is as follows:
var aiOptions = new Microsoft.ApplicationInsights.AspNetCore.Extensions.ApplicationInsightsServiceOptions();
aiOptions.EnableAdaptiveSampling = false;
aiOptions.InstrumentationKey = Configuration.GetSection("ApplicationInsights").GetValue
My app insights config has not changed with my new version, except that now we configure the connection string instead of the instrumentation key and of course we use the newer packages which I mentioned in my earlier message.
Also, Application Insights seem to log exceptions handled by Polly
Policy.Handle<HttpRequestException>()
After exception occurs, it is logged although IMO it should be regarded as handled exception. @cijothomas Any thoughts?
Also, Application Insights seem to log exceptions handled by Polly
Policy.Handle<HttpRequestException>()
After exception occurs, it is logged although IMO it should be regarded as handled exception. @cijothomas Any thoughts?
In the ExceptionTelemetry, what is the SDKVersion field value. If it is something starting with "rdd..", then its captured by the DependencyCollectionModule, which I would consider a bug and provide fix. If it is something starting with "il..", then the exception is captured by Msft.Extensions.Logging.ApplicationInsightsLoggingProvider. This works independent of whether DependencyCollectionModule is enabled or not. By default, this loggingprovider captures all logs Warning or above. This is by design and you can configure logging to not capture this by using instructions from https://docs.microsoft.com/en-us/azure/azure-monitor/app/ilogger#control-logging-level
@cijothomas SDK Version is azurefunctions: 3.0.14785.0
(forgot to mention that it is Azure Function, sorry for that)
@pzbyszynski Ok. For functions, you need to raise this issue in https://github.com/Azure/azure-webjobs-sdk repo. Functions + ApplicationInsights integration is in that repo.
(But if you see same behaviour outside of functions, let us know right here, we can help)
@cijothomas Any update on the original question? I had posted some tests that I did in the message below:
@cijothomas I tried the same scenario but with an older version of our service. Basically, I checked out an older branch of my service and ran the same scenario. In app insights I can see the following: | timestamp [UTC] | itemType | resultCode | |9/24/2020, 8:26:48.803 AM | dependency | 500 | | 9/24/2020, 8:26:32.737 AM | dependency | 500 | | 9/24/2020, 8:26:24.641 AM | dependency | 500 | | 9/24/2020, 8:26:20.550 AM | dependency | 500 | | 9/24/2020, 8:26:18.425 AM | dependency | 500|
I know these are my retry calls, as there are no other services running in this environment at the moment, and I can see from the timestamp that the calls are due to the exponential backoff algorithm I have configured for Polly. The packages I am using in this version are :
.Net Core SDK 2.2.0 Microsoft.ApplicationInsights.AspNetCore (2.7.1) Microsoft.Extensions.Http.Polly (2.2.0)
So it appears that this behaviour was available earlier but has since changed for some reason.
My app insights config is as follows: var aiOptions = new Microsoft.ApplicationInsights.AspNetCore.Extensions.ApplicationInsightsServiceOptions(); aiOptions.EnableAdaptiveSampling = false; aiOptions.InstrumentationKey = Configuration.GetSection("ApplicationInsights").GetValue("InstrumentationKey"); services.AddApplicationInsightsTelemetry(aiOptions);
My app insights config has not changed with my new version, except that now we configure the connection string instead of the instrumentation key and of course we use the newer packages which I mentioned in my earlier message.
@karun-verghese Is the only thing different is the SDK version,? need to double confirm this, as you said "checked out older branch of your service". Please share a minimal repro showing this behavior. Will investigate.
(From what I see, in .Net core, we always relied on DiagnosticSource callbacks to collect http dependency. In very old versions esp. .NET Framework, there were alternate options to collect http calls)
@cijothomas @guysteel I have invited you to be collaborators on a small repo that I set up to investigate this issue. The repo is simple: It contains a very basic .Net Core Web service with couple of healthcheck operations. On the 'main' branch, the setup is as following: .Net Core SDK (3.1.403) Microsoft.ApplicationInsights.AspNetCore (2.15.0) Microsoft.Extensions.Http.Polly (3.1.9)
The configuration for Polly and App insights can be found in the Startup class and is exactly the same as we have in our actual services.
On the branch 'OldSDK', I performed the same tests with the following setup: .Net Core SDK (2.2.208) Microsoft.ApplicationInsights.AspNetCore (2.7.1) Microsoft.Extensions.Http.Polly (2.2.0)
The tests performed are as follows: I have setup an 'isOperable' endpoint such that when I locally host it using kestrel, I can call the endpoint http://localhost:5001/isOperable
The isOperable endpoint intentionally calls a non-existing 'dependent api' which is a variation of some publicly available dummy test api's that I found in the following git repo: https://github.com/public-apis/public-apis#animals. Thus, the expected status code is a 404(Not Found)
The test is very simple and has just 3 steps: 1) Launch the service 2) Call the isOperable endpoint 3) Check the logs in app insights
When I run the test for the latest SDK and packages I see the following in my logs: | timestamp [UTC] | message | data | resultCode | itemType |
---|---|---|---|---|---|
10/29/2020, 7:53:17.863 AM | - | https://cat-fact.herokuapp.com/factsWrongUrl | 404 | dependency | |
10/29/2020, 7:53:19.447 AM | Delaying for 2.096 seconds and retrying. Retry attempt :1. Request uri : https://cat-fact.herokuapp.com/factsWrongUrl | - | - | trace | |
10/29/2020, 7:53:22.807 AM | Delaying for 4.084 seconds and retrying. Retry attempt 2 . Request uri : https://cat-fact.herokuapp.com/factsWrongUrl | - | - | trace | |
10/29/2020, 7:53:28.127 AM | Delaying for 8.08 seconds and retrying. Retry attempt :3. Request uri : https://cat-fact.herokuapp.com/factsWrongUrl | - | - | trace |
When I run the test for the Old 2.2 SDK with the old Polly and App Insights packages I see the following: | timestamp [UTC] | message | data | resultCode | itemType |
---|---|---|---|---|---|
10/29/2020, 9:09:32.865 AM | - | https://cat-fact.herokuapp.com/factsWrongUrl | 404 | dependency | |
10/29/2020, 9:09:34.313 AM | Delaying for 2.059 seconds and retrying. Retry attempt :1. Request uri : https://cat-fact.herokuapp.com/factsWrongUrl | - | - | trace | |
10/29/2020, 9:09:36.381 AM | - | https://cat-fact.herokuapp.com/factsWrongUrl | 404 | dependency | |
10/29/2020, 9:09:37.616 AM | Delaying for 4.02 seconds and retrying. Retry attempt 2 . Request uri : https://cat-fact.herokuapp.com/factsWrongUrl | - | - | trace | |
10/29/2020, 9:09:41.636 AM | - | https://cat-fact.herokuapp.com/factsWrongUrl | 404 | dependency | |
10/29/2020, 9:09:42.981 AM | Delaying for 8.096 seconds and retrying. Retry attempt :3. Request uri : https://cat-fact.herokuapp.com/factsWrongUrl | - | - | trace | |
10/29/2020, 9:09:51.089 AM | - | https://cat-fact.herokuapp.com/factsWrongUrl | 404 | dependency |
The app insights instance I am pointing to is a private instance I created.
As you can see, with a really stripped down service with nothing except the app insights and polly configuration, the behaviour is the same as reported earlier. With the old SDK I see the dependencies being logged for the retries. But with the latest .Net Core SDK that functionality has stopped working. The difference between the two branches is now really only the differences in SDK, packages and small code changes which you can easily check for yourself
@cijothomas I also gave this a try with the new .Net 5 SDK and packages. It's the same behaviour as described above. The configuration is as follows: .Net Core SDK (5.0.100) Microsoft.ApplicationInsights.AspNetCore (2.16.0) Microsoft.Extensions.Http.Polly (5.0.0) Even now, the dependencies due to retries are not logged.
There is no special handling in .NET5 which I am aware of, which would help with this issue.
I came across the same (or very similar?) issue. Running in Azure Functions v3 (netcore 3.1)
public override void Configure(IFunctionsHostBuilder builder)
{
builder.Services
.AddHttpClient<HttpClient>("HttpClient", client => client.Timeout = TimeSpan.FromSeconds(120)) // default overall request request timeout (includes all polly retries)
.AddPolicyHandler(GetRetryWithTimeoutPolicy());
}
static IAsyncPolicy<HttpResponseMessage> GetRetryWithTimeoutPolicy()
{
var retryPolicy = HttpPolicyExtensions
.HandleTransientHttpError()
.Or<TimeoutRejectedException>()
.WaitAndRetryAsync(3, retryAttempt => TimeSpan.FromSeconds(Math.Pow(2, retryAttempt)))
.WrapAsync(Policy.TimeoutAsync(30)); // per request timeout
return retryPolicy;
}
Some of the calls to an external API fail (usually time out after the 30 sec.). They are correctly retried by Polly. But AppInsights does not log another dependency call: You can actually see it in the gap between the failed call and the next one. Based on the trace logs I can see that the call is executed again and succeeds.
I would expect that the retry call shows up as another dependency.
The SDK version on the dependency telemetry is recorded as rdddsc:2.14.0-17971
Any updates? Maybe someone found a temporary solution for this issue?
Dear all,
It's been a long time since we've had any update on this issue? Has any investigation been done on this? We even provided a minimal repo (as asked) for you guys to investigate.
Looking forward to hearing from you guys. Thanks.
We experience the original issue described with the following versions: .Net Core SDK (5.0.202) Microsoft.ApplicationInsights.AspNetCore (2.17.0) Microsoft.Extensions.Http.Polly (3.1.7)
We experience the same behaviour: Microsoft.ApplicationInsights.AspNetCore (2.17.0) Microsoft.Extensions.Http.Polly (5.0.1)
It is disappointing to read through this thread, see the complete reproducible example and full detail provided by @karun-verghese and then the complete silence afterwards.
We experience the same behavior with Polly version 3.1.13 and Azure functions 3.0.13. Its frustrating that there is exist no workaround for this.
https://github.com/microsoft/ApplicationInsights-dotnet/issues/2556 same/duplicate issue.
The retry calls will likely not be captured if the retries are transparent to the caller. ApplicationInsights collects httpcalls by subscribing to the DiagnosticSource events emitted by a DiagnosticHandler in the HttpClient. If the retry is happening without going through the full pipeline of handlers (esp DiagnosticHandler), then the retries will not be captured by ApplicationInsights.
Don't think I can suggest a workaround at this stage, as the DiagnosticHandler is part of the HttpClient itself, so you may/may not have the ability to make sure it is called when using the Polly package to do the retry.
@cijothomas Should this be be a runtime bug then? Perhaps they can reorder the DiagnosticHandler? This issue combined with #2556 makes DependencyTelemetry pretty useless when there's retry involved.
@devlie No. I do not see an issue with Runtime. #2556 shows it is an application insights sdk behavior.
(Separate Note: OpenTelemetry is fixing it.)
We experience the same behavior: Microsoft.ApplicationInsights (2.21.0) Microsoft.Extensions.Http.Polly (6.0.11) Microsoft.NET.Sdk.Functions (4.1.1) Microsoft.NETCore.App (6.0.11)
Any new on resolving it?
Is there anyone who has found a workaround on this? Wrapping retries somehow and manually generating dependency telemetry?
Is there anyone who has found a workaround on this? Wrapping retries somehow and manually generating dependency telemetry?
We do it using DelegatingHandler
s (Add
sorting is important):
services.AddTransient<ApplicationInsightsPayloadTracerHandler>();
services
.AddHttpClient(..)
.AddPolicyHandler(..)
.AddHttpMessageHandler<ApplicationInsightsPayloadTracerHandler>();
The custom DelegatingHandler
to log requests and responses.
public class ApplicationInsightsPayloadTracerHandler : DelegatingHandler
{
private readonly IServiceProvider _serviceProvider;
private readonly ILogger<ApplicationInsightsPayloadTracerHandler> _logger;
public ApplicationInsightsPayloadTracerHandler(...) { ... }
protected override async Task<HttpResponseMessage> SendAsync(
HttpRequestMessage request,
CancellationToken cancellationToken)
{
HttpResponseMessage response = null;
var requestTimestamp = DateTime.UtcNow;
try
{
response = await base.SendAsync(request, cancellationToken);
}
finally
{
await TraceDependencyAsync(request, response, requestTimestamp);
}
return response;
}
private async Task TraceDependencyAsync(
HttpRequestMessage request,
HttpResponseMessage response,
DateTimeOffset traceTimestamp)
{
string traceCorrelationId = ActivityTraceHelper.BuildTraceCorrelationId();
await TraceRequestAsync(request, traceCorrelationId, traceTimestamp);
await TraceResponseAsync(request, response, traceCorrelationId);
}
private async Task TraceRequestAsync(
HttpRequestMessage request,
string traceCorrelationId,
DateTimeOffset traceTimestamp)
{
try
{
using var requestScope = _serviceProvider.CreateScope();
var telemetryClient = requestScope.ServiceProvider.GetRequiredService<TelemetryClient>();
...
var requestTelemetry = new TraceTelemetry
{
SeverityLevel = SeverityLevel.Verbose,
Timestamp = traceTimestamp,
};
requestTelemetry.Properties.Add("TraceCorrelationId", traceCorrelationId);
// log headers, status code, etc.
telemetryClient.TrackTrace(requestTelemetry);
}
catch (Exception ex) { }
}
private async Task TraceResponseAsync(...)
{
// very likely as above
}
}
Note: To get request headers, use request.Headers.ToString().Split(Environment.NewLine) because
ToString` method is internally overridden with a special implementation.
With this setup if the first attemp fails, Polly triggers the retry that will end up in this handler again, so the request will be logged twice.
Describe your environment. Describe any aspect of your environment relevant to the problem:
Since upgrading from v2.8.2, when using Polly to retry a failing Http request automatically only the first failing call to the dependency is being logged. Subsequent dependency calls are no longer appear to be logged.
Steps to reproduce. In ConfigureServices - configure app insights and setup a named http client:
In Configure add a failing call to demonstrate the behaviour:
What is the expected behavior? I would expect each call to the failing endpoint to be logged as a separate dependency call.
What is the actual behavior? I see only a single dependency call logged - the initial call. In the previous version of App Insights all of the calls were being logged.
Additional context. The above uses Polly 3.1.3