Azure / azure-functions-host

The host/runtime that powers Azure Functions
https://functions.azure.com
MIT License
1.93k stars 440 forks source link

Azure application Insights profiler on Azure functions - Profiling question #6453

Open harisubramaniam1 opened 4 years ago

harisubramaniam1 commented 4 years ago

Is your question related to a specific version? If so, please specify: Azure function V2

What language does your question apply to? (e.g. C#, JavaScript, Java, All) .Net core 2.1, C

Question

Is azure application insights profiler supported on Azure functions hosting a .net core app and the azure function hosted on an Isolated app service plan? If it is supported, is it supported for functions with all kinds of triggers?

v-anvari commented 3 years ago

@safihamid / @fabiocav , Kindly share more inputs for this issue, is Application insights profiler supported on apps hosted on isolated asp

safihamid commented 3 years ago

It should work fine for non consumption function apps. @pharring can you please confirm if anything extra is needed?

pharring commented 3 years ago

Application Insights Profiler is not supported in Azure Functions. At least not officially.

However, there is not much preventing it from working:

As far as I know, it'll work for any kind of trigger. The profiler looks for Application Insights requests. So if you can see Requests in Application Insights corresponding to your functions, the profiler should see them too.

pb-adrianbezzina commented 3 years ago

I would like to know if it would work in function app hosted on AKS:

https://docs.microsoft.com/en-us/azure/azure-functions/functions-kubernetes-keda

Here is some documentation that walks through a ASP.NET core application setup in linux:

https://docs.microsoft.com/en-us/azure/azure-monitor/app/profiler-aspnetcore-linux

I could not get the above to work in a AKS hosted function app (ie. Function App in a container)

I get the following error during startup:

System.InvalidOperationException: Unable to resolve service for type 'Microsoft.Azure.WebJobs.Script.IFileLoggingStatusManager' while attempting to activate 'Microsoft.Azure.WebJobs.Script.Diagnostics.HostFileLoggerProvider'.
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.CreateArgumentCallSites(Type serviceType, Type implementationType, CallSiteChain callSiteChain, ParameterInfo[] parameters, Boolean throwIfCallSiteNotFound)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.CreateConstructorCallSite(ResultCache lifetime, Type serviceType, Type implementationType, CallSiteChain callSiteChain)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.TryCreateExact(ServiceDescriptor descriptor, Type serviceType, CallSiteChain callSiteChain, Int32 slot)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.TryCreateEnumerable(Type serviceType, CallSiteChain callSiteChain)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.CreateCallSite(Type serviceType, CallSiteChain callSiteChain)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.<>c__DisplayClass7_0.<GetCallSite>b__0(Type type)
   at System.Collections.Concurrent.ConcurrentDictionary`2.GetOrAdd(TKey key, Func`2 valueFactory)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.GetCallSite(Type serviceType, CallSiteChain callSiteChain)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.CreateArgumentCallSites(Type serviceType, Type implementationType, CallSiteChain callSiteChain, ParameterInfo[] parameters, Boolean throwIfCallSiteNotFound)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.CreateConstructorCallSite(ResultCache lifetime, Type serviceType, Type implementationType, CallSiteChain callSiteChain)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.TryCreateExact(ServiceDescriptor descriptor, Type serviceType, CallSiteChain callSiteChain, Int32 slot)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.TryCreateExact(Type serviceType, CallSiteChain callSiteChain)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.CreateCallSite(Type serviceType, CallSiteChain callSiteChain)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.<>c__DisplayClass7_0.<GetCallSite>b__0(Type type)
   at System.Collections.Concurrent.ConcurrentDictionary`2.GetOrAdd(TKey key, Func`2 valueFactory)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.GetCallSite(Type serviceType, CallSiteChain callSiteChain)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.CreateArgumentCallSites(Type serviceType, Type implementationType, CallSiteChain callSiteChain, ParameterInfo[] parameters, Boolean throwIfCallSiteNotFound)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.CreateConstructorCallSite(ResultCache lifetime, Type serviceType, Type implementationType, CallSiteChain callSiteChain)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.TryCreateOpenGeneric(ServiceDescriptor descriptor, Type serviceType, CallSiteChain callSiteChain, Int32 slot)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.TryCreateOpenGeneric(Type serviceType, CallSiteChain callSiteChain)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.CreateCallSite(Type serviceType, CallSiteChain callSiteChain)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.<>c__DisplayClass7_0.<GetCallSite>b__0(Type type)
   at System.Collections.Concurrent.ConcurrentDictionary`2.GetOrAdd(TKey key, Func`2 valueFactory)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.GetCallSite(Type serviceType, CallSiteChain callSiteChain)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.ServiceProviderEngine.CreateServiceAccessor(Type serviceType)
   at System.Collections.Concurrent.ConcurrentDictionary`2.GetOrAdd(TKey key, Func`2 valueFactory)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.ServiceProviderEngine.GetService(Type serviceType, ServiceProviderEngineScope serviceProviderEngineScope)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.ServiceProviderEngine.GetService(Type serviceType)
   at Microsoft.Extensions.DependencyInjection.ServiceProvider.GetService(Type serviceType)
   at Microsoft.Extensions.DependencyInjection.ServiceProviderServiceExtensions.GetService[T](IServiceProvider provider)
   at Microsoft.Extensions.DependencyInjection.ServiceProfilerExtensions.BuildServiceBase(IServiceCollection serviceCollection, Action`1 applyOptions, ILogger`1& logger, IConfiguration configuration)
   at Microsoft.Extensions.DependencyInjection.ServiceProfilerExtensions.AddServiceProfilerImp(IServiceCollection serviceCollection, Action`1 context, IConfiguration configuration, IServiceCollectionBuilder serviceCollectionBuilder, ILogger`1 logger)
GABRIELNGBTUC commented 9 months ago

I would like to revive this issue because if it is indeed working like what the docs are saying(https://learn.microsoft.com/en-us/azure/azure-monitor/profiler/profiler-azure-functions), there is something wrong with it.

Here is what I have tried so far. All attempts had the function app configuration set correctly:

Function app in-process (Linux + Windows)

The profiler did not work

Function app isolated (Linux)

The profiler did not work

Function app isolated (Linux) with AddServiceProfiler()

I can see the profiler is "present" but it never starts when a profiling session is launched

Function app isolated (Windows)

The profiler starts when the profiling session is launched. No profiler traces are generated

Function app isolated (Windows) with AddServiceProfiler()

The profiler starts when the profiling session is launched. No profiler traces are generated

On the Windows plan, I can see in kudu that the web job ApplicationInsightsProfiler3 is running but profiler traces are still not generated.

EDIT:

According to the WebJob logs, the profiler crashes 1-2 minutes into each profiling session with the following error:

[01/12/2024 08:25:44 > cd1a54: INFO] 01-12 08:25:44 Information: Stopping circular monitor and collecting etl snapshot.
[01/12/2024 08:25:48 > cd1a54: INFO] 01-12 08:25:48 Error: StopProfiler failed. Details: System.Runtime.InteropServices.COMException (0xE111004E): Merging of ETL files has failed (0x80070006).
[01/12/2024 08:25:48 > cd1a54: INFO]    at DiagnosticsHub.StandardCollector.Host.Interop.ICollectionSession.Stop()
[01/12/2024 08:25:48 > cd1a54: INFO]    at Microsoft.ServiceProfiler.Collectors.StandardCollectorBase.<StopAndCollectAsync>b__10_0()
[01/12/2024 08:25:48 > cd1a54: INFO]    at System.Threading.Tasks.Task`1.InnerInvoke()
[01/12/2024 08:25:48 > cd1a54: INFO]    at System.Threading.Tasks.Task.Execute()
[01/12/2024 08:25:48 > cd1a54: INFO] --- End of stack trace from previous location where exception was thrown ---
[01/12/2024 08:25:48 > cd1a54: INFO]    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
[01/12/2024 08:25:48 > cd1a54: INFO]    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
[01/12/2024 08:25:48 > cd1a54: INFO]    at Microsoft.ServiceProfiler.Collectors.DetailedTraceCollector.<StopCollectingAndSaveTrace>d__41.MoveNext()
[01/12/2024 08:25:48 > cd1a54: INFO] --- End of stack trace from previous location where exception was thrown ---
[01/12/2024 08:25:48 > cd1a54: INFO]    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
[01/12/2024 08:25:48 > cd1a54: INFO]    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
[01/12/2024 08:25:48 > cd1a54: INFO]    at Microsoft.ServiceProfiler.Collectors.DetailedTraceCollector.<StopCollectingAndUploadTraceAsync>d__39.MoveNext()
[01/12/2024 08:25:48 > cd1a54: INFO] --- End of stack trace from previous location where exception was thrown ---
[01/12/2024 08:25:48 > cd1a54: INFO]    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
[01/12/2024 08:25:48 > cd1a54: INFO]    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
[01/12/2024 08:25:48 > cd1a54: INFO]    at Microsoft.ServiceProfiler.Engine.<>c__DisplayClass1_1.<<RunAsync>b__2>d.MoveNext()
[01/12/2024 08:25:48 > cd1a54: INFO] --- End of stack trace from previous location where exception was thrown ---
[01/12/2024 08:25:48 > cd1a54: INFO]    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
[01/12/2024 08:25:48 > cd1a54: INFO]    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
[01/12/2024 08:25:48 > cd1a54: INFO]    at Microsoft.ServiceProfiler.Agent.Orchestration.OrchestratorAgent.<TraceAsyncOperation>d__18.MoveNext

So either the Azure Monitor documentation needs to be changed to remove Azure Function from the supported platforms of the profiler or something in this repository broke support.

pharring commented 8 months ago

Right now, I believe the documentation is correct for Azure Functions. The pre-requisites state, accurately, that an App Service plan is required. This solution relies on the same continuous WebJob agent (ApplicationInsightsProfiler3) that is used in Web Apps. It works for Windows only (not Linux). You shouldn't reference the any of the Profiler NuGets (Microsoft.ApplicationInsights.Profiler.*), nor should you call "AddServiceProfiler()". AFAIK, it doesn't matter whether the application is in-process or isolated.

Regarding the "Merging of ETL files has failed" error: 0x80070006 (invalid handle). This is an internal error reported from the profiler service (StandardCollectorService) which is responsible for coordinating and multiplexing profiler sessions in the App Service sandbox. I believe this error is often raised when ETW sessions are exhausted. IIRC, there's a limit of 64 concurrent ETW sessions on Windows servers and it's quite easy to hit that limit with multiple apps running in the same service plan all trying to profile at the same time. Unfortunately, there's nothing much you can do about it from within the sandbox, but a common troubleshooting step is to try the same thing on a newly created App Service Plan running just one application. Sometimes, just scaling the App Service Plan up and back down is enough to move onto a "clean" host VM where there are free ETW sessions. Hope that helps.

@pb-adrianbezzina, for AKS applications, I believe the documentation at https://learn.microsoft.com/azure/azure-monitor/profiler/profiler-containers is appropriate.

GABRIELNGBTUC commented 8 months ago

Right now, I believe the documentation is correct for Azure Functions. The pre-requisites state, accurately, that an App Service plan is required. This solution relies on the same continuous WebJob agent (ApplicationInsightsProfiler3) that is used in Web Apps. It works for Windows only (not Linux). You shouldn't reference the any of the Profiler NuGets (Microsoft.ApplicationInsights.Profiler.*), nor should you call "AddServiceProfiler()". AFAIK, it doesn't matter whether the application is in-process or isolated.

Regarding the "Merging of ETL files has failed" error: 0x80070006 (invalid handle). This is an internal error reported from the profiler service (StandardCollectorService) which is responsible for coordinating and multiplexing profiler sessions in the App Service sandbox. I believe this error is often raised when ETW sessions are exhausted. IIRC, there's a limit of 64 concurrent ETW sessions on Windows servers and it's quite easy to hit that limit with multiple apps running in the same service plan all trying to profile at the same time. Unfortunately, there's nothing much you can do about it from within the sandbox, but a common troubleshooting step is to try the same thing on a newly created App Service Plan running just one application. Sometimes, just scaling the App Service Plan up and back down is enough to move onto a "clean" host VM where there are free ETW sessions. Hope that helps.

@pb-adrianbezzina, for AKS applications, I believe the documentation at https://learn.microsoft.com/azure/azure-monitor/profiler/profiler-containers is appropriate.

Unfortunately, this does not seem to change anything.

After many tries, the profiler webjob keep outputting the same error whenever the profiling session is started.

I still will insist that the documentation is wrong or at least not tested since their last update because:

The setup tested is also very basic:

A quick test would have shown the profiler team that their implementation do not work.

buzzaII commented 1 month ago

Right now, I believe the documentation is correct for Azure Functions. The pre-requisites state, accurately, that an App Service plan is required. This solution relies on the same continuous WebJob agent (ApplicationInsightsProfiler3) that is used in Web Apps. It works for Windows only (not Linux). You shouldn't reference the any of the Profiler NuGets (Microsoft.ApplicationInsights.Profiler.*), nor should you call "AddServiceProfiler()". AFAIK, it doesn't matter whether the application is in-process or isolated.

Regarding the "Merging of ETL files has failed" error: 0x80070006 (invalid handle). This is an internal error reported from the profiler service (StandardCollectorService) which is responsible for coordinating and multiplexing profiler sessions in the App Service sandbox. I believe this error is often raised when ETW sessions are exhausted. IIRC, there's a limit of 64 concurrent ETW sessions on Windows servers and it's quite easy to hit that limit with multiple apps running in the same service plan all trying to profile at the same time. Unfortunately, there's nothing much you can do about it from within the sandbox, but a common troubleshooting step is to try the same thing on a newly created App Service Plan running just one application. Sometimes, just scaling the App Service Plan up and back down is enough to move onto a "clean" host VM where there are free ETW sessions. Hope that helps.

@pb-adrianbezzina, for AKS applications, I believe the documentation at https://learn.microsoft.com/azure/azure-monitor/profiler/profiler-containers is appropriate.

Unfortunately this doesn't work with the following exception in startup - for Function Apps hosted in AKS:

[2024-08-23T04:58:07.418Z] A host error has occurred during startup operation '49f9f68b-bf12-492d-8f17-680efbfaceda'. [2024-08-23T04:58:07.418Z] func: Invalid host services. Microsoft.Azure.WebJobs.Script.WebHost: The following service registrations did not match the expected services: [2024-08-23T04:58:07.418Z] [Invalid] ServiceType: Microsoft.Extensions.Hosting.IHostedService, Lifetime: Singleton, ImplementationType: ServiceProfiler.EventPipe.Logging.TelemetryTrackerBackgroundService, Microsoft.ApplicationInsights.Profiler.Logging, Version=2.6.2408.1601, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a [2024-08-23T04:58:07.418Z] [Invalid] ServiceType: Microsoft.Extensions.Hosting.IHostedService, Lifetime: Singleton, ImplementationType: Microsoft.ApplicationInsights.Profiler.AspNetCore.ServiceProfilerBackgroundService, Microsoft.ApplicationInsights.Profiler.AspNetCore, Version=2.6.2408.1601, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a [2024-08-23T04:58:07.418Z] [Invalid] ServiceType: Microsoft.Extensions.Hosting.IHostedService, Lifetime: Singleton, ImplementationType: Microsoft.ApplicationInsights.Profiler.Core.TraceScavenger.TraceScavengerService, Microsoft.ApplicationInsights.Profiler.Core, Version=2.6.2408.1601, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a. Value cannot be null. (Parameter 'provider') [2024-08-23T04:58:07.460Z] Host startup operation has been canceled [2024-08-23T04:58:07.460Z] Initialization cancellation requested by runtime. [2024-08-23T04:58:07.462Z] Stopping host... [2024-08-23T04:58:07.462Z] Host shutdown completed.