IoT Edge Logging and Monitoring Solution (ELMS) is an architecture and sample cloud workflow that enables automated retrieval of logs and metrics from IoT Edge devices
MIT License
Collect Metrics function stops working from time to time #3
This issue is for a: (mark with an x)
- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)
Minimal steps to reproduce
Deploy the monitoring architecture. The Metrics Collector module sends metrics as D2C messages, which are routed to Event Hub. The Collect Metrics function is then triggered and sends the data to the Log Analytics workspace.
From time to time, the Collect Metrics function fails to finish executing. It keeps receiving events but never completes processing them, so the function appears to be stuck. The problem is not visible in the Azure Portal: under Function -> Monitor -> Invocation, the error count is 0. An outage can last for hours; the most recent one lasted more than 12 hours, during which we had no Insights Metrics. The function then starts working again on its own, without any intervention from our side. This has happened multiple times and in multiple environments.
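Because the portal reports an error count of 0, one way to confirm an outage like this is to look for gaps in the timestamps of the records that actually reached the workspace. A minimal sketch in Python, assuming the timestamps have already been exported from the workspace (for example from a query result); `find_ingestion_gaps` and the 15-minute threshold are hypothetical and not part of ELMS:

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple


def find_ingestion_gaps(
    timestamps: Iterable[datetime],
    max_gap: timedelta,
) -> List[Tuple[datetime, datetime]]:
    """Return (last_seen, next_seen) pairs that are more than max_gap apart."""
    ordered = sorted(timestamps)
    gaps = []
    # Compare each record's timestamp with the one that follows it.
    for previous, current in zip(ordered, ordered[1:]):
        if current - previous > max_gap:
            gaps.append((previous, current))
    return gaps


if __name__ == "__main__":
    start = datetime(2021, 1, 1, 0, 0)
    # Hypothetical sample: regular records, then a silence of about 12 hours.
    sample = [
        start,
        start + timedelta(minutes=5),
        start + timedelta(hours=12, minutes=5),
        start + timedelta(hours=12, minutes=10),
    ]
    for last_seen, next_seen in find_ingestion_gaps(sample, timedelta(minutes=15)):
        print(f"No records between {last_seen} and {next_seen}")
```

The threshold should be larger than the interval at which the Metrics Collector module is expected to send data; otherwise normal spacing between records would be reported as a gap.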
Any log messages given by the failure
The Application Insights traces show the following two errors:
exception occurred : System.AggregateException: One or more errors occurred. (The SSL connection could not be established, see inner exception.)
---> System.Net.Http.HttpRequestException: The SSL connection could not be established, see inner exception.
---> System.ComponentModel.Win32Exception (0x8009030D): The credentials supplied to the package were not recognized
at System.Net.SSPIWrapper.AcquireCredentialsHandle(SSPIInterface secModule, String package, CredentialUse intent, SCHANNEL_CRED scc)
at System.Net.Security.SslStreamPal.AcquireCredentialsHandle(CredentialUse credUsage, SCHANNEL_CRED secureCredential)
at System.Net.Security.SslStreamPal.AcquireCredentialsHandle(X509Certificate certificate, SslProtocols protocols, EncryptionPolicy policy, Boolean isServer)
at System.Net.Security.SecureChannel.AcquireClientCredentials(Byte[]& thumbPrint)
at System.Net.Security.SecureChannel.GenerateToken(Byte[] input, Int32 offset, Int32 count, Byte[]& output)
at System.Net.Security.SecureChannel.NextMessage(Byte[] incoming, Int32 offset, Int32 count)
at System.Net.Security.SslStream.StartSendBlob(Byte[] incoming, Int32 count, AsyncProtocolRequest asyncRequest)
at System.Net.Security.SslStream.ProcessReceivedBlob(Byte[] buffer, Int32 count, AsyncProtocolRequest asyncRequest)
at System.Net.Security.SslStream.StartReadFrame(Byte[] buffer, Int32 readBytes, AsyncProtocolRequest asyncRequest)
at System.Net.Security.SslStream.PartialFrameCallback(AsyncProtocolRequest asyncRequest)
--- End of stack trace from previous location where exception was thrown ---
at System.Net.Security.SslStream.ThrowIfExceptional()
at System.Net.Security.SslStream.InternalEndProcessAuthentication(LazyAsyncResult lazyResult)
at System.Net.Security.SslStream.EndProcessAuthentication(IAsyncResult result)
at System.Net.Security.SslStream.EndAuthenticateAsClient(IAsyncResult asyncResult)
at System.Net.Security.SslStream.<>c.<AuthenticateAsClientAsync>b__65_1(IAsyncResult iar)
at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization)
--- End of stack trace from previous location where exception was thrown ---
at System.Net.Http.ConnectHelper.EstablishSslConnectionAsyncCore(Stream stream, SslClientAuthenticationOptions sslOptions, CancellationToken cancellationToken)
--- End of inner exception stack trace ---
at System.Net.Http.ConnectHelper.EstablishSslConnectionAsyncCore(Stream stream, SslClientAuthenticationOptions sslOptions, CancellationToken cancellationToken)
at System.Net.Http.HttpConnectionPool.ConnectAsync(HttpRequestMessage request, Boolean allowHttp2, CancellationToken cancellationToken)
at System.Net.Http.HttpConnectionPool.CreateHttp11ConnectionAsync(HttpRequestMessage request, CancellationToken cancellationToken)
at System.Net.Http.HttpConnectionPool.GetHttpConnectionAsync(HttpRequestMessage request, CancellationToken cancellationToken)
at System.Net.Http.HttpConnectionPool.SendWithRetryAsync(HttpRequestMessage request, Boolean doRequestAuth, CancellationToken cancellationToken)
at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
at System.Net.Http.DiagnosticsHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
at System.Net.Http.HttpClient.FinishSendAsyncBuffered(Task`1 sendTask, HttpRequestMessage request, CancellationTokenSource cts, Boolean disposeCts)
--- End of inner exception stack trace ---
at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
at System.Threading.Tasks.Task`1.GetResultCore(Boolean waitCompletionNotification)
at System.Threading.Tasks.Task`1.get_Result()
at FunctionApp.CertificateGenerator.CertGenerator.RegisterWithOms(X509Certificate2 cert, String AgentGuid, String logAnalyticsWorkspaceDomainPrefixOms) in /home/runner/work/iotedge-logging-and-monitoring-solution/iotedge-logging-and-monitoring-solution/FunctionApp/FunctionApp/CertificateGenerator/CertGenertor.cs:line 381
at FunctionApp.CertificateGenerator.CertGenerator.RegisterWithOmsWithBasicRetryAsync(X509Certificate2 cert, String AgentGuid, String logAnalyticsWorkspaceDomainPrefixOms) in /home/runner/work/iotedge-logging-and-monitoring-solution/iotedge-logging-and-monitoring-solution/FunctionApp/FunctionApp/CertificateGenerator/CertGenertor.cs:line 404
Registering agent with OMS failed (are the Log Analytics Workspace ID and Key correct?) : System.AggregateException: One or more errors occurred. (The SSL connection could not be established, see inner exception.)
---> System.Net.Http.HttpRequestException: The SSL connection could not be established, see inner exception.
---> System.ComponentModel.Win32Exception (0x8009030D): The credentials supplied to the package were not recognized
at System.Net.SSPIWrapper.AcquireCredentialsHandle(SSPIInterface secModule, String package, CredentialUse intent, SCHANNEL_CRED scc)
at System.Net.Security.SslStreamPal.AcquireCredentialsHandle(CredentialUse credUsage, SCHANNEL_CRED secureCredential)
at System.Net.Security.SslStreamPal.AcquireCredentialsHandle(X509Certificate certificate, SslProtocols protocols, EncryptionPolicy policy, Boolean isServer)
at System.Net.Security.SecureChannel.AcquireClientCredentials(Byte[]& thumbPrint)
at System.Net.Security.SecureChannel.GenerateToken(Byte[] input, Int32 offset, Int32 count, Byte[]& output)
at System.Net.Security.SecureChannel.NextMessage(Byte[] incoming, Int32 offset, Int32 count)
at System.Net.Security.SslStream.StartSendBlob(Byte[] incoming, Int32 count, AsyncProtocolRequest asyncRequest)
at System.Net.Security.SslStream.ProcessReceivedBlob(Byte[] buffer, Int32 count, AsyncProtocolRequest asyncRequest)
at System.Net.Security.SslStream.StartReadFrame(Byte[] buffer, Int32 readBytes, AsyncProtocolRequest asyncRequest)
at System.Net.Security.SslStream.PartialFrameCallback(AsyncProtocolRequest asyncRequest)
--- End of stack trace from previous location where exception was thrown ---
at System.Net.Security.SslStream.ThrowIfExceptional()
at System.Net.Security.SslStream.InternalEndProcessAuthentication(LazyAsyncResult lazyResult)
at System.Net.Security.SslStream.EndProcessAuthentication(IAsyncResult result)
at System.Net.Security.SslStream.EndAuthenticateAsClient(IAsyncResult asyncResult)
at System.Net.Security.SslStream.<>c.<AuthenticateAsClientAsync>b__65_1(IAsyncResult iar)
at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization)
--- End of stack trace from previous location where exception was thrown ---
at System.Net.Http.ConnectHelper.EstablishSslConnectionAsyncCore(Stream stream, SslClientAuthenticationOptions sslOptions, CancellationToken cancellationToken)
--- End of inner exception stack trace ---
at System.Net.Http.ConnectHelper.EstablishSslConnectionAsyncCore(Stream stream, SslClientAuthenticationOptions sslOptions, CancellationToken cancellationToken)
at System.Net.Http.HttpConnectionPool.ConnectAsync(HttpRequestMessage request, Boolean allowHttp2, CancellationToken cancellationToken)
at System.Net.Http.HttpConnectionPool.CreateHttp11ConnectionAsync(HttpRequestMessage request, CancellationToken cancellationToken)
at System.Net.Http.HttpConnectionPool.GetHttpConnectionAsync(HttpRequestMessage request, CancellationToken cancellationToken)
at System.Net.Http.HttpConnectionPool.SendWithRetryAsync(HttpRequestMessage request, Boolean doRequestAuth, CancellationToken cancellationToken)
at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
at System.Net.Http.DiagnosticsHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
at System.Net.Http.HttpClient.FinishSendAsyncBuffered(Task`1 sendTask, HttpRequestMessage request, CancellationTokenSource cts, Boolean disposeCts)
--- End of inner exception stack trace ---
at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
at System.Threading.Tasks.Task`1.GetResultCore(Boolean waitCompletionNotification)
at System.Threading.Tasks.Task`1.get_Result()
at FunctionApp.CertificateGenerator.CertGenerator.RegisterWithOms(X509Certificate2 cert, String AgentGuid, String logAnalyticsWorkspaceDomainPrefixOms) in /home/runner/work/iotedge-logging-and-monitoring-solution/iotedge-logging-and-monitoring-solution/FunctionApp/FunctionApp/CertificateGenerator/CertGenertor.cs:line 381
at FunctionApp.CertificateGenerator.CertGenerator.RegisterWithOmsWithBasicRetryAsync(X509Certificate2 cert, String AgentGuid, String logAnalyticsWorkspaceDomainPrefixOms) in /home/runner/work/iotedge-logging-and-monitoring-solution/iotedge-logging-and-monitoring-solution/FunctionApp/FunctionApp/CertificateGenerator/CertGenertor.cs:line 404
at FunctionApp.CertificateGenerator.CertGenerator.RegisterAgentWithOMS(String logAnalyticsWorkspaceDomainPrefixOms) in /home/runner/work/iotedge-logging-and-monitoring-solution/iotedge-logging-and-monitoring-solution/FunctionApp/FunctionApp/CertificateGenerator/CertGenertor.cs:line 462
The second message is misleading. The Log Analytics Workspace ID and Key must be correct, because the function eventually starts working again without any change to the configuration.
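To illustrate the point: the hint about the workspace ID and key is logged even though the innermost exception in the trace is raised while the SSL connection is being established. A rough sketch of choosing the hint from the exception chain instead, written in Python for brevity rather than in the function app's C#. `classify_failure`, `hint_for`, the category names, and the mapping of exception types to categories are all assumptions for illustration, not the project's code:

```python
import ssl
from typing import Optional

# Hypothetical hints, keyed by the category returned by classify_failure.
HINTS = {
    "tls_or_connection": "Could not establish a secure connection to the workspace endpoint.",
    "workspace_credentials": "Are the Log Analytics Workspace ID and Key correct?",
    "unknown": "Registration failed for an unrecognized reason.",
}


def classify_failure(exc: BaseException) -> str:
    """Walk the exception chain and return the first known failure category."""
    seen = set()
    current: Optional[BaseException] = exc
    while current is not None and id(current) not in seen:
        seen.add(id(current))  # guard against a cyclic chain
        if isinstance(current, (ssl.SSLError, ConnectionError, TimeoutError)):
            return "tls_or_connection"
        if isinstance(current, PermissionError):
            return "workspace_credentials"
        # Follow the explicit cause first, then the implicit context.
        current = current.__cause__ or current.__context__
    return "unknown"


def hint_for(exc: BaseException) -> str:
    """Pick a log hint that matches the actual failure category."""
    return HINTS[classify_failure(exc)]
```

With a check like this, the hint about the workspace ID and key would only be logged when the chain actually points at rejected workspace credentials.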
Expected/desired behavior
The Collect Metrics function consistently sends Insights Metrics to the Log Analytics workspace.
OS and Version?
Function App with Operating System: Windows, Runtime version: 3.2.0.0