maciejsobocinski opened 2 years ago
Another update on this. We are also having trouble with the rule above that collects a GC dump when working-set goes above 70%. GC dumps work fine for us under low memory usage, but under high usage it "seems like" collecting one pauses the dotnet process in the main container long enough that our load balancer kills the task (health checks are set to relaunch after 45 seconds of failures). Per our perf logs during that time, memory usage was 6678/8192 MB (186 MB of that is dotnet monitor) and CPU usage was 1004/2048 (only 1 of that is used by dotnet monitor). The weird thing is that, per the dotnet-monitor logs below, the gcdump times out after about 3 seconds, yet it appears to still be running inside the main container, causing timeouts and a container restart.
Any tips/ideas/guidance on how to capture a gc dump in prod (we are trying to diagnose memory spikes)?
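For context, the rule in question is configured roughly like this. This is a sketch, not our exact settings: the GreaterThan threshold (70% of 8192 MB), the sliding window, and the egress name are illustrative, and our real rule also has a second Execute action per the logs below:

```json
{
  "CollectionRules": {
    "HighMemoryRule": {
      "Trigger": {
        "Type": "EventCounter",
        "Settings": {
          "ProviderName": "System.Runtime",
          "CounterName": "working-set",
          "GreaterThan": 5734,
          "SlidingWindowDuration": "00:00:10"
        }
      },
      "Actions": [
        {
          "Type": "CollectGCDump",
          "Settings": { "Egress": "monitorFile" }
        }
      ]
    }
  }
}
```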
Here are the dotnet-monitor logs of the GC dump rule being triggered and timing out after about 3 seconds; for the next minute or so, requests to the main container time out/fail (presumably because a GC dump is still running?) until eventually the container(s) get killed by our load balancer:
{
"Timestamp": "2022-04-19T03:04:46.3862671Z",
"EventId": 36,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.CollectionRules.CollectionRuleService",
"Message": "Collection rule 'HighMemoryRule' trigger 'EventCounter' completed.",
"State": {
"Message": "Collection rule 'HighMemoryRule' trigger 'EventCounter' completed.",
"ruleName": "HighMemoryRule",
"triggerType": "EventCounter",
"{OriginalFormat}": "Collection rule '{ruleName}' trigger '{triggerType}' completed."
},
"Scopes": [
{
"Message": "TargetProcessId:10 TargetRuntimeInstanceCookie:1af38edab5774d948ed55b3310dc80d1 CollectionRuleName:HighMemoryRule",
"TargetProcessId": "10",
"TargetRuntimeInstanceCookie": "1af38edab5774d948ed55b3310dc80d1",
"CollectionRuleName": "HighMemoryRule"
},
{
"Message": "CollectionRuleTriggerType:EventCounter",
"CollectionRuleTriggerType": "EventCounter"
}
]
}
{
"Timestamp": "2022-04-19T03:04:46.5268863Z",
"EventId": 33,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.CollectionRules.CollectionRuleService",
"Message": "Collection rule 'HighMemoryRule' action 'CollectGCDump' started.",
"State": {
"Message": "Collection rule 'HighMemoryRule' action 'CollectGCDump' started.",
"ruleName": "HighMemoryRule",
"actionType": "CollectGCDump",
"{OriginalFormat}": "Collection rule '{ruleName}' action '{actionType}' started."
},
"Scopes": [
{
"Message": "TargetProcessId:10 TargetRuntimeInstanceCookie:1af38edab5774d948ed55b3310dc80d1 CollectionRuleName:HighMemoryRule",
"TargetProcessId": "10",
"TargetRuntimeInstanceCookie": "1af38edab5774d948ed55b3310dc80d1",
"CollectionRuleName": "HighMemoryRule"
},
{
"Message": "CollectionRuleTriggerType:EventCounter",
"CollectionRuleTriggerType": "EventCounter"
},
{
"Message": "CollectionRuleActionType:CollectGCDump CollectionRuleActionIndex:0",
"CollectionRuleActionType": "CollectGCDump",
"CollectionRuleActionIndex": 0
}
]
}
{
"Timestamp": "2022-04-19T03:04:46.5665901Z",
"EventId": 33,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.CollectionRules.CollectionRuleService",
"Message": "Collection rule 'HighMemoryRule' action 'Execute' started.",
"State": {
"Message": "Collection rule 'HighMemoryRule' action 'Execute' started.",
"ruleName": "HighMemoryRule",
"actionType": "Execute",
"{OriginalFormat}": "Collection rule '{ruleName}' action '{actionType}' started."
},
"Scopes": [
{
"Message": "TargetProcessId:10 TargetRuntimeInstanceCookie:1af38edab5774d948ed55b3310dc80d1 CollectionRuleName:HighMemoryRule",
"TargetProcessId": "10",
"TargetRuntimeInstanceCookie": "1af38edab5774d948ed55b3310dc80d1",
"CollectionRuleName": "HighMemoryRule"
},
{
"Message": "CollectionRuleTriggerType:EventCounter",
"CollectionRuleTriggerType": "EventCounter"
},
{
"Message": "CollectionRuleActionType:Execute CollectionRuleActionIndex:1",
"CollectionRuleActionType": "Execute",
"CollectionRuleActionIndex": 1
}
]
}
{
"Timestamp": "2022-04-19T03:04:49.5671423Z",
"EventId": 52,
"LogLevel": "Warning",
"Category": "Microsoft.Diagnostics.Tools.Monitor.ServerEndpointInfoSource",
"Message": "Unexpected timeout from process 10. Process will no longer be monitored.",
"State": {
"Message": "Unexpected timeout from process 10. Process will no longer be monitored.",
"processId": "10",
"{OriginalFormat}": "Unexpected timeout from process {processId}. Process will no longer be monitored."
},
"Scopes": []
}
{
"Timestamp": "2022-04-19T03:04:49.5714281Z",
"EventId": 44,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.CollectionRules.CollectionRuleService",
"Message": "Stopping collection rules.",
"State": {
"Message": "Stopping collection rules.",
"{OriginalFormat}": "Stopping collection rules."
},
"Scopes": [
{
"Message": "TargetProcessId:10 TargetRuntimeInstanceCookie:1af38edab5774d948ed55b3310dc80d1",
"TargetProcessId": "10",
"TargetRuntimeInstanceCookie": "1af38edab5774d948ed55b3310dc80d1"
}
]
}
{
"Timestamp": "2022-04-19T03:04:49.5779787Z",
"EventId": 38,
"LogLevel": "Error",
"Category": "Microsoft.Diagnostics.Tools.Monitor.CollectionRules.CollectionRuleService",
"Message": "Collection rule 'HighMemoryRule' action 'CollectGCDump' failed.",
"Exception": "System.Threading.Tasks.TaskCanceledException: A task was canceled. at Microsoft.Diagnostics.Tools.Monitor.TaskExtensions.WithCancellation(Task task, CancellationToken token) in /_/src/Tools/dotnet-monitor/TaskExtensions.cs:line 37 at Microsoft.Diagnostics.Tools.Monitor.TaskExtensions.WithCancellation[T](Task`1 task, CancellationToken token) in /_/src/Tools/dotnet-monitor/TaskExtensions.cs:line 48 at Microsoft.Diagnostics.Tools.Monitor.CollectionRules.Actions.CollectionRuleActionBase`1.WaitForCompletionAsync(CancellationToken token) in /_/src/Tools/dotnet-monitor/CollectionRules/Actions/CollectionRuleActionBase.cs:line 64 at Microsoft.Diagnostics.Tools.Monitor.CollectionRules.Actions.ActionListExecutor.WaitForCompletion(CollectionRuleContext context, Action startCallback, IDictionary`2 allResults, ICollectionRuleAction action, CollectionRuleActionOptions actionOption, CancellationToken cancellationToken) in /_/src/Tools/dotnet-monitor/CollectionRules/ActionListExecutor.cs:line 160 at Microsoft.Diagnostics.Tools.Monitor.CollectionRules.Actions.ActionListExecutor.WaitForCompletion(CollectionRuleContext context, Action startCallback, IDictionary`2 allResults, ActionCompletionEntry entry, CancellationToken cancellationToken) in /_/src/Tools/dotnet-monitor/CollectionRules/ActionListExecutor.cs:line 177",
"State": {
"Message": "Collection rule 'HighMemoryRule' action 'CollectGCDump' failed.",
"ruleName": "HighMemoryRule",
"actionType": "CollectGCDump",
"{OriginalFormat}": "Collection rule '{ruleName}' action '{actionType}' failed."
},
"Scopes": [
{
"Message": "TargetProcessId:10 TargetRuntimeInstanceCookie:1af38edab5774d948ed55b3310dc80d1 CollectionRuleName:HighMemoryRule",
"TargetProcessId": "10",
"TargetRuntimeInstanceCookie": "1af38edab5774d948ed55b3310dc80d1",
"CollectionRuleName": "HighMemoryRule"
},
{
"Message": "CollectionRuleTriggerType:EventCounter",
"CollectionRuleTriggerType": "EventCounter"
},
{
"Message": "CollectionRuleActionType:Execute CollectionRuleActionIndex:1",
"CollectionRuleActionType": "Execute",
"CollectionRuleActionIndex": 1
}
]
}
{
"Timestamp": "2022-04-19T03:04:49.5977629Z",
"EventId": 2,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Monitoring.WebApi.EgressOperation",
"Message": "Request canceled.",
"State": {
"Message": "Request canceled.",
"{OriginalFormat}": "Request canceled."
},
"Scopes": [
{
"Message": "TargetProcessId:10 TargetRuntimeInstanceCookie:1af38edab5774d948ed55b3310dc80d1 CollectionRuleName:HighMemoryRule",
"TargetProcessId": "10",
"TargetRuntimeInstanceCookie": "1af38edab5774d948ed55b3310dc80d1",
"CollectionRuleName": "HighMemoryRule"
},
{
"Message": "CollectionRuleTriggerType:EventCounter",
"CollectionRuleTriggerType": "EventCounter"
},
{
"Message": "CollectionRuleActionType:CollectGCDump CollectionRuleActionIndex:0",
"CollectionRuleActionType": "CollectGCDump",
"CollectionRuleActionIndex": 0
},
{
"Message": "ArtifactType:gcdump ArtifactSource_ProcessId:10 ArtifactSource_RuntimeInstanceCookie:1af38edab5774d948ed55b3310dc80d1",
"ArtifactType": "gcdump",
"ArtifactSource_ProcessId": "10",
"ArtifactSource_RuntimeInstanceCookie": "1af38edab5774d948ed55b3310dc80d1"
}
]
}
{
"Timestamp": "2022-04-19T03:04:49.6023431Z",
"EventId": 38,
"LogLevel": "Error",
"Category": "Microsoft.Diagnostics.Tools.Monitor.CollectionRules.CollectionRuleService",
"Message": "Collection rule 'HighMemoryRule' action 'Execute' failed.",
"Exception": "System.Threading.Tasks.TaskCanceledException: A task was canceled. at Microsoft.Diagnostics.Tools.Monitor.TaskExtensions.WithCancellation(Task task, CancellationToken token) in /_/src/Tools/dotnet-monitor/TaskExtensions.cs:line 37 at Microsoft.Diagnostics.Tools.Monitor.TaskExtensions.WithCancellation[T](Task`1 task, CancellationToken token) in /_/src/Tools/dotnet-monitor/TaskExtensions.cs:line 48 at Microsoft.Diagnostics.Tools.Monitor.CollectionRules.Actions.CollectionRuleActionBase`1.WaitForCompletionAsync(CancellationToken token) in /_/src/Tools/dotnet-monitor/CollectionRules/Actions/CollectionRuleActionBase.cs:line 64 at Microsoft.Diagnostics.Tools.Monitor.CollectionRules.Actions.ActionListExecutor.WaitForCompletion(CollectionRuleContext context, Action startCallback, IDictionary`2 allResults, ICollectionRuleAction action, CollectionRuleActionOptions actionOption, CancellationToken cancellationToken) in /_/src/Tools/dotnet-monitor/CollectionRules/ActionListExecutor.cs:line 160 at Microsoft.Diagnostics.Tools.Monitor.CollectionRules.Actions.ActionListExecutor.WaitForCompletion(CollectionRuleContext context, Action startCallback, IDictionary`2 allResults, ActionCompletionEntry entry, CancellationToken cancellationToken) in /_/src/Tools/dotnet-monitor/CollectionRules/ActionListExecutor.cs:line 177 at Microsoft.Diagnostics.Tools.Monitor.CollectionRules.Actions.ActionListExecutor.WaitForCompletion(CollectionRuleContext context, Action startCallback, IDictionary`2 allResults, ActionCompletionEntry entry, CancellationToken cancellationToken) in /_/src/Tools/dotnet-monitor/CollectionRules/ActionListExecutor.cs:line 185 at Microsoft.Diagnostics.Tools.Monitor.CollectionRules.Actions.ActionListExecutor.ExecuteActions(CollectionRuleContext context, Action startCallback, CancellationToken cancellationToken) in /_/src/Tools/dotnet-monitor/CollectionRules/ActionListExecutor.cs:line 77",
"State": {
"Message": "Collection rule 'HighMemoryRule' action 'Execute' failed.",
"ruleName": "HighMemoryRule",
"actionType": "Execute",
"{OriginalFormat}": "Collection rule '{ruleName}' action '{actionType}' failed."
},
"Scopes": [
{
"Message": "TargetProcessId:10 TargetRuntimeInstanceCookie:1af38edab5774d948ed55b3310dc80d1 CollectionRuleName:HighMemoryRule",
"TargetProcessId": "10",
"TargetRuntimeInstanceCookie": "1af38edab5774d948ed55b3310dc80d1",
"CollectionRuleName": "HighMemoryRule"
},
{
"Message": "CollectionRuleTriggerType:EventCounter",
"CollectionRuleTriggerType": "EventCounter"
},
{
"Message": "CollectionRuleActionType:Execute CollectionRuleActionIndex:1",
"CollectionRuleActionType": "Execute",
"CollectionRuleActionIndex": 1
}
]
}
{
"Timestamp": "2022-04-19T03:04:49.6056195Z",
"EventId": 45,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.CollectionRules.CollectionRuleService",
"Message": "All collection rules have stopped.",
"State": {
"Message": "All collection rules have stopped.",
"{OriginalFormat}": "All collection rules have stopped."
},
"Scopes": [
{
"Message": "TargetProcessId:10 TargetRuntimeInstanceCookie:1af38edab5774d948ed55b3310dc80d1",
"TargetProcessId": "10",
"TargetRuntimeInstanceCookie": "1af38edab5774d948ed55b3310dc80d1"
}
]
}
{
"Timestamp": "2022-04-19T03:05:34.4132777Z",
"EventId": 40,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.CollectionRules.CollectionRuleService",
"Message": "Starting collection rules.",
"State": {
"Message": "Starting collection rules.",
"{OriginalFormat}": "Starting collection rules."
},
"Scopes": [
{
"Message": "TargetProcessId:10 TargetRuntimeInstanceCookie:1af38edab5774d948ed55b3310dc80d1",
"TargetProcessId": "10",
"TargetRuntimeInstanceCookie": "1af38edab5774d948ed55b3310dc80d1"
}
]
}
{
"Timestamp": "2022-04-19T03:05:34.4137868Z",
"EventId": 29,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.CollectionRules.CollectionRuleService",
"Message": "Collection rule 'HighMemoryRule' started.",
"State": {
"Message": "Collection rule 'HighMemoryRule' started.",
"ruleName": "HighMemoryRule",
"{OriginalFormat}": "Collection rule '{ruleName}' started."
},
"Scopes": [
{
"Message": "TargetProcessId:10 TargetRuntimeInstanceCookie:1af38edab5774d948ed55b3310dc80d1 CollectionRuleName:HighMemoryRule",
"TargetProcessId": "10",
"TargetRuntimeInstanceCookie": "1af38edab5774d948ed55b3310dc80d1",
"CollectionRuleName": "HighMemoryRule"
}
]
}
{
"Timestamp": "2022-04-19T03:05:34.4142123Z",
"EventId": 35,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.CollectionRules.CollectionRuleService",
"Message": "Collection rule 'HighMemoryRule' trigger 'EventCounter' started.",
"State": {
"Message": "Collection rule 'HighMemoryRule' trigger 'EventCounter' started.",
"ruleName": "HighMemoryRule",
"triggerType": "EventCounter",
"{OriginalFormat}": "Collection rule '{ruleName}' trigger '{triggerType}' started."
},
"Scopes": [
{
"Message": "TargetProcessId:10 TargetRuntimeInstanceCookie:1af38edab5774d948ed55b3310dc80d1 CollectionRuleName:HighMemoryRule",
"TargetProcessId": "10",
"TargetRuntimeInstanceCookie": "1af38edab5774d948ed55b3310dc80d1",
"CollectionRuleName": "HighMemoryRule"
},
{
"Message": "CollectionRuleTriggerType:EventCounter",
"CollectionRuleTriggerType": "EventCounter"
}
]
}
{
"Timestamp": "2022-04-19T03:05:34.4164694Z",
"EventId": 32,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.CollectionRules.CollectionRuleService",
"Message": "All collection rules started.",
"State": {
"Message": "All collection rules started.",
"{OriginalFormat}": "All collection rules started."
},
"Scopes": [
{
"Message": "TargetProcessId:10 TargetRuntimeInstanceCookie:1af38edab5774d948ed55b3310dc80d1",
"TargetProcessId": "10",
"TargetRuntimeInstanceCookie": "1af38edab5774d948ed55b3310dc80d1"
}
]
}
{
"Timestamp": "2022-04-19T03:05:58.8419003Z",
"EventId": 52,
"LogLevel": "Warning",
"Category": "Microsoft.Diagnostics.Tools.Monitor.ServerEndpointInfoSource",
"Message": "Unexpected timeout from process 10. Process will no longer be monitored.",
"State": {
"Message": "Unexpected timeout from process 10. Process will no longer be monitored.",
"processId": "10",
"{OriginalFormat}": "Unexpected timeout from process {processId}. Process will no longer be monitored."
},
"Scopes": []
}
{
"Timestamp": "2022-04-19T03:05:58.8422906Z",
"EventId": 44,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.CollectionRules.CollectionRuleService",
"Message": "Stopping collection rules.",
"State": {
"Message": "Stopping collection rules.",
"{OriginalFormat}": "Stopping collection rules."
},
"Scopes": [
{
"Message": "TargetProcessId:10 TargetRuntimeInstanceCookie:1af38edab5774d948ed55b3310dc80d1",
"TargetProcessId": "10",
"TargetRuntimeInstanceCookie": "1af38edab5774d948ed55b3310dc80d1"
}
]
}
{
"Timestamp": "2022-04-19T03:05:58.8516468Z",
"EventId": 45,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.CollectionRules.CollectionRuleService",
"Message": "All collection rules have stopped.",
"State": {
"Message": "All collection rules have stopped.",
"{OriginalFormat}": "All collection rules have stopped."
},
"Scopes": [
{
"Message": "TargetProcessId:10 TargetRuntimeInstanceCookie:1af38edab5774d948ed55b3310dc80d1",
"TargetProcessId": "10",
"TargetRuntimeInstanceCookie": "1af38edab5774d948ed55b3310dc80d1"
}
]
}
{
"Timestamp": "2022-04-19T03:06:09.9610855Z",
"EventId": 0,
"LogLevel": "Information",
"Category": "Microsoft.Hosting.Lifetime",
"Message": "Application is shutting down...",
"State": {
"Message": "Application is shutting down...",
"{OriginalFormat}": "Application is shutting down..."
},
"Scopes": []
}
Glancing at the dotnet monitor code, it feels like "CheckEndpointAsync" below ends up cancelling the GC dump and rule collection if the process is unresponsive for more than 2.5 seconds (because a GC dump is taking place); the rules are then restarted, and the same GC dump rule is invoked again (further pausing the runtime)? The first check against operationTrackerService only returns true if a trace or a dump is running... but not a gcdump:
private async Task<bool> CheckEndpointAsync(EndpointInfo info, CancellationToken token)
{
    // If a dump operation is in progress, the runtime is likely to not respond to
    // diagnostic requests. Do not check for responsiveness while the dump operation
    // is in progress.
    if (_operationTrackerService?.IsExecutingOperation(info) == true)
    {
        return true;
    }

    using var timeoutSource = new CancellationTokenSource();
    using var linkedSource = CancellationTokenSource.CreateLinkedTokenSource(token, timeoutSource.Token);
    try
    {
        timeoutSource.CancelAfter(PruneWaitForConnectionTimeout);
        await info.Endpoint.WaitForConnectionAsync(linkedSource.Token).ConfigureAwait(false);
    }
    catch (OperationCanceledException) when (timeoutSource.IsCancellationRequested)
    {
        _logger?.EndpointTimeout(info.ProcessId.ToString(System.Globalization.CultureInfo.InvariantCulture));
        return false;
    }
    catch (Exception ex) when (ex is not OperationCanceledException)
    {
        return false;
    }

    return true;
}
@maciejsobocinski, thanks for reporting this and digging into it. I have a few questions for you:
Our setup is a single dotnet monitor sidecar inside each launched task, with many tasks being launched. It stops working for us on some tasks after a few hours with the following error:
Is this occurring randomly, or only after an impactful operation happens such as capturing a gcdump?
The first check against operationTrackerService only returns true if a trace or a dump are running...but not a gcdump
My understanding is that the diagnostic communication channel should still be viable while a gcdump is being collected. @josalem, do you know if there is some kind of lock that is held while GC data is collected that would prevent the diagnostic communication channel from responding? If this is expected, we can add gcdump operations to the tracking service in dotnet monitor
to prevent pruning of the diagnostic connection. However, even with the pruning (which disposes the stream), I would still expect the runtime instance to try to reestablish the diagnostic connection again, which it does not seem to do (or maybe we don't see that because it is still collecting GC data).
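A minimal self-contained sketch of that idea, assuming a ref-counted tracker keyed by process ID; OperationTracker and its Register method here are stand-in names modeled on how dump operations are tracked, not dotnet monitor's actual internal API:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Sketch: track gcdump operations the same way dumps are tracked, so the
// pruning check can skip responsiveness probes while one is in flight.
class OperationTracker
{
    private readonly ConcurrentDictionary<int, int> _active = new();

    public IDisposable Register(int processId)
    {
        _active.AddOrUpdate(processId, 1, (_, n) => n + 1);
        return new Registration(this, processId);
    }

    public bool IsExecutingOperation(int processId) =>
        _active.TryGetValue(processId, out int n) && n > 0;

    private sealed class Registration : IDisposable
    {
        private readonly OperationTracker _tracker;
        private readonly int _processId;

        public Registration(OperationTracker tracker, int processId) =>
            (_tracker, _processId) = (tracker, processId);

        public void Dispose() =>
            _tracker._active.AddOrUpdate(_processId, 0, (_, n) => n - 1);
    }
}

class Program
{
    static async Task Main()
    {
        var tracker = new OperationTracker();
        const int pid = 10;

        using (tracker.Register(pid))
        {
            // While a gcdump is "running", the endpoint check would bail out
            // early instead of probing a runtime that cannot respond.
            Console.WriteLine(tracker.IsExecutingOperation(pid)); // True
            await Task.Delay(10); // simulated GC data collection
        }

        Console.WriteLine(tracker.IsExecutingOperation(pid)); // False
    }
}
```

Disposing the registration lets the endpoint check resume normal responsiveness probing once GC data collection finishes.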
@jander-msft both, I believe. In my initial ticket we had 4 tasks (each with its own dotnet monitor) started up, hitting the /livemetrics endpoint every 5 minutes. About 12 hours in, dotnet-monitor started failing inside one of them. No GC dump was run at the time per the logs (we have it set to auto GC dump at 70%); memory usage was around 17% with the CPU basically idle. This is the log entry:
{
"Timestamp": "2022-04-15T06:50:02.0842638Z",
"EventId": 2,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.LiveJwtBearerHandler",
"Message": "Successfully validated the token.",
"State": {
"Message": "Successfully validated the token.",
"{OriginalFormat}": "Successfully validated the token."
},
"Scopes": [
{
"Message": "SpanId:be2da2ba85a57e19, TraceId:8bd2f78ca29aec12951eabd01e36a7ef, ParentId:f2394daea1f71aab",
"SpanId": "be2da2ba85a57e19",
"TraceId": "8bd2f78ca29aec12951eabd01e36a7ef",
"ParentId": "f2394daea1f71aab"
},
{
"Message": "ConnectionId:0HMGU731FOFDE",
"ConnectionId": "0HMGU731FOFDE"
},
{
"Message": "RequestPath:/livemetrics RequestId:0HMGU731FOFDE:00000002",
"RequestId": "0HMGU731FOFDE:00000002",
"RequestPath": "/livemetrics"
}
]
}
{
"Timestamp": "2022-04-15T06:50:03.0758604Z",
"EventId": 52,
"LogLevel": "Warning",
"Category": "Microsoft.Diagnostics.Tools.Monitor.ServerEndpointInfoSource",
"Message": "Unexpected timeout from process 6. Process will no longer be monitored.",
"State": {
"Message": "Unexpected timeout from process 6. Process will no longer be monitored.",
"processId": "6",
"{OriginalFormat}": "Unexpected timeout from process {processId}. Process will no longer be monitored."
},
"Scopes": []
}
Subsequent requests to /livemetrics:
{
"Timestamp": "2022-04-15T06:55:01.6363199Z",
"EventId": 1,
"LogLevel": "Error",
"Category": "Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagController",
"Message": "Request failed.",
"Exception": "System.ArgumentException: Unable to discover a target process. at Microsoft.Diagnostics.Monitoring.WebApi.DiagnosticServices.GetProcessAsync(DiagProcessFilter processFilterConfig, CancellationToken token) in /_/src/Microsoft.Diagnostics.Monitoring.WebApi/DiagnosticServices.cs:line 100 at Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagController.<>c__DisplayClass33_0`1.<<InvokeForProcess>b__0>d.MoveNext() in /_/src/Microsoft.Diagnostics.Monitoring.WebApi/Controllers/DiagController.cs:line 713 --- End of stack trace from previous location --- at Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagControllerExtensions.InvokeService[T](ControllerBase controller, Func`1 serviceCall, ILogger logger) in /_/src/Microsoft.Diagnostics.Monitoring.WebApi/Controllers/DiagControllerExtensions.cs:line 91",
"State": {
"Message": "Request failed.",
"{OriginalFormat}": "Request failed."
},
"Scopes": [
{
"Message": "SpanId:5f73f4ec6a4c2a06, TraceId:6e3bec22534dca3eed9ae13c8150dc0c, ParentId:0d6726492bd0e999",
"SpanId": "5f73f4ec6a4c2a06",
"TraceId": "6e3bec22534dca3eed9ae13c8150dc0c",
"ParentId": "0d6726492bd0e999"
},
{
"Message": "ConnectionId:0HMGU731FOFDF",
"ConnectionId": "0HMGU731FOFDF"
},
{
"Message": "RequestPath:/livemetrics RequestId:0HMGU731FOFDF:00000002",
"RequestId": "0HMGU731FOFDF:00000002",
"RequestPath": "/livemetrics"
},
{
"Message": "Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagController.CaptureMetrics (Microsoft.Diagnostics.Monitoring.WebApi)",
"ActionId": "cc79e4d4-794e-481f-8083-fb3f3c7b5ca5",
"ActionName": "Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagController.CaptureMetrics (Microsoft.Diagnostics.Monitoring.WebApi)"
},
{
"Message": "ArtifactType:livemetrics",
"ArtifactType": "livemetrics"
}
]
}
At that point I disabled the /livemetrics polling but left the GC dump rule on 70% working-set in place. 4 days later the rule hit and failed with the 'CollectGCDump' failed/unexpected timeout. It happened 3 times across different containers within a few hours (the ones that failed were killed by our load balancer and restarted).
I was able to somewhat repro this in our dev environment. I manually triggered a GC dump when memory usage on the task was low (660/4096 MB). The GC dump was successful (unlike in our prod environment, where it failed completely with memory at 6678/8192 MB), but I also got the "Unexpected timeout" and further requests stopped working:
{
"Timestamp": "2022-04-19T17:29:02.1838974Z",
"EventId": 2,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.LiveJwtBearerHandler",
"Message": "Successfully validated the token.",
"State": {
"Message": "Successfully validated the token.",
"{OriginalFormat}": "Successfully validated the token."
},
"Scopes": [
{
"Message": "SpanId:5d0a7651ef64ae52, TraceId:9087c513b99f163bfc5d9ad466a24d1c, ParentId:236c1fd495300808",
"SpanId": "5d0a7651ef64ae52",
"TraceId": "9087c513b99f163bfc5d9ad466a24d1c",
"ParentId": "236c1fd495300808"
},
{
"Message": "ConnectionId:0HMH22ID4PGJD",
"ConnectionId": "0HMH22ID4PGJD"
},
{
"Message": "RequestPath:/gcdump RequestId:0HMH22ID4PGJD:00000002",
"RequestId": "0HMH22ID4PGJD:00000002",
"RequestPath": "/gcdump"
}
]
}
{
"Timestamp": "2022-04-19T17:29:02.4566427Z",
"EventId": 52,
"LogLevel": "Warning",
"Category": "Microsoft.Diagnostics.Tools.Monitor.ServerEndpointInfoSource",
"Message": "Unexpected timeout from process 7. Process will no longer be monitored.",
"State": {
"Message": "Unexpected timeout from process 7. Process will no longer be monitored.",
"processId": "7",
"{OriginalFormat}": "Unexpected timeout from process {processId}. Process will no longer be monitored."
},
"Scopes": []
}
{
"Timestamp": "2022-04-19T17:29:02.4612395Z",
"EventId": 44,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.CollectionRules.CollectionRuleService",
"Message": "Stopping collection rules.",
"State": {
"Message": "Stopping collection rules.",
"{OriginalFormat}": "Stopping collection rules."
},
"Scopes": [
{
"Message": "TargetProcessId:7 TargetRuntimeInstanceCookie:c1de3bbadd104c679b6258ba69f40d69",
"TargetProcessId": "7",
"TargetRuntimeInstanceCookie": "c1de3bbadd104c679b6258ba69f40d69"
}
]
}
{
"Timestamp": "2022-04-19T17:29:02.7642136Z",
"EventId": 45,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.CollectionRules.CollectionRuleService",
"Message": "All collection rules have stopped.",
"State": {
"Message": "All collection rules have stopped.",
"{OriginalFormat}": "All collection rules have stopped."
},
"Scopes": [
{
"Message": "TargetProcessId:7 TargetRuntimeInstanceCookie:c1de3bbadd104c679b6258ba69f40d69",
"TargetProcessId": "7",
"TargetRuntimeInstanceCookie": "c1de3bbadd104c679b6258ba69f40d69"
}
]
}
{
"Timestamp": "2022-04-19T17:29:04.1995862Z",
"EventId": 5,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Monitoring.WebApi.OutputStreamResult",
"Message": "Written to HTTP stream.",
"State": {
"Message": "Written to HTTP stream.",
"{OriginalFormat}": "Written to HTTP stream."
},
"Scopes": [
{
"Message": "SpanId:5d0a7651ef64ae52, TraceId:9087c513b99f163bfc5d9ad466a24d1c, ParentId:236c1fd495300808",
"SpanId": "5d0a7651ef64ae52",
"TraceId": "9087c513b99f163bfc5d9ad466a24d1c",
"ParentId": "236c1fd495300808"
},
{
"Message": "ConnectionId:0HMH22ID4PGJD",
"ConnectionId": "0HMH22ID4PGJD"
},
{
"Message": "RequestPath:/gcdump RequestId:0HMH22ID4PGJD:00000002",
"RequestId": "0HMH22ID4PGJD:00000002",
"RequestPath": "/gcdump"
},
{
"Message": "Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagController.CaptureGcDump (Microsoft.Diagnostics.Monitoring.WebApi)",
"ActionId": "b011b956-5606-4899-a3e0-3b07d50140b9",
"ActionName": "Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagController.CaptureGcDump (Microsoft.Diagnostics.Monitoring.WebApi)"
},
{
"Message": "ArtifactType:gcdump ArtifactSource_ProcessId:7 ArtifactSource_RuntimeInstanceCookie:c1de3bbadd104c679b6258ba69f40d69",
"ArtifactType": "gcdump",
"ArtifactSource_ProcessId": "7",
"ArtifactSource_RuntimeInstanceCookie": "c1de3bbadd104c679b6258ba69f40d69"
}
]
}
{
"Timestamp": "2022-04-19T17:30:04.0298055Z",
"EventId": 2,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.LiveJwtBearerHandler",
"Message": "Successfully validated the token.",
"State": {
"Message": "Successfully validated the token.",
"{OriginalFormat}": "Successfully validated the token."
},
"Scopes": [
{
"Message": "SpanId:8f51c0ee93ee397a, TraceId:0e45b3f64604b64e4836c8e12e0a6c29, ParentId:54634c8d96e978c7",
"SpanId": "8f51c0ee93ee397a",
"TraceId": "0e45b3f64604b64e4836c8e12e0a6c29",
"ParentId": "54634c8d96e978c7"
},
{
"Message": "ConnectionId:0HMH22ID4PGJD",
"ConnectionId": "0HMH22ID4PGJD"
},
{
"Message": "RequestPath:/livemetrics RequestId:0HMH22ID4PGJD:00000003",
"RequestId": "0HMH22ID4PGJD:00000003",
"RequestPath": "/livemetrics"
}
]
}
{
"Timestamp": "2022-04-19T17:30:04.0309405Z",
"EventId": 1,
"LogLevel": "Error",
"Category": "Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagController",
"Message": "Request failed.",
"Exception": "System.ArgumentException: Unable to discover a target process. at Microsoft.Diagnostics.Monitoring.WebApi.DiagnosticServices.GetProcessAsync(DiagProcessFilter processFilterConfig, CancellationToken token) in /_/src/Microsoft.Diagnostics.Monitoring.WebApi/DiagnosticServices.cs:line 100 at Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagController.<>c__DisplayClass33_0`1.<<InvokeForProcess>b__0>d.MoveNext() in /_/src/Microsoft.Diagnostics.Monitoring.WebApi/Controllers/DiagController.cs:line 713 --- End of stack trace from previous location --- at Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagControllerExtensions.InvokeService[T](ControllerBase controller, Func`1 serviceCall, ILogger logger) in /_/src/Microsoft.Diagnostics.Monitoring.WebApi/Controllers/DiagControllerExtensions.cs:line 91",
"State": {
"Message": "Request failed.",
"{OriginalFormat}": "Request failed."
},
"Scopes": [
{
"Message": "SpanId:8f51c0ee93ee397a, TraceId:0e45b3f64604b64e4836c8e12e0a6c29, ParentId:54634c8d96e978c7",
"SpanId": "8f51c0ee93ee397a",
"TraceId": "0e45b3f64604b64e4836c8e12e0a6c29",
"ParentId": "54634c8d96e978c7"
},
{
"Message": "ConnectionId:0HMH22ID4PGJD",
"ConnectionId": "0HMH22ID4PGJD"
},
{
"Message": "RequestPath:/livemetrics RequestId:0HMH22ID4PGJD:00000003",
"RequestId": "0HMH22ID4PGJD:00000003",
"RequestPath": "/livemetrics"
},
{
"Message": "Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagController.CaptureMetrics (Microsoft.Diagnostics.Monitoring.WebApi)",
"ActionId": "79857f94-77a3-4950-9596-99bb1e874bff",
"ActionName": "Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagController.CaptureMetrics (Microsoft.Diagnostics.Monitoring.WebApi)"
},
{
"Message": "ArtifactType:livemetrics",
"ArtifactType": "livemetrics"
}
]
}
We want to eliminate (or possibly blame) the possibility that the socket file has been deleted by something, which would cause the subsequent reconnection of the runtime to the dotnet monitor tool to not occur.
After the timeout occurs, does the /tmp/port socket file still exist? Also, could you move the DotnetMonitor_* settings from the /tmp directory to another shared mounted directory? Our examples typically use /diag as the directory.
Hi,
Thank you for looking into this! I can confirm that /tmp/port still exists after the timeout. I did get slightly different behavior today, though. GCDump worked just fine several times on a container that was launched only 2 hours ago with next to no traffic going to it (in our dev environment). Eventually I did get the timeout in the logs, but the gcdump succeeded and further requests kept working (whereas yesterday it was 'unable to discover target process'). I did remove the collection rule, which was the one difference; I'm going to add it back and see if it changes anything (I'm testing by hitting the http endpoint, though, so not sure that it matters). I will update the settings to /diag as well and report back.
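For anyone following along, my understanding is that the /diag change amounts to something like the environment variables below. Treat this as a sketch to verify against your dotnet monitor version: the paths are illustrative, and /diag must be a volume mounted into both the sidecar and the application container:

```shell
# dotnet monitor sidecar: listen on a socket under the shared /diag mount
# instead of the default /tmp location
export DotnetMonitor_DiagnosticPort__ConnectionMode="Listen"
export DotnetMonitor_DiagnosticPort__EndpointName="/diag/port"

# application container: point the runtime at the same socket
export DOTNET_DiagnosticPorts="/diag/port"
```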
{
"Timestamp": "2022-04-20T14:53:07.4891490Z",
"EventId": 2,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.LiveJwtBearerHandler",
"Message": "Successfully validated the token.",
"State": {
"Message": "Successfully validated the token.",
"{OriginalFormat}": "Successfully validated the token."
},
"Scopes": [
{
"Message": "SpanId:653424d1c4071427, TraceId:337acd08f914a23c757f608b79ab2b21, ParentId:0000000000000000",
"SpanId": "653424d1c4071427",
"TraceId": "337acd08f914a23c757f608b79ab2b21",
"ParentId": "0000000000000000"
},
{
"Message": "ConnectionId:0HMH2P607OT3P",
"ConnectionId": "0HMH2P607OT3P"
},
{
"Message": "RequestPath:/gcdump RequestId:0HMH2P607OT3P:00000002",
"RequestId": "0HMH2P607OT3P:00000002",
"RequestPath": "/gcdump"
}
]
}
{
"Timestamp": "2022-04-20T14:53:08.0951240Z",
"EventId": 52,
"LogLevel": "Warning",
"Category": "Microsoft.Diagnostics.Tools.Monitor.ServerEndpointInfoSource",
"Message": "Unexpected timeout from process 7. Process will no longer be monitored.",
"State": {
"Message": "Unexpected timeout from process 7. Process will no longer be monitored.",
"processId": "7",
"{OriginalFormat}": "Unexpected timeout from process {processId}. Process will no longer be monitored."
},
"Scopes": []
}
{
"Timestamp": "2022-04-20T14:53:08.0998805Z",
"EventId": 44,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.CollectionRules.CollectionRuleService",
"Message": "Stopping collection rules.",
"State": {
"Message": "Stopping collection rules.",
"{OriginalFormat}": "Stopping collection rules."
},
"Scopes": [
{
"Message": "TargetProcessId:7 TargetRuntimeInstanceCookie:8e064fa4979846d497866ec3cad84db3",
"TargetProcessId": "7",
"TargetRuntimeInstanceCookie": "8e064fa4979846d497866ec3cad84db3"
}
]
}
{
"Timestamp": "2022-04-20T14:53:08.1010130Z",
"EventId": 45,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.CollectionRules.CollectionRuleService",
"Message": "All collection rules have stopped.",
"State": {
"Message": "All collection rules have stopped.",
"{OriginalFormat}": "All collection rules have stopped."
},
"Scopes": [
{
"Message": "TargetProcessId:7 TargetRuntimeInstanceCookie:8e064fa4979846d497866ec3cad84db3",
"TargetProcessId": "7",
"TargetRuntimeInstanceCookie": "8e064fa4979846d497866ec3cad84db3"
}
]
}
{
"Timestamp": "2022-04-20T14:53:13.4598955Z",
"EventId": 4,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Monitoring.WebApi.EgressOperation",
"Message": "Egressed artifact to /tmp/gcdump/20220420_145307_493_7.gcdump",
"State": {
"Message": "Egressed artifact to /tmp/gcdump/20220420_145307_493_7.gcdump",
"location": "/tmp/gcdump/20220420_145307_493_7.gcdump",
"{OriginalFormat}": "Egressed artifact to {location}"
},
"Scopes": [
{
"Message": "ArtifactType:gcdump ArtifactSource_ProcessId:7 ArtifactSource_RuntimeInstanceCookie:8e064fa4979846d497866ec3cad84db3",
"ArtifactType": "gcdump",
"ArtifactSource_ProcessId": "7",
"ArtifactSource_RuntimeInstanceCookie": "8e064fa4979846d497866ec3cad84db3"
}
]
}
{
"Timestamp": "2022-04-20T14:54:18.5851347Z",
"EventId": 2,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.LiveJwtBearerHandler",
"Message": "Successfully validated the token.",
"State": {
"Message": "Successfully validated the token.",
"{OriginalFormat}": "Successfully validated the token."
},
"Scopes": [
{
"Message": "SpanId:417bbd375cbd0870, TraceId:157005f4185c24249dea2f87133b890f, ParentId:0000000000000000",
"SpanId": "417bbd375cbd0870",
"TraceId": "157005f4185c24249dea2f87133b890f",
"ParentId": "0000000000000000"
},
{
"Message": "ConnectionId:0HMH2P607OT3Q",
"ConnectionId": "0HMH2P607OT3Q"
},
{
"Message": "RequestPath:/gcdump RequestId:0HMH2P607OT3Q:00000002",
"RequestId": "0HMH2P607OT3Q:00000002",
"RequestPath": "/gcdump"
}
]
}
{
"Timestamp": "2022-04-20T14:54:20.2243724Z",
"EventId": 4,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Monitoring.WebApi.EgressOperation",
"Message": "Egressed artifact to /tmp/gcdump/20220420_145418_585_7.gcdump",
"State": {
"Message": "Egressed artifact to /tmp/gcdump/20220420_145418_585_7.gcdump",
"location": "/tmp/gcdump/20220420_145418_585_7.gcdump",
"{OriginalFormat}": "Egressed artifact to {location}"
},
"Scopes": [
{
"Message": "ArtifactType:gcdump ArtifactSource_ProcessId:7 ArtifactSource_RuntimeInstanceCookie:8e064fa4979846d497866ec3cad84db3",
"ArtifactType": "gcdump",
"ArtifactSource_ProcessId": "7",
"ArtifactSource_RuntimeInstanceCookie": "8e064fa4979846d497866ec3cad84db3"
}
]
}
I re-enabled the collection rule on high memory, then started making calls to the /gcdump endpoint (the rule itself never triggered). On a freshly launched container the first 10 or so gcdumps were successful; then it failed again with a timeout. This time around, further requests are failing again. /tmp/port is still there. Not sure if it's random luck or because I re-enabled the rule (in my prior comment, testing without the rule, I got it to time out but further requests kept working):
{
"Timestamp": "2022-04-20T16:19:11.1758418Z",
"EventId": 2,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.LiveJwtBearerHandler",
"Message": "Successfully validated the token.",
"State": {
"Message": "Successfully validated the token.",
"{OriginalFormat}": "Successfully validated the token."
},
"Scopes": [
{
"Message": "SpanId:d08610fd79f115b9, TraceId:0333d198dc6ad02684724b0425818337, ParentId:0000000000000000",
"SpanId": "d08610fd79f115b9",
"TraceId": "0333d198dc6ad02684724b0425818337",
"ParentId": "0000000000000000"
},
{
"Message": "ConnectionId:0HMH2T3ELTG56",
"ConnectionId": "0HMH2T3ELTG56"
},
{
"Message": "RequestPath:/gcdump RequestId:0HMH2T3ELTG56:00000002",
"RequestId": "0HMH2T3ELTG56:00000002",
"RequestPath": "/gcdump"
}
]
}
{
"Timestamp": "2022-04-20T16:19:13.8908825Z",
"EventId": 4,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Monitoring.WebApi.EgressOperation",
"Message": "Egressed artifact to /tmp/gcdump/20220420_161911_176_9.gcdump",
"State": {
"Message": "Egressed artifact to /tmp/gcdump/20220420_161911_176_9.gcdump",
"location": "/tmp/gcdump/20220420_161911_176_9.gcdump",
"{OriginalFormat}": "Egressed artifact to {location}"
},
"Scopes": [
{
"Message": "ArtifactType:gcdump ArtifactSource_ProcessId:9 ArtifactSource_RuntimeInstanceCookie:672855941e7f4fc2a9f5497110c9a870",
"ArtifactType": "gcdump",
"ArtifactSource_ProcessId": "9",
"ArtifactSource_RuntimeInstanceCookie": "672855941e7f4fc2a9f5497110c9a870"
}
]
}
{
"Timestamp": "2022-04-20T16:19:17.4243122Z",
"EventId": 2,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.LiveJwtBearerHandler",
"Message": "Successfully validated the token.",
"State": {
"Message": "Successfully validated the token.",
"{OriginalFormat}": "Successfully validated the token."
},
"Scopes": [
{
"Message": "SpanId:a7d39e4d1c10fd33, TraceId:b3dec8791469c2c5776ef9026c041fac, ParentId:0000000000000000",
"SpanId": "a7d39e4d1c10fd33",
"TraceId": "b3dec8791469c2c5776ef9026c041fac",
"ParentId": "0000000000000000"
},
{
"Message": "ConnectionId:0HMH2T3ELTG57",
"ConnectionId": "0HMH2T3ELTG57"
},
{
"Message": "RequestPath:/gcdump RequestId:0HMH2T3ELTG57:00000002",
"RequestId": "0HMH2T3ELTG57:00000002",
"RequestPath": "/gcdump"
}
]
}
{
"Timestamp": "2022-04-20T16:19:17.9434247Z",
"EventId": 52,
"LogLevel": "Warning",
"Category": "Microsoft.Diagnostics.Tools.Monitor.ServerEndpointInfoSource",
"Message": "Unexpected timeout from process 9. Process will no longer be monitored.",
"State": {
"Message": "Unexpected timeout from process 9. Process will no longer be monitored.",
"processId": "9",
"{OriginalFormat}": "Unexpected timeout from process {processId}. Process will no longer be monitored."
},
"Scopes": []
}
{
"Timestamp": "2022-04-20T16:19:17.9476126Z",
"EventId": 44,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.CollectionRules.CollectionRuleService",
"Message": "Stopping collection rules.",
"State": {
"Message": "Stopping collection rules.",
"{OriginalFormat}": "Stopping collection rules."
},
"Scopes": [
{
"Message": "TargetProcessId:9 TargetRuntimeInstanceCookie:672855941e7f4fc2a9f5497110c9a870",
"TargetProcessId": "9",
"TargetRuntimeInstanceCookie": "672855941e7f4fc2a9f5497110c9a870"
}
]
}
{
"Timestamp": "2022-04-20T16:19:18.1395307Z",
"EventId": 45,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.CollectionRules.CollectionRuleService",
"Message": "All collection rules have stopped.",
"State": {
"Message": "All collection rules have stopped.",
"{OriginalFormat}": "All collection rules have stopped."
},
"Scopes": [
{
"Message": "TargetProcessId:9 TargetRuntimeInstanceCookie:672855941e7f4fc2a9f5497110c9a870",
"TargetProcessId": "9",
"TargetRuntimeInstanceCookie": "672855941e7f4fc2a9f5497110c9a870"
}
]
}
{
"Timestamp": "2022-04-20T16:19:20.7208460Z",
"EventId": 4,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Monitoring.WebApi.EgressOperation",
"Message": "Egressed artifact to /tmp/gcdump/20220420_161917_425_9.gcdump",
"State": {
"Message": "Egressed artifact to /tmp/gcdump/20220420_161917_425_9.gcdump",
"location": "/tmp/gcdump/20220420_161917_425_9.gcdump",
"{OriginalFormat}": "Egressed artifact to {location}"
},
"Scopes": [
{
"Message": "ArtifactType:gcdump ArtifactSource_ProcessId:9 ArtifactSource_RuntimeInstanceCookie:672855941e7f4fc2a9f5497110c9a870",
"ArtifactType": "gcdump",
"ArtifactSource_ProcessId": "9",
"ArtifactSource_RuntimeInstanceCookie": "672855941e7f4fc2a9f5497110c9a870"
}
]
}
{
"Timestamp": "2022-04-20T16:19:38.0013028Z",
"EventId": 2,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.LiveJwtBearerHandler",
"Message": "Successfully validated the token.",
"State": {
"Message": "Successfully validated the token.",
"{OriginalFormat}": "Successfully validated the token."
},
"Scopes": [
{
"Message": "SpanId:33460be09a652da9, TraceId:a8fa906d28d743cbba5ba7fa142d9bce, ParentId:0000000000000000",
"SpanId": "33460be09a652da9",
"TraceId": "a8fa906d28d743cbba5ba7fa142d9bce",
"ParentId": "0000000000000000"
},
{
"Message": "ConnectionId:0HMH2T3ELTG58",
"ConnectionId": "0HMH2T3ELTG58"
},
{
"Message": "RequestPath:/gcdump RequestId:0HMH2T3ELTG58:00000002",
"RequestId": "0HMH2T3ELTG58:00000002",
"RequestPath": "/gcdump"
}
]
}
{
"Timestamp": "2022-04-20T16:19:38.0032943Z",
"EventId": 1,
"LogLevel": "Error",
"Category": "Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagController",
"Message": "Request failed.",
"Exception": "System.ArgumentException: Unable to discover a target process. at Microsoft.Diagnostics.Monitoring.WebApi.DiagnosticServices.GetProcessAsync(DiagProcessFilter processFilterConfig, CancellationToken token) in /_/src/Microsoft.Diagnostics.Monitoring.WebApi/DiagnosticServices.cs:line 100 at Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagController.<>c__DisplayClass33_0`1.<<InvokeForProcess>b__0>d.MoveNext() in /_/src/Microsoft.Diagnostics.Monitoring.WebApi/Controllers/DiagController.cs:line 704 --- End of stack trace from previous location --- at Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagControllerExtensions.InvokeService[T](ControllerBase controller, Func`1 serviceCall, ILogger logger) in /_/src/Microsoft.Diagnostics.Monitoring.WebApi/Controllers/DiagControllerExtensions.cs:line 57",
"State": {
"Message": "Request failed.",
"{OriginalFormat}": "Request failed."
},
"Scopes": [
{
"Message": "SpanId:33460be09a652da9, TraceId:a8fa906d28d743cbba5ba7fa142d9bce, ParentId:0000000000000000",
"SpanId": "33460be09a652da9",
"TraceId": "a8fa906d28d743cbba5ba7fa142d9bce",
"ParentId": "0000000000000000"
},
{
"Message": "ConnectionId:0HMH2T3ELTG58",
"ConnectionId": "0HMH2T3ELTG58"
},
{
"Message": "RequestPath:/gcdump RequestId:0HMH2T3ELTG58:00000002",
"RequestId": "0HMH2T3ELTG58:00000002",
"RequestPath": "/gcdump"
},
{
"Message": "Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagController.CaptureGcDump (Microsoft.Diagnostics.Monitoring.WebApi)",
"ActionId": "fc42e51f-5d23-4dff-bef0-3a2f0d017235",
"ActionName": "Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagController.CaptureGcDump (Microsoft.Diagnostics.Monitoring.WebApi)"
},
{
"Message": "ArtifactType:gcdump",
"ArtifactType": "gcdump"
}
]
}
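For context, the high-memory collection rule being toggled on and off here looks roughly like the sketch below in dotnet-monitor settings. This is an illustration based on the documented collection-rule schema, not our exact configuration; the counter name, threshold value, window duration, and egress provider name ("monitorFile") are assumptions:

```json
{
  "CollectionRules": {
    "HighMemoryRule": {
      "Trigger": {
        "Type": "EventCounter",
        "Settings": {
          "ProviderName": "System.Runtime",
          "CounterName": "working-set",
          "GreaterThan": 5700,
          "SlidingWindowDuration": "00:00:10"
        }
      },
      "Actions": [
        {
          "Type": "CollectGCDump",
          "Settings": { "Egress": "monitorFile" }
        }
      ]
    }
  }
}
```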
So I moved the port to /diag/port and DumpsTempFolder to /dumps as the docs recommended. Same problem :-( On a newly spun-up container the first couple of gc dumps work; then I get "unexpected timeout", but the gc dump still succeeds and I can make subsequent requests. BUT then it happens again with "unexpected timeout", and after that it goes to "unable to discover a target process" (/diag/port still exists):
{
"Timestamp": "2022-04-20T18:50:07.4083755Z",
"EventId": 2,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.LiveJwtBearerHandler",
"Message": "Successfully validated the token.",
"State": {
"Message": "Successfully validated the token.",
"{OriginalFormat}": "Successfully validated the token."
},
"Scopes": [
{
"Message": "SpanId:8d03d15bc184093c, TraceId:f0e5b86b4163b69ac1a9173ac243f82f, ParentId:0000000000000000",
"SpanId": "8d03d15bc184093c",
"TraceId": "f0e5b86b4163b69ac1a9173ac243f82f",
"ParentId": "0000000000000000"
},
{
"Message": "ConnectionId:0HMH2V9I11MVO",
"ConnectionId": "0HMH2V9I11MVO"
},
{
"Message": "RequestPath:/gcdump RequestId:0HMH2V9I11MVO:00000002",
"RequestId": "0HMH2V9I11MVO:00000002",
"RequestPath": "/gcdump"
}
]
}
{
"Timestamp": "2022-04-20T18:50:09.1526928Z",
"EventId": 52,
"LogLevel": "Warning",
"Category": "Microsoft.Diagnostics.Tools.Monitor.ServerEndpointInfoSource",
"Message": "Unexpected timeout from process 7. Process will no longer be monitored.",
"State": {
"Message": "Unexpected timeout from process 7. Process will no longer be monitored.",
"processId": "7",
"{OriginalFormat}": "Unexpected timeout from process {processId}. Process will no longer be monitored."
},
"Scopes": []
}
{
"Timestamp": "2022-04-20T18:50:09.1659161Z",
"EventId": 44,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.CollectionRules.CollectionRuleService",
"Message": "Stopping collection rules.",
"State": {
"Message": "Stopping collection rules.",
"{OriginalFormat}": "Stopping collection rules."
},
"Scopes": [
{
"Message": "TargetProcessId:7 TargetRuntimeInstanceCookie:b53e1d6ac48e4dcb8dcf59857e24a17f",
"TargetProcessId": "7",
"TargetRuntimeInstanceCookie": "b53e1d6ac48e4dcb8dcf59857e24a17f"
}
]
}
{
"Timestamp": "2022-04-20T18:50:09.2524267Z",
"EventId": 40,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.CollectionRules.CollectionRuleService",
"Message": "Starting collection rules.",
"State": {
"Message": "Starting collection rules.",
"{OriginalFormat}": "Starting collection rules."
},
"Scopes": [
{
"Message": "TargetProcessId:7 TargetRuntimeInstanceCookie:b53e1d6ac48e4dcb8dcf59857e24a17f",
"TargetProcessId": "7",
"TargetRuntimeInstanceCookie": "b53e1d6ac48e4dcb8dcf59857e24a17f"
}
]
}
{
"Timestamp": "2022-04-20T18:50:09.2525991Z",
"EventId": 29,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.CollectionRules.CollectionRuleService",
"Message": "Collection rule 'HighMemoryRule' started.",
"State": {
"Message": "Collection rule 'HighMemoryRule' started.",
"ruleName": "HighMemoryRule",
"{OriginalFormat}": "Collection rule '{ruleName}' started."
},
"Scopes": [
{
"Message": "TargetProcessId:7 TargetRuntimeInstanceCookie:b53e1d6ac48e4dcb8dcf59857e24a17f CollectionRuleName:HighMemoryRule",
"TargetProcessId": "7",
"TargetRuntimeInstanceCookie": "b53e1d6ac48e4dcb8dcf59857e24a17f",
"CollectionRuleName": "HighMemoryRule"
}
]
}
{
"Timestamp": "2022-04-20T18:50:09.2527364Z",
"EventId": 35,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.CollectionRules.CollectionRuleService",
"Message": "Collection rule 'HighMemoryRule' trigger 'EventCounter' started.",
"State": {
"Message": "Collection rule 'HighMemoryRule' trigger 'EventCounter' started.",
"ruleName": "HighMemoryRule",
"triggerType": "EventCounter",
"{OriginalFormat}": "Collection rule '{ruleName}' trigger '{triggerType}' started."
},
"Scopes": [
{
"Message": "TargetProcessId:7 TargetRuntimeInstanceCookie:b53e1d6ac48e4dcb8dcf59857e24a17f CollectionRuleName:HighMemoryRule",
"TargetProcessId": "7",
"TargetRuntimeInstanceCookie": "b53e1d6ac48e4dcb8dcf59857e24a17f",
"CollectionRuleName": "HighMemoryRule"
},
{
"Message": "CollectionRuleTriggerType:EventCounter",
"CollectionRuleTriggerType": "EventCounter"
}
]
}
{
"Timestamp": "2022-04-20T18:50:09.2793160Z",
"EventId": 32,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.CollectionRules.CollectionRuleService",
"Message": "All collection rules started.",
"State": {
"Message": "All collection rules started.",
"{OriginalFormat}": "All collection rules started."
},
"Scopes": [
{
"Message": "TargetProcessId:7 TargetRuntimeInstanceCookie:b53e1d6ac48e4dcb8dcf59857e24a17f",
"TargetProcessId": "7",
"TargetRuntimeInstanceCookie": "b53e1d6ac48e4dcb8dcf59857e24a17f"
}
]
}
{
"Timestamp": "2022-04-20T18:50:09.3916058Z",
"EventId": 45,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.CollectionRules.CollectionRuleService",
"Message": "All collection rules have stopped.",
"State": {
"Message": "All collection rules have stopped.",
"{OriginalFormat}": "All collection rules have stopped."
},
"Scopes": [
{
"Message": "TargetProcessId:7 TargetRuntimeInstanceCookie:b53e1d6ac48e4dcb8dcf59857e24a17f",
"TargetProcessId": "7",
"TargetRuntimeInstanceCookie": "b53e1d6ac48e4dcb8dcf59857e24a17f"
}
]
}
{
"Timestamp": "2022-04-20T18:50:11.0080567Z",
"EventId": 4,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Monitoring.WebApi.EgressOperation",
"Message": "Egressed artifact to /tmp/gcdump/20220420_185007_410_7.gcdump",
"State": {
"Message": "Egressed artifact to /tmp/gcdump/20220420_185007_410_7.gcdump",
"location": "/tmp/gcdump/20220420_185007_410_7.gcdump",
"{OriginalFormat}": "Egressed artifact to {location}"
},
"Scopes": [
{
"Message": "ArtifactType:gcdump ArtifactSource_ProcessId:7 ArtifactSource_RuntimeInstanceCookie:b53e1d6ac48e4dcb8dcf59857e24a17f",
"ArtifactType": "gcdump",
"ArtifactSource_ProcessId": "7",
"ArtifactSource_RuntimeInstanceCookie": "b53e1d6ac48e4dcb8dcf59857e24a17f"
}
]
}
{
"Timestamp": "2022-04-20T18:50:11.0699909Z",
"EventId": 5,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Monitoring.WebApi.OutputStreamResult",
"Message": "Written to HTTP stream.",
"State": {
"Message": "Written to HTTP stream.",
"{OriginalFormat}": "Written to HTTP stream."
},
"Scopes": [
{
"Message": "SpanId:ec6999cc44d5cd93, TraceId:b3ad826b2977b98b0d239fdd94164cba, ParentId:8eefce7af93b1945",
"SpanId": "ec6999cc44d5cd93",
"TraceId": "b3ad826b2977b98b0d239fdd94164cba",
"ParentId": "8eefce7af93b1945"
},
{
"Message": "ConnectionId:0HMH2V9I11MVN",
"ConnectionId": "0HMH2V9I11MVN"
},
{
"Message": "RequestPath:/livemetrics RequestId:0HMH2V9I11MVN:00000002",
"RequestId": "0HMH2V9I11MVN:00000002",
"RequestPath": "/livemetrics"
},
{
"Message": "Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagController.CaptureMetrics (Microsoft.Diagnostics.Monitoring.WebApi)",
"ActionId": "7285c1f5-03cd-4516-a55c-b35a48ba044a",
"ActionName": "Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagController.CaptureMetrics (Microsoft.Diagnostics.Monitoring.WebApi)"
},
{
"Message": "ArtifactType:livemetrics ArtifactSource_ProcessId:7 ArtifactSource_RuntimeInstanceCookie:b53e1d6ac48e4dcb8dcf59857e24a17f",
"ArtifactType": "livemetrics",
"ArtifactSource_ProcessId": "7",
"ArtifactSource_RuntimeInstanceCookie": "b53e1d6ac48e4dcb8dcf59857e24a17f"
}
]
}
{
"Timestamp": "2022-04-20T18:50:30.6010840Z",
"EventId": 2,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.LiveJwtBearerHandler",
"Message": "Successfully validated the token.",
"State": {
"Message": "Successfully validated the token.",
"{OriginalFormat}": "Successfully validated the token."
},
"Scopes": [
{
"Message": "SpanId:7ca19d809cb0a1dc, TraceId:58a4a0511e3aaf467a748a19d2646acb, ParentId:0000000000000000",
"SpanId": "7ca19d809cb0a1dc",
"TraceId": "58a4a0511e3aaf467a748a19d2646acb",
"ParentId": "0000000000000000"
},
{
"Message": "ConnectionId:0HMH2V9I11MVP",
"ConnectionId": "0HMH2V9I11MVP"
},
{
"Message": "RequestPath:/gcdump RequestId:0HMH2V9I11MVP:00000002",
"RequestId": "0HMH2V9I11MVP:00000002",
"RequestPath": "/gcdump"
}
]
}
{
"Timestamp": "2022-04-20T18:50:34.3168668Z",
"EventId": 4,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Monitoring.WebApi.EgressOperation",
"Message": "Egressed artifact to /tmp/gcdump/20220420_185030_606_7.gcdump",
"State": {
"Message": "Egressed artifact to /tmp/gcdump/20220420_185030_606_7.gcdump",
"location": "/tmp/gcdump/20220420_185030_606_7.gcdump",
"{OriginalFormat}": "Egressed artifact to {location}"
},
"Scopes": [
{
"Message": "ArtifactType:gcdump ArtifactSource_ProcessId:7 ArtifactSource_RuntimeInstanceCookie:b53e1d6ac48e4dcb8dcf59857e24a17f",
"ArtifactType": "gcdump",
"ArtifactSource_ProcessId": "7",
"ArtifactSource_RuntimeInstanceCookie": "b53e1d6ac48e4dcb8dcf59857e24a17f"
}
]
}
{
"Timestamp": "2022-04-20T18:51:12.3892161Z",
"EventId": 2,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.LiveJwtBearerHandler",
"Message": "Successfully validated the token.",
"State": {
"Message": "Successfully validated the token.",
"{OriginalFormat}": "Successfully validated the token."
},
"Scopes": [
{
"Message": "SpanId:ad90d96e4a48dafe, TraceId:18da0341234ecb0cce826273da5385a6, ParentId:0000000000000000",
"SpanId": "ad90d96e4a48dafe",
"TraceId": "18da0341234ecb0cce826273da5385a6",
"ParentId": "0000000000000000"
},
{
"Message": "ConnectionId:0HMH2V9I11MVR",
"ConnectionId": "0HMH2V9I11MVR"
},
{
"Message": "RequestPath:/gcdump RequestId:0HMH2V9I11MVR:00000002",
"RequestId": "0HMH2V9I11MVR:00000002",
"RequestPath": "/gcdump"
}
]
}
{
"Timestamp": "2022-04-20T18:51:12.8116760Z",
"EventId": 52,
"LogLevel": "Warning",
"Category": "Microsoft.Diagnostics.Tools.Monitor.ServerEndpointInfoSource",
"Message": "Unexpected timeout from process 7. Process will no longer be monitored.",
"State": {
"Message": "Unexpected timeout from process 7. Process will no longer be monitored.",
"processId": "7",
"{OriginalFormat}": "Unexpected timeout from process {processId}. Process will no longer be monitored."
},
"Scopes": []
}
{
"Timestamp": "2022-04-20T18:51:12.8119827Z",
"EventId": 44,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.CollectionRules.CollectionRuleService",
"Message": "Stopping collection rules.",
"State": {
"Message": "Stopping collection rules.",
"{OriginalFormat}": "Stopping collection rules."
},
"Scopes": [
{
"Message": "TargetProcessId:7 TargetRuntimeInstanceCookie:b53e1d6ac48e4dcb8dcf59857e24a17f",
"TargetProcessId": "7",
"TargetRuntimeInstanceCookie": "b53e1d6ac48e4dcb8dcf59857e24a17f"
}
]
}
{
"Timestamp": "2022-04-20T18:51:13.0701847Z",
"EventId": 45,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.CollectionRules.CollectionRuleService",
"Message": "All collection rules have stopped.",
"State": {
"Message": "All collection rules have stopped.",
"{OriginalFormat}": "All collection rules have stopped."
},
"Scopes": [
{
"Message": "TargetProcessId:7 TargetRuntimeInstanceCookie:b53e1d6ac48e4dcb8dcf59857e24a17f",
"TargetProcessId": "7",
"TargetRuntimeInstanceCookie": "b53e1d6ac48e4dcb8dcf59857e24a17f"
}
]
}
{
"Timestamp": "2022-04-20T18:51:15.8297924Z",
"EventId": 4,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Monitoring.WebApi.EgressOperation",
"Message": "Egressed artifact to /tmp/gcdump/20220420_185112_394_7.gcdump",
"State": {
"Message": "Egressed artifact to /tmp/gcdump/20220420_185112_394_7.gcdump",
"location": "/tmp/gcdump/20220420_185112_394_7.gcdump",
"{OriginalFormat}": "Egressed artifact to {location}"
},
"Scopes": [
{
"Message": "ArtifactType:gcdump ArtifactSource_ProcessId:7 ArtifactSource_RuntimeInstanceCookie:b53e1d6ac48e4dcb8dcf59857e24a17f",
"ArtifactType": "gcdump",
"ArtifactSource_ProcessId": "7",
"ArtifactSource_RuntimeInstanceCookie": "b53e1d6ac48e4dcb8dcf59857e24a17f"
}
]
}
{
"Timestamp": "2022-04-20T18:51:28.7445777Z",
"EventId": 2,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.LiveJwtBearerHandler",
"Message": "Successfully validated the token.",
"State": {
"Message": "Successfully validated the token.",
"{OriginalFormat}": "Successfully validated the token."
},
"Scopes": [
{
"Message": "SpanId:882994a4060b7795, TraceId:3c8ad5e70d400f8acc73f15c6e5e3af9, ParentId:0000000000000000",
"SpanId": "882994a4060b7795",
"TraceId": "3c8ad5e70d400f8acc73f15c6e5e3af9",
"ParentId": "0000000000000000"
},
{
"Message": "ConnectionId:0HMH2V9I11MVS",
"ConnectionId": "0HMH2V9I11MVS"
},
{
"Message": "RequestPath:/gcdump RequestId:0HMH2V9I11MVS:00000002",
"RequestId": "0HMH2V9I11MVS:00000002",
"RequestPath": "/gcdump"
}
]
}
{
"Timestamp": "2022-04-20T18:51:28.7681109Z",
"EventId": 1,
"LogLevel": "Error",
"Category": "Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagController",
"Message": "Request failed.",
"Exception": "System.ArgumentException: Unable to discover a target process. at Microsoft.Diagnostics.Monitoring.WebApi.DiagnosticServices.GetProcessAsync(DiagProcessFilter processFilterConfig, CancellationToken token) in /_/src/Microsoft.Diagnostics.Monitoring.WebApi/DiagnosticServices.cs:line 100 at Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagController.<>c__DisplayClass33_0`1.<<InvokeForProcess>b__0>d.MoveNext() in /_/src/Microsoft.Diagnostics.Monitoring.WebApi/Controllers/DiagController.cs:line 713 --- End of stack trace from previous location --- at Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagControllerExtensions.InvokeService[T](ControllerBase controller, Func`1 serviceCall, ILogger logger) in /_/src/Microsoft.Diagnostics.Monitoring.WebApi/Controllers/DiagControllerExtensions.cs:line 91",
"State": {
"Message": "Request failed.",
"{OriginalFormat}": "Request failed."
},
"Scopes": [
{
"Message": "SpanId:882994a4060b7795, TraceId:3c8ad5e70d400f8acc73f15c6e5e3af9, ParentId:0000000000000000",
"SpanId": "882994a4060b7795",
"TraceId": "3c8ad5e70d400f8acc73f15c6e5e3af9",
"ParentId": "0000000000000000"
},
{
"Message": "ConnectionId:0HMH2V9I11MVS",
"ConnectionId": "0HMH2V9I11MVS"
},
{
"Message": "RequestPath:/gcdump RequestId:0HMH2V9I11MVS:00000002",
"RequestId": "0HMH2V9I11MVS:00000002",
"RequestPath": "/gcdump"
},
{
"Message": "Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagController.CaptureGcDump (Microsoft.Diagnostics.Monitoring.WebApi)",
"ActionId": "ae104f35-1f39-4b8f-84f3-014747a5c8b0",
"ActionName": "Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagController.CaptureGcDump (Microsoft.Diagnostics.Monitoring.WebApi)"
},
{
"Message": "ArtifactType:gcdump",
"ArtifactType": "gcdump"
}
]
}
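For anyone reproducing this, the relevant settings after the move look roughly like the sketch below. This is based on the dotnet-monitor configuration docs rather than our exact file; key names may vary slightly by version (note the documented key is DumpTempFolder under Storage):

```json
{
  "DiagnosticPort": {
    "ConnectionMode": "Listen",
    "EndpointName": "/diag/port"
  },
  "Storage": {
    "DumpTempFolder": "/dumps"
  }
}
```

The application container then needs the matching runtime setting (e.g. the DOTNET_DiagnosticPorts environment variable pointing at the same shared path) so the runtime connects to the monitor over the mounted volume.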
I have the same problem. I installed dotnet-monitor on 2 deployments. One of them works fine, but the other loses connection after some time.
I start the deployment, make a few gcdumps over the endpoint, then come back 30 minutes later and it does not work anymore.
EDIT: When I restart the monitor container (not the pod) it works again. EDIT 2: The restart has worked three times now.
I just saw a new version came out with code changes to address this. I re-enabled the container in our development environment. It is looking good so far: I took 30+ gc dumps continuously to test, with no issues, whereas previously I could trigger the scenario above fairly quickly. Will deploy the change to production later this week to run against real-world usage; hopefully it looks good there as well. Thank you!
edit: Closing this ticket as well so I don't forget, since I believe this to be working now. I'll re-open if I see different results in our prod environment.
So we just deployed the latest dotnet monitor to prod. I re-enabled hitting /livemetrics every 5 minutes ... within 30 minutes or so, 1 out of 4 containers started getting the same error message from dotnet monitor :-( No GC dumps were taken, and the container never stopped processing user requests successfully.
{
"Timestamp": "2022-07-01T14:00:01.0111921Z",
"EventId": 2,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.LiveJwtBearerHandler",
"Message": "Successfully validated the token.",
"State": {
"Message": "Successfully validated the token.",
"{OriginalFormat}": "Successfully validated the token."
},
"Scopes": [
{
"Message": "SpanId:9fa40d2041202592, TraceId:ae38f7c3e4d83e8559019aee39ea50d9, ParentId:dbe31cb4d270f899",
"SpanId": "9fa40d2041202592",
"TraceId": "ae38f7c3e4d83e8559019aee39ea50d9",
"ParentId": "dbe31cb4d270f899"
},
{
"Message": "ConnectionId:0HMIRCMP5M5GF",
"ConnectionId": "0HMIRCMP5M5GF"
},
{
"Message": "RequestPath:/livemetrics RequestId:0HMIRCMP5M5GF:00000002",
"RequestId": "0HMIRCMP5M5GF:00000002",
"RequestPath": "/livemetrics"
}
]
}
{
"Timestamp": "2022-07-01T14:00:08.1066647Z",
"EventId": 5,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Monitoring.WebApi.OutputStreamResult",
"Message": "Written to HTTP stream.",
"State": {
"Message": "Written to HTTP stream.",
"{OriginalFormat}": "Written to HTTP stream."
},
"Scopes": [
{
"Message": "SpanId:9fa40d2041202592, TraceId:ae38f7c3e4d83e8559019aee39ea50d9, ParentId:dbe31cb4d270f899",
"SpanId": "9fa40d2041202592",
"TraceId": "ae38f7c3e4d83e8559019aee39ea50d9",
"ParentId": "dbe31cb4d270f899"
},
{
"Message": "ConnectionId:0HMIRCMP5M5GF",
"ConnectionId": "0HMIRCMP5M5GF"
},
{
"Message": "RequestPath:/livemetrics RequestId:0HMIRCMP5M5GF:00000002",
"RequestId": "0HMIRCMP5M5GF:00000002",
"RequestPath": "/livemetrics"
},
{
"Message": "Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagController.CaptureMetrics (Microsoft.Diagnostics.Monitoring.WebApi)",
"ActionId": "8accbb03-0591-4ce4-8407-5a631e02c09d",
"ActionName": "Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagController.CaptureMetrics (Microsoft.Diagnostics.Monitoring.WebApi)"
},
{
"Message": "ArtifactType:livemetrics ArtifactSource_ProcessId:8 ArtifactSource_RuntimeInstanceCookie:08449004064f4286bffdf0f33935d4b3",
"ArtifactType": "livemetrics",
"ArtifactSource_ProcessId": "8",
"ArtifactSource_RuntimeInstanceCookie": "08449004064f4286bffdf0f33935d4b3"
}
]
}
{
"Timestamp": "2022-07-01T14:05:02.0873006Z",
"EventId": 2,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.LiveJwtBearerHandler",
"Message": "Successfully validated the token.",
"State": {
"Message": "Successfully validated the token.",
"{OriginalFormat}": "Successfully validated the token."
},
"Scopes": [
{
"Message": "SpanId:30687ae247c20cfb, TraceId:ef5426eeaf912d015c59f0eebf0bf9e5, ParentId:2d02014ed370a9e5",
"SpanId": "30687ae247c20cfb",
"TraceId": "ef5426eeaf912d015c59f0eebf0bf9e5",
"ParentId": "2d02014ed370a9e5"
},
{
"Message": "ConnectionId:0HMIRCMP5M5GG",
"ConnectionId": "0HMIRCMP5M5GG"
},
{
"Message": "RequestPath:/livemetrics RequestId:0HMIRCMP5M5GG:00000002",
"RequestId": "0HMIRCMP5M5GG:00000002",
"RequestPath": "/livemetrics"
}
]
}
{
"Timestamp": "2022-07-01T14:05:08.0963972Z",
"EventId": 52,
"LogLevel": "Warning",
"Category": "Microsoft.Diagnostics.Tools.Monitor.ServerEndpointInfoSource",
"Message": "Unexpected timeout from process 8. Process will no longer be monitored.",
"State": {
"Message": "Unexpected timeout from process 8. Process will no longer be monitored.",
"processId": "8",
"{OriginalFormat}": "Unexpected timeout from process {processId}. Process will no longer be monitored."
},
"Scopes": []
}
{
"Timestamp": "2022-07-01T14:05:08.0970551Z",
"EventId": 5,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Monitoring.WebApi.OutputStreamResult",
"Message": "Written to HTTP stream.",
"State": {
"Message": "Written to HTTP stream.",
"{OriginalFormat}": "Written to HTTP stream."
},
"Scopes": [
{
"Message": "SpanId:30687ae247c20cfb, TraceId:ef5426eeaf912d015c59f0eebf0bf9e5, ParentId:2d02014ed370a9e5",
"SpanId": "30687ae247c20cfb",
"TraceId": "ef5426eeaf912d015c59f0eebf0bf9e5",
"ParentId": "2d02014ed370a9e5"
},
{
"Message": "ConnectionId:0HMIRCMP5M5GG",
"ConnectionId": "0HMIRCMP5M5GG"
},
{
"Message": "RequestPath:/livemetrics RequestId:0HMIRCMP5M5GG:00000002",
"RequestId": "0HMIRCMP5M5GG:00000002",
"RequestPath": "/livemetrics"
},
{
"Message": "Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagController.CaptureMetrics (Microsoft.Diagnostics.Monitoring.WebApi)",
"ActionId": "8accbb03-0591-4ce4-8407-5a631e02c09d",
"ActionName": "Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagController.CaptureMetrics (Microsoft.Diagnostics.Monitoring.WebApi)"
},
{
"Message": "ArtifactType:livemetrics ArtifactSource_ProcessId:8 ArtifactSource_RuntimeInstanceCookie:08449004064f4286bffdf0f33935d4b3",
"ArtifactType": "livemetrics",
"ArtifactSource_ProcessId": "8",
"ArtifactSource_RuntimeInstanceCookie": "08449004064f4286bffdf0f33935d4b3"
}
]
}
{
"Timestamp": "2022-07-01T14:05:08.1019287Z",
"EventId": 44,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.CollectionRules.CollectionRuleService",
"Message": "Stopping collection rules.",
"State": {
"Message": "Stopping collection rules.",
"{OriginalFormat}": "Stopping collection rules."
},
"Scopes": [
{
"Message": "TargetProcessId:8 TargetRuntimeInstanceCookie:08449004064f4286bffdf0f33935d4b3",
"TargetProcessId": "8",
"TargetRuntimeInstanceCookie": "08449004064f4286bffdf0f33935d4b3"
}
]
}
{
"Timestamp": "2022-07-01T14:05:08.2234506Z",
"EventId": 45,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.CollectionRules.CollectionRuleService",
"Message": "All collection rules have stopped.",
"State": {
"Message": "All collection rules have stopped.",
"{OriginalFormat}": "All collection rules have stopped."
},
"Scopes": [
{
"Message": "TargetProcessId:8 TargetRuntimeInstanceCookie:08449004064f4286bffdf0f33935d4b3",
"TargetProcessId": "8",
"TargetRuntimeInstanceCookie": "08449004064f4286bffdf0f33935d4b3"
}
]
}
{
"Timestamp": "2022-07-01T14:10:03.5723478Z",
"EventId": 2,
"LogLevel": "Information",
"Category": "Microsoft.Diagnostics.Tools.Monitor.LiveJwtBearerHandler",
"Message": "Successfully validated the token.",
"State": {
"Message": "Successfully validated the token.",
"{OriginalFormat}": "Successfully validated the token."
},
"Scopes": [
{
"Message": "SpanId:cab9dde8f67d58fb, TraceId:f466c3e8ebb202208a5ec6d42f013591, ParentId:263c08c90ec8929c",
"SpanId": "cab9dde8f67d58fb",
"TraceId": "f466c3e8ebb202208a5ec6d42f013591",
"ParentId": "263c08c90ec8929c"
},
{
"Message": "ConnectionId:0HMIRCMP5M5GH",
"ConnectionId": "0HMIRCMP5M5GH"
},
{
"Message": "RequestPath:/livemetrics RequestId:0HMIRCMP5M5GH:00000002",
"RequestId": "0HMIRCMP5M5GH:00000002",
"RequestPath": "/livemetrics"
}
]
}
{
"Timestamp": "2022-07-01T14:10:03.5739343Z",
"EventId": 1,
"LogLevel": "Error",
"Category": "Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagController",
"Message": "Request failed.",
"Exception": "System.ArgumentException: Unable to discover a target process. at Microsoft.Diagnostics.Monitoring.WebApi.DiagnosticServices.GetProcessAsync(DiagProcessFilter processFilterConfig, CancellationToken token) in /_/src/Microsoft.Diagnostics.Monitoring.WebApi/DiagnosticServices.cs:line 94 at Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagController.<>c__DisplayClass33_0`1.<<InvokeForProcess>b__0>d.MoveNext() in /_/src/Microsoft.Diagnostics.Monitoring.WebApi/Controllers/DiagController.cs:line 719 --- End of stack trace from previous location --- at Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagControllerExtensions.InvokeService[T](ControllerBase controller, Func`1 serviceCall, ILogger logger) in /_/src/Microsoft.Diagnostics.Monitoring.WebApi/Controllers/DiagControllerExtensions.cs:line 57",
"State": {
"Message": "Request failed.",
"{OriginalFormat}": "Request failed."
},
"Scopes": [
{
"Message": "SpanId:cab9dde8f67d58fb, TraceId:f466c3e8ebb202208a5ec6d42f013591, ParentId:263c08c90ec8929c",
"SpanId": "cab9dde8f67d58fb",
"TraceId": "f466c3e8ebb202208a5ec6d42f013591",
"ParentId": "263c08c90ec8929c"
},
{
"Message": "ConnectionId:0HMIRCMP5M5GH",
"ConnectionId": "0HMIRCMP5M5GH"
},
{
"Message": "RequestPath:/livemetrics RequestId:0HMIRCMP5M5GH:00000002",
"RequestId": "0HMIRCMP5M5GH:00000002",
"RequestPath": "/livemetrics"
},
{
"Message": "Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagController.CaptureMetrics (Microsoft.Diagnostics.Monitoring.WebApi)",
"ActionId": "8accbb03-0591-4ce4-8407-5a631e02c09d",
"ActionName": "Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagController.CaptureMetrics (Microsoft.Diagnostics.Monitoring.WebApi)"
},
{
"Message": "ArtifactType:livemetrics",
"ArtifactType": "livemetrics"
}
]
}
I have the same issue while trying to get a dump
{"Timestamp":"2022-10-13T11:00:30.8691337Z","EventId":5,"LogLevel":"Information","Category":"Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagController","Message":"Written to HTTP stream.","State":{"Message":"Written to HTTP stream.","{OriginalFormat}":"Written to HTTP stream."},"Scopes":[{"Message":"SpanId:fa17baa5e04a3e29, TraceId:8ba32a3ccdfaf56225fbef1f131d92da, ParentId:0000000000000000","SpanId":"fa17baa5e04a3e29","TraceId":"8ba32a3ccdfaf56225fbef1f131d92da","ParentId":"0000000000000000"},{"Message":"ConnectionId:0HMLCVOLKK104","ConnectionId":"0HMLCVOLKK104"},{"Message":"RequestPath:/processes RequestId:0HMLCVOLKK104:00000002","RequestId":"0HMLCVOLKK104:00000002","RequestPath":"/processes"},{"Message":"Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagController.GetProcesses (Microsoft.Diagnostics.Monitoring.WebApi)","ActionId":"77d736b1-5d9e-4014-b50b-846d360b12c6","ActionName":"Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagController.GetProcesses (Microsoft.Diagnostics.Monitoring.WebApi)"}]}
{"Timestamp":"2022-10-13T11:00:56.6623617Z","EventId":13,"LogLevel":"Error","Category":"Microsoft.AspNetCore.Server.Kestrel","Message":"Connection id \u00220HMLCVOLKK104\u0022, Request id \u00220HMLCVOLKK104:00000003\u0022: An unhandled exception was thrown by the application.","Exception":"System.IO.EndOfStreamException: Attempted to read past the end of the stream. at Microsoft.Diagnostics.NETCore.Client.StreamExtensions.ReadBytesAsync(Stream stream, Int32 length, CancellationToken cancellationToken) at Microsoft.Diagnostics.NETCore.Client.IpcHeader.ParseAsync(Stream stream, CancellationToken cancellationToken) at Microsoft.Diagnostics.NETCore.Client.IpcMessage.ParseAsync(Stream stream, CancellationToken cancellationToken) at Microsoft.Diagnostics.NETCore.Client.IpcClient.SendMessageGetContinuationAsync(IpcEndpoint endpoint, IpcMessage message, CancellationToken cancellationToken) at Microsoft.Diagnostics.NETCore.Client.IpcClient.SendMessageAsync(IpcEndpoint endpoint, IpcMessage message, CancellationToken cancellationToken) at Microsoft.Diagnostics.NETCore.Client.DiagnosticsClient.WriteDumpAsync(DumpType dumpType, String dumpPath, WriteDumpFlags flags, CancellationToken token) at Microsoft.Diagnostics.Monitoring.WebApi.DumpService.DumpAsync(IEndpointInfo endpointInfo, DumpType mode, CancellationToken token) in /_/src/Microsoft.Diagnostics.Monitoring.WebApi/DumpService.cs:line 77 at Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagController.\u003C\u003Ec__DisplayClass15_0.\u003C\u003CCaptureDump\u003Eb__0\u003Ed.MoveNext() in /_/src/Microsoft.Diagnostics.Monitoring.WebApi/Controllers/DiagController.cs:line 223 --- End of stack trace from previous location --- at Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagController.\u003C\u003Ec__DisplayClass31_0.\u003C\u003CInvokeForProcess\u003Eb__0\u003Ed.MoveNext() in /_/src/Microsoft.Diagnostics.Monitoring.WebApi/Controllers/DiagController.cs:line 702 --- End of stack trace from previous location --- at 
Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagController.\u003C\u003Ec__DisplayClass33_0\u00601.\u003C\u003CInvokeForProcess\u003Eb__0\u003Ed.MoveNext() in /_/src/Microsoft.Diagnostics.Monitoring.WebApi/Controllers/DiagController.cs:line 734 --- End of stack trace from previous location --- at Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagControllerExtensions.InvokeService[T](ControllerBase controller, Func\u00601 serviceCall, ILogger logger) in /_/src/Microsoft.Diagnostics.Monitoring.WebApi/Controllers/DiagControllerExtensions.cs:line 91 at Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagController.InvokeForProcess[T](Func\u00602 func, Nullable\u00601 processKey, String artifactType) in /_/src/Microsoft.Diagnostics.Monitoring.WebApi/Controllers/DiagController.cs:line 724 at Microsoft.Diagnostics.Monitoring.WebApi.Controllers.DiagController.InvokeForProcess(Func\u00602 func, Nullable\u00601 processKey, String artifactType) in /_/src/Microsoft.Diagnostics.Monitoring.WebApi/Controllers/DiagController.cs:line 702 at lambda_method15(Closure , Object ) at Microsoft.AspNetCore.Mvc.Infrastructure.ActionMethodExecutor.TaskOfActionResultExecutor.Execute(IActionResultTypeMapper mapper, ObjectMethodExecutor executor, Object controller, Object[] arguments) at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.\u003CInvokeActionMethodAsync\u003Eg__Awaited|12_0(ControllerActionInvoker invoker, ValueTask\u00601 actionResultValueTask) at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.\u003CInvokeNextActionFilterAsync\u003Eg__Awaited|10_0(ControllerActionInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted) at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.Rethrow(ActionExecutedContextSealed context) at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.Next(State\u0026 next, Scope\u0026 scope, Object\u0026 state, Boolean\u0026 isCompleted) at 
Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.\u003CInvokeInnerFilterAsync\u003Eg__Awaited|13_0(ControllerActionInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted) at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.\u003CInvokeNextExceptionFilterAsync\u003Eg__Awaited|26_0(ResourceInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted) at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.Rethrow(ExceptionContextSealed context) at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.Next(State\u0026 next, Scope\u0026 scope, Object\u0026 state, Boolean\u0026 isCompleted) at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.\u003CInvokeFilterPipelineAsync\u003Eg__Awaited|20_0(ResourceInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted) at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.\u003CInvokeAsync\u003Eg__Awaited|17_0(ResourceInvoker invoker, Task task, IDisposable scope) at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.\u003CInvokeAsync\u003Eg__Awaited|17_0(ResourceInvoker invoker, Task task, IDisposable scope) at Microsoft.AspNetCore.Routing.EndpointMiddleware.\u003CInvoke\u003Eg__AwaitRequestTask|6_0(Endpoint endpoint, Task requestTask, ILogger logger) at Microsoft.Diagnostics.Tools.Monitor.RequestLimitMiddleware.Invoke(HttpContext context) in /_/src/Tools/dotnet-monitor/RequestLimitMiddleware.cs:line 73 at Microsoft.AspNetCore.Authorization.Policy.AuthorizationMiddlewareResultHandler.HandleAsync(RequestDelegate next, HttpContext context, AuthorizationPolicy policy, PolicyAuthorizationResult authorizeResult) at Microsoft.AspNetCore.Authorization.AuthorizationMiddleware.Invoke(HttpContext context) at Microsoft.AspNetCore.Authentication.AuthenticationMiddleware.Invoke(HttpContext context) at 
Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http.HttpProtocol.ProcessRequests[TContext](IHttpApplication\u00601 application)","State":{"Message":"Connection id \u00220HMLCVOLKK104\u0022, Request id \u00220HMLCVOLKK104:00000003\u0022: An unhandled exception was thrown by the application.","ConnectionId":"0HMLCVOLKK104","TraceIdentifier":"0HMLCVOLKK104:00000003","{OriginalFormat}":"Connection id \u0022{ConnectionId}\u0022, Request id \u0022{TraceIdentifier}\u0022: An unhandled exception was thrown by the application."},"Scopes":[{"Message":"SpanId:a2b8f55f3888f452, TraceId:19aae55dbdb5a27ca561e138731fd0e2, ParentId:0000000000000000","SpanId":"a2b8f55f3888f452","TraceId":"19aae55dbdb5a27ca561e138731fd0e2","ParentId":"0000000000000000"},{"Message":"ConnectionId:0HMLCVOLKK104","ConnectionId":"0HMLCVOLKK104"},{"Message":"RequestPath:/dump RequestId:0HMLCVOLKK104:00000003","RequestId":"0HMLCVOLKK104:00000003","RequestPath":"/dump"}]}
{"Timestamp":"2022-10-13T11:00:58.2183792Z","EventId":52,"LogLevel":"Warning","Category":"Microsoft.Diagnostics.Tools.Monitor.ServerEndpointInfoSource","Message":"Unexpected timeout from process 13. Process will no longer be monitored.","State":{"Message":"Unexpected timeout from process 13. Process will no longer be monitored.","processId":"13","{OriginalFormat}":"Unexpected timeout from process {processId}. Process will no longer be monitored."},"Scopes":[]}
{"Timestamp":"2022-10-13T11:00:58.2221081Z","EventId":44,"LogLevel":"Information","Category":"Microsoft.Diagnostics.Tools.Monitor.CollectionRules.CollectionRuleService","Message":"Stopping collection rules.","State":{"Message":"Stopping collection rules.","{OriginalFormat}":"Stopping collection rules."},"Scopes":[{"Message":"TargetProcessId:13 TargetRuntimeInstanceCookie:2dd906d0646846b9b3ce0b2f07b07a68","TargetProcessId":"13","TargetRuntimeInstanceCookie":"2dd906d0646846b9b3ce0b2f07b07a68"}]}
{"Timestamp":"2022-10-13T11:00:58.2237482Z","EventId":45,"LogLevel":"Information","Category":"Microsoft.Diagnostics.Tools.Monitor.CollectionRules.CollectionRuleService","Message":"All collection rules have stopped.","State":{"Message":"All collection rules have stopped.","{OriginalFormat}":"All collection rules have stopped."},"Scopes":[{"Message":"TargetProcessId:13 TargetRuntimeInstanceCookie:2dd906d0646846b9b3ce0b2f07b07a68","TargetProcessId":"13","TargetRuntimeInstanceCookie":"2dd906d0646846b9b3ce0b2f07b07a68"}]}
@xsoheilalizadeh
I also had exactly the same problem: System.IO.EndOfStreamException: Attempted to read past the end of the stream.
/gcdump and /trace worked perfectly; only /dump produced this error.
It turned out that the resource limits for my target container were too tight, and the pod crashed immediately when I invoked the /dump endpoint (likely an out-of-memory kill).
After I removed the resource requests/limits from the yaml, everything worked as desired :-)
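For reference, the kind of resource section that had to be loosened looks roughly like the sketch below (all names and values are hypothetical, not taken from this thread). Collecting a full dump temporarily needs extra memory and disk in the target container, so a memory limit sized only for steady-state traffic can get the pod OOM-killed mid-dump:

```yaml
# Hypothetical pod spec fragment -- adjust names/values to your deployment.
containers:
  - name: myapp
    image: myregistry/myapp:latest
    resources:
      requests:
        memory: "512Mi"
        cpu: "500m"
      limits:
        # Too tight: dumping a heap near 512Mi will push the container
        # over this limit and trigger an OOM kill while /dump is running.
        memory: "512Mi"
        cpu: "1"
```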
On Ubuntu 22.04 with .NET 7:
# terminal 1
dotnet tool install --global dotnet-monitor
dotnet monitor collect -u http://localhost:52324 --no-auth --diagnostic-port /tmp/myport.sock
# terminal 2
dotnet publish -c Release -o pub --self-contained
DOTNET_DiagnosticPorts=/tmp/myport.sock pub/helloworldapp
In terminal 1 the logs showed the target process id as 15180, but it timed out:
06:15:32 warn: Microsoft.Diagnostics.Tools.Monitor.ServerEndpointInfoSource[52] Unexpected timeout from process 15180. Process will no longer be monitored.
06:15:32 info: Microsoft.Diagnostics.Tools.Monitor.CollectionRules.CollectionRuleService[44] => TargetProcessId:16045 TargetRuntimeInstanceCookie:3124459885fd4a40b0709b167d709e50 Stopping collection rules.
06:15:32 info: Microsoft.Diagnostics.Tools.Monitor.CollectionRules.CollectionRuleService[45] => TargetProcessId:15180 TargetRuntimeInstanceCookie:3124459885fd4a40b0709b167d709e50 All collection rules have stopped.
I opened http://localhost:52324/logs?pid=15180
and got this:
{"status":400,"detail":"Unable to discover a target process."}
Same for /trace?pid=..
I've tested multiple times; same result with different pids.
workaround:
# shutdown dotnet-monitor in terminal 1 and use dotnet-trace instead
dotnet tool install --global dotnet-trace
dotnet-trace collect --diagnostic-port /tmp/myport.sock --format speedscope
# terminal 2
DOTNET_DiagnosticPorts=/tmp/myport.sock pub/helloworldapp
Works flawlessly; I can visualize the generated report at https://www.speedscope.app/
So this has to be a dotnet-monitor bug.
> so this has to be a dotnet-monitor bug.
That doesn't prove very much in this case:

The dotnet-trace tool makes at most three diagnostic connections to the target process (one to start the event session, one to optionally resume the runtime, and one to end the event session). The dotnet-monitor tool makes connections to the target application every three seconds to determine liveliness, plus whatever operations you perform (listing processes, capturing artifacts, etc.). If the liveliness check fails, then dotnet-monitor gives up on monitoring the process (because if the diagnostic channel doesn't respond to a liveliness check, it isn't going to respond to a more complicated operation).

Previous reports suggest that the target application is no longer monitored after about 30 minutes; that would be 600 liveliness probes (30 minutes / 3 seconds per probe) before the failure to respond. That's a lot more socket connections than what dotnet-trace makes.
My current theory is that the target runtime doesn't always respond correctly when a diagnostic connection is terminated without a command (e.g. the liveliness probe), but I haven't been able to reproduce the issue yet.
If anyone can reproduce the problem and capture a dump of both dotnet-monitor and the target process, that would be very helpful.
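For anyone able to reproduce this, one way to capture the requested dumps is with the dotnet-dump global tool. This is a hedged sketch, not taken from this thread: the pids below are placeholders, and you'd substitute the real ids reported by `dotnet-dump ps` or the dotnet-monitor logs.

```
# Install the dotnet-dump global tool (one time).
dotnet tool install --global dotnet-dump

# List the .NET processes in the container to find both pids.
dotnet-dump ps

# Capture a full dump of each process; <monitor-pid> and <app-pid>
# are placeholders for the dotnet-monitor and target-app pids.
dotnet-dump collect -p <monitor-pid> -o /tmp/monitor.dmp
dotnet-dump collect -p <app-pid> -o /tmp/app.dmp
```

Note that collecting a dump of the target while it is in this broken state is exactly what's wanted here; dotnet-dump talks to the same diagnostic channel, so if `collect` against the target also hangs, that by itself is useful evidence.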
Hi,
We have dotnet-monitor set up on ECS Fargate, running in listen mode and collecting metrics every X. Our setup is a single dotnet-monitor sidecar inside each launched task, with many tasks being launched. It stops working for us on some tasks after a few hours with the following error:
Then all subsequent requests get this error:
Note that the main container itself keeps working just fine and is processing requests without any issues. Per the metrics captured before the error, I do not see any abnormal memory/CPU/etc. usage compared to the other tasks where dotnet-monitor keeps working.
Here is our ECS task definition (the dotnet-monitor config values are under 'Environment'):
And the Dockerfile used to customize the default dotnet-monitor container: