dotnet / crank

Benchmarking infrastructure for applications

Agent failure when collecting timeout dumps #652

Open sebastienros opened 10 months ago

sebastienros commented 10 months ago
[15:56:20 INF] Processing job 'application' (585) in state Failed
[15:56:20 INF] Driver stopping job '585'
[15:56:20 INF] Job failed
[15:56:20 INF] Failed -> Failed (application:585)
[15:56:21 INF] Processing job 'application' (585) in state Failed
[15:56:21 INF] Job failed
[15:56:21 INF] Failed -> Failed (application:585)
[15:56:22 INF] Driver fetching published application '585'
[15:56:22 INF] Processing job 'application' (585) in state Failed
[15:56:22 INF] Job failed
[15:56:22 INF] Failed -> Failed (application:585)
[15:56:23 INF] Processing job 'application' (585) in state Failed
[15:56:23 INF] Job failed
[15:56:23 INF] Failed -> Failed (application:585)
[15:56:24 INF] Processing job 'application' (585) in state Failed
[15:56:24 INF] Job failed
[15:56:24 INF] Failed -> Failed (application:585)
[15:56:25 INF] Processing job 'application' (585) in state Failed
[15:56:25 INF] Job failed
[15:56:25 INF] Failed -> Failed (application:585)
[15:56:26 INF] Processing job 'application' (585) in state Failed
[15:56:26 INF] Job failed
[15:56:26 INF] Failed -> Failed (application:585)
[15:56:27 INF] Processing job 'application' (585) in state Failed
[15:56:27 INF] Job failed
[15:56:27 INF] Failed -> Failed (application:585)
[15:56:28 INF] Processing job 'application' (585) in state Failed
[15:56:28 INF] Job failed
[15:56:28 INF] Failed -> Failed (application:585)
[15:56:29 INF] Processing job 'application' (585) in state Failed
[15:56:29 INF] Job failed
[15:56:29 INF] Failed -> Failed (application:585)
[15:56:30 INF] Processing job 'application' (585) in state Failed
[15:56:30 INF] Job failed
[15:56:30 INF] Failed -> Failed (application:585)
[15:56:31 INF] Processing job 'application' (585) in state Failed
[15:56:31 INF] Job failed
[15:56:31 INF] Failed -> Failed (application:585)
[15:56:31 INF] Download requested: 'dump'
[15:56:31 INF] Driver deleting job '585'
[15:56:32 INF] Processing job 'application' (585) in state Deleting
[15:56:32 INF] Deleting job 'application' (585)
[15:56:32 INF] Collecting dump (application:585)
[15:56:32 INF] Writing full to C:\Users\Administrator\AppData\Local\Temp\tmp7E71.tmp
[15:56:32 INF] Deleting directory 'C:\Users\Administrator\AppData\Local\Temp\benchmarks-agent\benchmarks-server-7860\tsj1umix.lgv'
[15:56:32 INF] Job failed
[15:56:33 INF] Deleting -> Deleted (application:585)
[15:56:33 ERR] Unexpected error
System.ArgumentException: Process with an Id of 4824 is not running.
   at System.Diagnostics.Process.GetProcessById(Int32 processId, String machineName)
   at Microsoft.Crank.Agent.Dumper.Collect(Int32 processId, String outputFilePath, DumpTypeOption type) in C:\code\crank\src\Microsoft.Crank.Agent\Dumper.cs:line 49
   at Microsoft.Crank.Agent.Startup.<>c__DisplayClass81_2.<<ProcessJobs>g__StopJobAsync|6>d.MoveNext() in C:\code\crank\src\Microsoft.Crank.Agent\Startup.cs:line 1709
--- End of stack trace from previous location ---
   at Microsoft.Crank.Agent.Startup.<>c__DisplayClass81_2.<<ProcessJobs>g__DeleteJobAsync|7>d.MoveNext() in C:\code\crank\src\Microsoft.Crank.Agent\Startup.cs:line 1745
--- End of stack trace from previous location ---
   at Microsoft.Crank.Agent.Startup.<>c__DisplayClass81_2.<<ProcessJobs>g__DeleteJobAsync|7>d.MoveNext() in C:\code\crank\src\Microsoft.Crank.Agent\Startup.cs:line 1745
--- End of stack trace from previous location ---
   at Microsoft.Crank.Agent.Startup.ProcessJobs(String hostname, String dockerHostname, CancellationToken cancellationToken) in C:\code\crank\src\Microsoft.Crank.Agent\Startup.cs:line 1776
[15:56:33 INF] Cancelling remaining jobs
[15:56:33 INF] Cleaning up temporary folder...
[15:56:33 INF] [C:\Users\Administrator\AppData\Local\Temp\benchmarks-agent\benchmarks-server-7860\v4mng0fi.yf3] C:\Users\Administrator\AppData\Local\Temp\benchmarks-agent\benchmarks-server-7860\v4mng0fi.yf3\dotnet.exe build-server shutdown
[15:56:33 INF] Shutting down MSBuild server...
[15:56:33 INF] Shutting down VB/C# compiler server...
[15:56:33 INF] VB/C# compiler server shut down successfully.
[15:56:33 INF] Job failed
[15:56:33 INF] Deleted -> Failed (application:585)
[15:56:34 INF] MSBuild server shut down successfully.
[15:56:34 INF] Deleting directory 'C:\Users\Administrator\AppData\Local\Temp\benchmarks-agent\benchmarks-server-7860'
[15:56:34 INF] Job failed
[15:56:34 INF] Failed -> Failed (application:585)
[15:56:35 INF] Job failed
[15:56:35 INF] Failed -> Failed (application:585)
[15:56:36 INF] Job failed
[15:56:36 INF] Failed -> Failed (application:585)
[15:56:37 INF] Job failed
[15:56:37 INF] Failed -> Failed (application:585)
[15:56:38 INF] Job failed
[15:56:38 INF] Failed -> Failed (application:585)
[15:56:39 INF] Error, retrying ...
[15:56:39 INF] Error, retrying ...
[15:56:39 INF] Job failed
[15:56:39 INF] Failed -> Failed (application:585)
[15:56:40 INF] Error, retrying ...
[15:56:40 INF] Error, retrying ...
[15:56:40 INF] Job failed
[15:56:40 INF] Failed -> Failed (application:585)

/cc @mrsharm Was that you?

mrsharm commented 10 months ago

Yes, I was trying to test the new functionality for collecting a dump on a hang. What did I do wrong now?

sebastienros commented 10 months ago

I'm not saying you did anything wrong. I just wanted to know whether that was the trigger, so I can try to reproduce the issue and fix it.
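
For reference, the stack trace above shows `Process.GetProcessById` throwing `ArgumentException` inside `Dumper.Collect` because process 4824 was no longer running when the dump was requested. Below is a minimal sketch of one way to guard against that, not the actual fix. `ProcessLookup` and `TryGetRunningProcess` are hypothetical names that do not exist in Crank, and how the agent should react when the process is gone is left open here.

```csharp
using System;
using System.Diagnostics;

public static class ProcessLookup
{
    // Hypothetical helper (not part of Crank): reports whether a process with
    // the given id is currently running instead of letting the lookup throw.
    public static bool TryGetRunningProcess(int processId, out Process process)
    {
        try
        {
            // Process.GetProcessById throws ArgumentException when no running
            // process matches the id, which is the error seen in the log above.
            process = Process.GetProcessById(processId);
            return true;
        }
        catch (ArgumentException)
        {
            process = null;
            return false;
        }
    }
}
```

A caller could log a warning and skip the dump when this returns `false`, so the job-processing loop is not cancelled. The process can still exit between the lookup and the dump itself, so this would narrow the window without closing it.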