dart-lang / sdk

The Dart SDK, including the VM, JS and Wasm compilers, analysis, core libraries, and more.
https://dart.dev
BSD 3-Clause "New" or "Revised" License
10.28k stars 1.59k forks source link

Dart VM shutdown hangs #35192

Open mraleph opened 6 years ago

mraleph commented 6 years ago

From the stack traces it seems that we are waiting for service isolate to shutdown - but in fact there are no service isolate running.

FAILED: dartkp-dart_precompiled release_x64 standalone_2/io/stdout_stderr_non_blocking_test
Expected: Pass
Actual: Timeout

--- Command "vm_compile_to_kernel" (took 01.000867s):
DART_CONFIGURATION=ReleaseX64 /b/s/w/ir/pkg/vm/tool/gen_kernel --aot --platform=out/ReleaseX64/vm_platform_strong.dill -o /b/s/w/ir/out/ReleaseX64/generated_compilations/dartkp/tests_standalone_2_io_stdout_stderr_non_blocking_test/out.dill /b/s/w/ir/tests/standalone_2/io/stdout_stderr_non_blocking_test.dart --packages=/b/s/w/ir/.packages -Ddart.developer.causal_async_stacks=true

exit code:
0

--- Command "precompiler" (took 03.000341s):
DART_CONFIGURATION=ReleaseX64 out/ReleaseX64/gen_snapshot --snapshot-kind=app-aot-assembly --assembly=/b/s/w/ir/out/ReleaseX64/generated_compilations/dartkp/tests_standalone_2_io_stdout_stderr_non_blocking_test/out.S --obfuscate --sync-async --ignore-unrecognized-flags --packages=/b/s/w/ir/.packages /b/s/w/ir/out/ReleaseX64/generated_compilations/dartkp/tests_standalone_2_io_stdout_stderr_non_blocking_test/out.dill

exit code:
0

stderr:
Warning: This VM has been configured to obfuscate symbol information which violates the Dart standard.
         See dartbug.com/30524 for more information.

--- Command "remove_kernel_file" (took 12ms):
DART_CONFIGURATION=ReleaseX64 rm /b/s/w/ir/out/ReleaseX64/generated_compilations/dartkp/tests_standalone_2_io_stdout_stderr_non_blocking_test/out.dill

exit code:
0

--- Command "assemble" (took 01.000622s):
DART_CONFIGURATION=ReleaseX64 gcc -m64 -shared -nostdlib -o /b/s/w/ir/out/ReleaseX64/generated_compilations/dartkp/tests_standalone_2_io_stdout_stderr_non_blocking_test/out.aotsnapshot /b/s/w/ir/out/ReleaseX64/generated_compilations/dartkp/tests_standalone_2_io_stdout_stderr_non_blocking_test/out.S

exit code:
0

--- Command "remove_assembly" (took 15ms):
DART_CONFIGURATION=ReleaseX64 rm /b/s/w/ir/out/ReleaseX64/generated_compilations/dartkp/tests_standalone_2_io_stdout_stderr_non_blocking_test/out.S

exit code:
0

--- Command "vm" (took 02:00.000052s):
DART_CONFIGURATION=ReleaseX64 out/ReleaseX64/dart_precompiled_runtime --obfuscate --sync-async --ignore-unrecognized-flags --packages=/b/s/w/ir/.packages /b/s/w/ir/out/ReleaseX64/generated_compilations/dartkp/tests_standalone_2_io_stdout_stderr_non_blocking_test/out.aotsnapshot

exit code:
-15

diagnostics:
Process list including children: [10640]
Trying to capture stack trace for pid 10640
PID 10640 - process
TID 10640:
#0  0x00007f95ff566404 pthread_cond_wait@@GLIBC_2.3.2
#1  0x000055f0f82b3cc8 dart::Monitor::WaitMicros(long)
#2  0x000055f0f830d23a dart::ServiceIsolate::Shutdown()
#3  0x000055f0f81de6af dart::Dart::Cleanup()
#4  0x000055f0f7fe9557 dart::bin::main(int, char**)
#5  0x000055f0f7fe9bb9 main
#6  0x00007f95fe78af45 __libc_start_main
#7  0x000055f0f7fe8029 _start
TID 10641:
#0  0x00007f95fe8676d3 epoll_wait
#1  0x000055f0f7fef725 dart::bin::EventHandlerImplementation::Poll(unsigned long)
#2  0x000055f0f800eece dart::bin::ThreadStart(void*)
#3  0x00007f95ff562184 start_thread
#4  0x00007f95fe86703d __clone

stdout:
Hello
Hello
Hello
Hello
Hello
Hello

stderr:
Hello
Hello
Hello
Hello
Hello
Hello

--- Re-run this test:
python tools/test.py -m release -n dartkp-obfuscate-linux-release-x64 --write-logs standalone_2/io/stdout_stderr_non_blocking_test
mraleph commented 6 years ago

I am going to try implementing core dump generation and archiving for tests that time out.

mraleph commented 6 years ago

Based on stack traces I think this is the same bug that @mkustermann has fixed in 83ba3973683981286455869dd473439aa7a8d3ff - we are not propagating UnwindError correctly.

aam commented 5 years ago

This stdout_stderr_non_blocking_test test timed out on shutdown again( https://logs.chromium.org/logs/dart/buildbucket/cr-buildbucket.appspot.com/8928428248600968144/+/steps/vm_tests_shard_1/0/stdout ), so reopening.

mkustermann commented 5 years ago

@rmacnak-google seems to have found another place where we might've swallowed UnwindErrors

mkustermann commented 5 years ago

Even after @rmacnak-google 's change in 07f43510a2d2f26262c382ce7cb102fed99c539e we have seen another case just now. From log:

FAILED: dartkp-dart_precompiled release_simarm standalone_2/io/stdout_stderr_non_blocking_test
Expected: Pass
Actual: Timeout

--- Command "vm_compile_to_kernel" (took 01.000775s):
DART_CONFIGURATION=ReleaseSIMARM /b/s/w/ir/pkg/vm/tool/gen_kernel --aot --platform=out/ReleaseSIMARM/vm_platform_strong.dill -o /b/s/w/ir/out/ReleaseSIMARM/generated_compilations/dartkp/tests_standalone_2_io_stdout_stderr_non_blocking_test/out.dill /b/s/w/ir/tests/standalone_2/io/stdout_stderr_non_blocking_test.dart --packages=/b/s/w/ir/.packages -Ddart.developer.causal_async_stacks=true

exit code:
0

--- Command "precompiler" (took 02.000776s):
DART_CONFIGURATION=ReleaseSIMARM out/ReleaseSIMARM/gen_snapshot --snapshot-kind=app-aot-blobs --blobs_container_filename=/b/s/w/ir/out/ReleaseSIMARM/generated_compilations/dartkp/tests_standalone_2_io_stdout_stderr_non_blocking_test/out.aotsnapshot --sync-async --ignore-unrecognized-flags --packages=/b/s/w/ir/.packages /b/s/w/ir/out/ReleaseSIMARM/generated_compilations/dartkp/tests_standalone_2_io_stdout_stderr_non_blocking_test/out.dill

exit code:
0

--- Command "remove_kernel_file" (took 13ms):
DART_CONFIGURATION=ReleaseSIMARM rm /b/s/w/ir/out/ReleaseSIMARM/generated_compilations/dartkp/tests_standalone_2_io_stdout_stderr_non_blocking_test/out.dill

exit code:
0

--- Command "vm" (took 08:00.000096s):
DART_CONFIGURATION=ReleaseSIMARM out/ReleaseSIMARM/dart_precompiled_runtime --sync-async --ignore-unrecognized-flags --packages=/b/s/w/ir/.packages /b/s/w/ir/out/ReleaseSIMARM/generated_compilations/dartkp/tests_standalone_2_io_stdout_stderr_non_blocking_test/out.aotsnapshot

exit code:
-15

diagnostics:
Process list including children: [12883]
Trying to capture stack trace for pid 12883
PID 12883 - process
TID 12883:
#0  0xf775bcd9 __kernel_vsyscall
#1  0xf772cd0b pthread_cond_wait@@GLIBC_2.3.2
#2  0x0089dc65 dart::Monitor::WaitMicros(long long)
#3  0x0089dbb7 dart::Monitor::Wait(long long)
#4  0x008f0018 dart::ServiceIsolate::Shutdown()
#5  0x007e6553 dart::Dart::Cleanup()
#6  0x0094ddbe Dart_Cleanup
#7  0x006425cd dart::bin::main(int, char**)
#8  0x00642c84 main
#9  0xf753aaf3 __libc_start_main
#10 0x00641021 _start
TID 12884:
#0  0xf775bcd9 __kernel_vsyscall
#1  0xf760fee6 epoll_wait
#2  0x00648791 dart::bin::EventHandlerImplementation::Poll(unsigned int)
#3  0x00668e44 dart::bin::ThreadStart(void*)
#4  0xf7728f72 start_thread
#5  0xf760f43e __clone

eu-stack: dwfl_thread_getframes tid 12883 at 0x641020 in /b/s/w/ir/out/ReleaseSIMARM/dart_precompiled_runtime: no matching address range

stdout:
Hello
Hello
Hello
Hello
Hello
Hello

stderr:
Hello
Hello
Hello
Hello
Hello
Hello

--- Re-run this test:
python tools/test.py -n dartkp-linux-release-simarm standalone_2/io/stdout_stderr_non_blocking_test
dcharkes commented 5 years ago

Still happening: log, log, and log.

dcharkes commented 5 years ago

Again: log

Marking in status file.

sstrickl commented 5 years ago

Another timeout (log) that looks similar:

FAILED: app_jitk-vm release_x64 standalone_2/io/http_bind_test
Expected: Pass
Actual: Timeout

--- Command "app_jit" (took 01.000077s):
DART_CONFIGURATION=ReleaseX64 out/ReleaseX64/dart --snapshot=/b/s/w/ir/cache/builder/sdk/out/ReleaseX64/generated_compilations/app_jitk/tests_standalone_2_io_http_bind_test/out.jitsnapshot --snapshot-kind=app-jit --ignore-unrecognized-flags --packages=/b/s/w/ir/cache/builder/sdk/.packages /b/s/w/ir/cache/builder/sdk/tests/standalone_2/io/http_bind_test.dart

exit code:
0

stdout:
unittest-suite-wait-for-done
unittest-suite-success

--- Command "vm" (took 01:00.000380s):
DART_CONFIGURATION=ReleaseX64 out/ReleaseX64/dart --ignore-unrecognized-flags --packages=/b/s/w/ir/cache/builder/sdk/.packages /b/s/w/ir/cache/builder/sdk/out/ReleaseX64/generated_compilations/app_jitk/tests_standalone_2_io_http_bind_test/out.jitsnapshot

exit code:
-15

diagnostics:
Process list including children: [5328]
Trying to capture stack trace for pid 5328
PID 5328 - process
TID 5328:
#0  0x00007fe549543404 pthread_cond_wait@@GLIBC_2.3.2
#1  0x000056294a0cc158 dart::Monitor::WaitMicros(long)
#2  0x000056294a2d0f3a Dart_RunLoop
#3  0x0000562949db3afd dart::bin::RunMainIsolate(char const*, dart::bin::CommandLineOptions*)
#4  0x0000562949db4807 dart::bin::main(int, char**)
#5  0x0000562949db5119 main
#6  0x00007fe548c81f45 __libc_start_main
#7  0x0000562949db3029 _start
TID 5329:
#0  0x00007fe548d5e6d3 epoll_wait
#1  0x0000562949dbc095 dart::bin::EventHandlerImplementation::Poll(unsigned long)
#2  0x0000562949ddb6de dart::bin::ThreadStart(void*)
#3  0x00007fe54953f184 start_thread
#4  0x00007fe548d5e03d __clone
TID 5344:
#0  0x00007fe549543404 pthread_cond_wait@@GLIBC_2.3.2
#1  0x000056294a0cc158 dart::Monitor::WaitMicros(long)
#2  0x000056294a29ccaa dart::BackgroundCompiler::Run()
#3  0x000056294a1654f7 dart::ThreadPool::Worker::Loop()
#4  0x000056294a165383 dart::ThreadPool::Worker::Main(unsigned long)
#5  0x000056294a0cba49 dart::ThreadStart(void*)
#6  0x00007fe54953f184 start_thread
#7  0x00007fe548d5e03d __clone

stdout:
unittest-suite-wait-for-done

--- Re-run this test:
python tools/test.py -n app_jitk-linux-release-x64 standalone_2/io/http_bind_test
sstrickl commented 5 years ago

And another log from today for io/http_bind_test, marking it as flaky now.

aam commented 5 years ago

Another failure

dcharkes commented 5 years ago

Another failure log

From the flakiness dashboard:

name configuration outcomes expected current run timeout pass          
standalone_2/io/http_bind_test dartk-optcounter-linux-release-ia32 Pass,Timeout Pass 64 1 64          
standalone_2/io/http_bind_test dartkb-compile-linux-release-x64 Pass,Timeout Pass 31 1 31          
standalone_2/io/http_bind_test dartkp-linux-debug-x64 Pass,Timeout Pass 54 1 54          
standalone_2/io/stdout_stderr_non_blocking_test dartkp-linux-release-simarm Pass,Timeout Pass 22 8 273          
standalone_2/io/stdout_stderr_non_blocking_test dartkp-linux-release-x64 Pass,Timeout Pass 86 1 86          
dcharkes commented 5 years ago

standalone_2/io/http_bind_test also failed in configuration dartk-linux-release-simarm log

Flakyness dashboard filtered on name=standalone_2/io and outcomes=Pass,Timeout:

Last New Result Seen Name Configuration Outcomes Current Status Current Run Pass CTE RTE fail crash TO
2019-02-08 01:23:35 standalone_2/io/http_bind_test dartk-linux-release-simarm Pass,Timeout Pass 9 9         1  
2019-01-31 18:16:52 standalone_2/io/http_read_test dartkp-android-release-arm Pass,Timeout Pass 99 99         1  
2019-02-04 16:00:26 standalone_2/io/process_detached_test dartk-asan-linux-release-x64 Pass,Timeout Pass 62 62         1  
2019-02-08 01:48:47 standalone_2/io/socket_connect_stream_data_close_cancel_test dartk-mac-debug-x64 Pass,Timeout Pass 24 27         1  
2019-02-07 11:27:56 standalone_2/io/socket_connect_stream_data_close_cancel_test dartkp-mac-release-simarm64 Pass,Timeout Pass 64 67         1  
2019-02-04 10:49:49 standalone_2/io/stdout_stderr_non_blocking_test dartkp-no-bare-linux-release-simarm Pass,Timeout Pass 4 83         2  
2019-01-30 18:15:13 standalone_2/io/stdout_stderr_non_blocking_test dartkp-linux-release-simarm Pass,Timeout Pass 73 138         5  
2019-01-29 06:02:36 standalone_2/io/stdout_stderr_non_blocking_test dartkp-linux-release-simarm64 Pass,Timeout Pass 99 99         1  
  standalone_2/io/stdout_stderr_non_blocking_test dartkp-bare-linux-release-simarm Pass,Timeout Pass 22 199         5  
  standalone_2/io/stdout_stderr_non_blocking_test dartkp-bare-linux-release-simarm64 Pass,Timeout Pass 89 89