Open mraleph opened 6 years ago
I am going to try implementing core dump generation and archiving for tests that time out.
Based on the stack traces, I think this is the same bug that @mkustermann fixed in 83ba3973683981286455869dd473439aa7a8d3ff: we are not propagating UnwindError correctly.
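To illustrate the failure mode in general terms, here is a minimal sketch of how an intermediate layer that swallows an unwind request can keep a message loop running. It is not the VM's code: the `UnwindError` class is a Python stand-in, and `handle_swallowing`, `handle_propagating`, `run_loop`, `noop` and `request_kill` are hypothetical names made up for this example.

```python
class UnwindError(Exception):
    """Stand-in for an unwind request: the running isolate must stop."""


def noop():
    """A message that completes normally."""


def request_kill():
    """A message that asks the isolate to unwind."""
    raise UnwindError()


def handle_swallowing(message):
    """Buggy layer: treats every failure as an ordinary, recoverable error."""
    try:
        message()
    except Exception:
        return "error"  # the unwind request is lost here
    return "ok"


def handle_propagating(message):
    """Fixed layer: lets the unwind request travel up to the loop."""
    try:
        message()
    except UnwindError:
        raise
    except Exception:
        return "error"
    return "ok"


def run_loop(messages, handle):
    """Process messages until the queue drains or an unwind request arrives."""
    processed = 0
    for message in messages:
        try:
            handle(message)
        except UnwindError:
            return processed, "unwound"
        processed += 1
    return processed, "drained"


if __name__ == "__main__":
    queue = [noop, request_kill, noop]
    print(run_loop(queue, handle_swallowing))
    print(run_loop(queue, handle_propagating))
```

With the swallowing handler the loop never learns that it was asked to stop, so anything waiting for it to exit depends on the queue draining on its own.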
The stdout_stderr_non_blocking_test test timed out on shutdown again (https://logs.chromium.org/logs/dart/buildbucket/cr-buildbucket.appspot.com/8928428248600968144/+/steps/vm_tests_shard_1/0/stdout), so reopening.
@rmacnak-google seems to have found another place where we might have swallowed UnwindErrors.
Even after @rmacnak-google's change in 07f43510a2d2f26262c382ce7cb102fed99c539e we have just seen another case. From the log:
FAILED: dartkp-dart_precompiled release_simarm standalone_2/io/stdout_stderr_non_blocking_test
Expected: Pass
Actual: Timeout
--- Command "vm_compile_to_kernel" (took 01.000775s):
DART_CONFIGURATION=ReleaseSIMARM /b/s/w/ir/pkg/vm/tool/gen_kernel --aot --platform=out/ReleaseSIMARM/vm_platform_strong.dill -o /b/s/w/ir/out/ReleaseSIMARM/generated_compilations/dartkp/tests_standalone_2_io_stdout_stderr_non_blocking_test/out.dill /b/s/w/ir/tests/standalone_2/io/stdout_stderr_non_blocking_test.dart --packages=/b/s/w/ir/.packages -Ddart.developer.causal_async_stacks=true
exit code:
0
--- Command "precompiler" (took 02.000776s):
DART_CONFIGURATION=ReleaseSIMARM out/ReleaseSIMARM/gen_snapshot --snapshot-kind=app-aot-blobs --blobs_container_filename=/b/s/w/ir/out/ReleaseSIMARM/generated_compilations/dartkp/tests_standalone_2_io_stdout_stderr_non_blocking_test/out.aotsnapshot --sync-async --ignore-unrecognized-flags --packages=/b/s/w/ir/.packages /b/s/w/ir/out/ReleaseSIMARM/generated_compilations/dartkp/tests_standalone_2_io_stdout_stderr_non_blocking_test/out.dill
exit code:
0
--- Command "remove_kernel_file" (took 13ms):
DART_CONFIGURATION=ReleaseSIMARM rm /b/s/w/ir/out/ReleaseSIMARM/generated_compilations/dartkp/tests_standalone_2_io_stdout_stderr_non_blocking_test/out.dill
exit code:
0
--- Command "vm" (took 08:00.000096s):
DART_CONFIGURATION=ReleaseSIMARM out/ReleaseSIMARM/dart_precompiled_runtime --sync-async --ignore-unrecognized-flags --packages=/b/s/w/ir/.packages /b/s/w/ir/out/ReleaseSIMARM/generated_compilations/dartkp/tests_standalone_2_io_stdout_stderr_non_blocking_test/out.aotsnapshot
exit code:
-15
diagnostics:
Process list including children: [12883]
Trying to capture stack trace for pid 12883
PID 12883 - process
TID 12883:
#0 0xf775bcd9 __kernel_vsyscall
#1 0xf772cd0b pthread_cond_wait@@GLIBC_2.3.2
#2 0x0089dc65 dart::Monitor::WaitMicros(long long)
#3 0x0089dbb7 dart::Monitor::Wait(long long)
#4 0x008f0018 dart::ServiceIsolate::Shutdown()
#5 0x007e6553 dart::Dart::Cleanup()
#6 0x0094ddbe Dart_Cleanup
#7 0x006425cd dart::bin::main(int, char**)
#8 0x00642c84 main
#9 0xf753aaf3 __libc_start_main
#10 0x00641021 _start
TID 12884:
#0 0xf775bcd9 __kernel_vsyscall
#1 0xf760fee6 epoll_wait
#2 0x00648791 dart::bin::EventHandlerImplementation::Poll(unsigned int)
#3 0x00668e44 dart::bin::ThreadStart(void*)
#4 0xf7728f72 start_thread
#5 0xf760f43e __clone
eu-stack: dwfl_thread_getframes tid 12883 at 0x641020 in /b/s/w/ir/out/ReleaseSIMARM/dart_precompiled_runtime: no matching address range
stdout:
Hello
Hello
Hello
Hello
Hello
Hello
stderr:
Hello
Hello
Hello
Hello
Hello
Hello
--- Re-run this test:
python tools/test.py -n dartkp-linux-release-simarm standalone_2/io/stdout_stderr_non_blocking_test
Another timeout (log) that looks similar:
FAILED: app_jitk-vm release_x64 standalone_2/io/http_bind_test
Expected: Pass
Actual: Timeout
--- Command "app_jit" (took 01.000077s):
DART_CONFIGURATION=ReleaseX64 out/ReleaseX64/dart --snapshot=/b/s/w/ir/cache/builder/sdk/out/ReleaseX64/generated_compilations/app_jitk/tests_standalone_2_io_http_bind_test/out.jitsnapshot --snapshot-kind=app-jit --ignore-unrecognized-flags --packages=/b/s/w/ir/cache/builder/sdk/.packages /b/s/w/ir/cache/builder/sdk/tests/standalone_2/io/http_bind_test.dart
exit code:
0
stdout:
unittest-suite-wait-for-done
unittest-suite-success
--- Command "vm" (took 01:00.000380s):
DART_CONFIGURATION=ReleaseX64 out/ReleaseX64/dart --ignore-unrecognized-flags --packages=/b/s/w/ir/cache/builder/sdk/.packages /b/s/w/ir/cache/builder/sdk/out/ReleaseX64/generated_compilations/app_jitk/tests_standalone_2_io_http_bind_test/out.jitsnapshot
exit code:
-15
diagnostics:
Process list including children: [5328]
Trying to capture stack trace for pid 5328
PID 5328 - process
TID 5328:
#0 0x00007fe549543404 pthread_cond_wait@@GLIBC_2.3.2
#1 0x000056294a0cc158 dart::Monitor::WaitMicros(long)
#2 0x000056294a2d0f3a Dart_RunLoop
#3 0x0000562949db3afd dart::bin::RunMainIsolate(char const*, dart::bin::CommandLineOptions*)
#4 0x0000562949db4807 dart::bin::main(int, char**)
#5 0x0000562949db5119 main
#6 0x00007fe548c81f45 __libc_start_main
#7 0x0000562949db3029 _start
TID 5329:
#0 0x00007fe548d5e6d3 epoll_wait
#1 0x0000562949dbc095 dart::bin::EventHandlerImplementation::Poll(unsigned long)
#2 0x0000562949ddb6de dart::bin::ThreadStart(void*)
#3 0x00007fe54953f184 start_thread
#4 0x00007fe548d5e03d __clone
TID 5344:
#0 0x00007fe549543404 pthread_cond_wait@@GLIBC_2.3.2
#1 0x000056294a0cc158 dart::Monitor::WaitMicros(long)
#2 0x000056294a29ccaa dart::BackgroundCompiler::Run()
#3 0x000056294a1654f7 dart::ThreadPool::Worker::Loop()
#4 0x000056294a165383 dart::ThreadPool::Worker::Main(unsigned long)
#5 0x000056294a0cba49 dart::ThreadStart(void*)
#6 0x00007fe54953f184 start_thread
#7 0x00007fe548d5e03d __clone
stdout:
unittest-suite-wait-for-done
--- Re-run this test:
python tools/test.py -n app_jitk-linux-release-x64 standalone_2/io/http_bind_test
And another log from today for io/http_bind_test, marking it as flaky now.
Another failure log
From the flakiness dashboard:
| name | configuration | outcomes | expected | current run | timeout | pass |
|---|---|---|---|---|---|---|
| standalone_2/io/http_bind_test | dartk-optcounter-linux-release-ia32 | Pass,Timeout | Pass | 64 | 1 | 64 |
| standalone_2/io/http_bind_test | dartkb-compile-linux-release-x64 | Pass,Timeout | Pass | 31 | 1 | 31 |
| standalone_2/io/http_bind_test | dartkp-linux-debug-x64 | Pass,Timeout | Pass | 54 | 1 | 54 |
| standalone_2/io/stdout_stderr_non_blocking_test | dartkp-linux-release-simarm | Pass,Timeout | Pass | 22 | 8 | 273 |
| standalone_2/io/stdout_stderr_non_blocking_test | dartkp-linux-release-x64 | Pass,Timeout | Pass | 86 | 1 | 86 |
standalone_2/io/http_bind_test also failed in configuration dartk-linux-release-simarm (log).
Flakiness dashboard filtered on name=standalone_2/io and outcomes=Pass,Timeout:
| Last New Result Seen | Name | Configuration | Outcomes | Current Status | Current Run | Pass | CTE | RTE | fail | crash | TO |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 2019-02-08 01:23:35 | standalone_2/io/http_bind_test | dartk-linux-release-simarm | Pass,Timeout | Pass | 9 | 9 | | | | | 1 |
| 2019-01-31 18:16:52 | standalone_2/io/http_read_test | dartkp-android-release-arm | Pass,Timeout | Pass | 99 | 99 | | | | | 1 |
| 2019-02-04 16:00:26 | standalone_2/io/process_detached_test | dartk-asan-linux-release-x64 | Pass,Timeout | Pass | 62 | 62 | | | | | 1 |
| 2019-02-08 01:48:47 | standalone_2/io/socket_connect_stream_data_close_cancel_test | dartk-mac-debug-x64 | Pass,Timeout | Pass | 24 | 27 | | | | | 1 |
| 2019-02-07 11:27:56 | standalone_2/io/socket_connect_stream_data_close_cancel_test | dartkp-mac-release-simarm64 | Pass,Timeout | Pass | 64 | 67 | | | | | 1 |
| 2019-02-04 10:49:49 | standalone_2/io/stdout_stderr_non_blocking_test | dartkp-no-bare-linux-release-simarm | Pass,Timeout | Pass | 4 | 83 | | | | | 2 |
| 2019-01-30 18:15:13 | standalone_2/io/stdout_stderr_non_blocking_test | dartkp-linux-release-simarm | Pass,Timeout | Pass | 73 | 138 | | | | | 5 |
| 2019-01-29 06:02:36 | standalone_2/io/stdout_stderr_non_blocking_test | dartkp-linux-release-simarm64 | Pass,Timeout | Pass | 99 | 99 | | | | | 1 |
| | standalone_2/io/stdout_stderr_non_blocking_test | dartkp-bare-linux-release-simarm | Pass,Timeout | Pass | 22 | 199 | | | | | 5 |
| | standalone_2/io/stdout_stderr_non_blocking_test | dartkp-bare-linux-release-simarm64 | Pass,Timeout | Pass | 89 | 89 | | | | | |
From the stack traces it seems that we are waiting for the service isolate to shut down, but in fact there is no service isolate running.
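One generic way a shutdown path ends up in this state is waiting for an exit notification that was already delivered, or that will never be sent. The sketch below shows the usual defensive pattern: check the "is it running" predicate under the monitor before blocking. It is an illustration only, not the VM's `ServiceIsolate::Shutdown()`; the class `ServiceIsolateState` and its methods are hypothetical names.

```python
import threading


class ServiceIsolateState:
    """Hypothetical tracker for whether a service isolate is alive."""

    def __init__(self):
        self._monitor = threading.Condition()
        self._running = False

    def mark_started(self):
        with self._monitor:
            self._running = True

    def mark_exited(self):
        with self._monitor:
            self._running = False
            # Wake up anyone blocked in shutdown().
            self._monitor.notify_all()

    def shutdown(self, timeout=None):
        """Return True once no service isolate is running, False on timeout."""
        with self._monitor:
            # wait_for evaluates the predicate before blocking, so a shutdown
            # that arrives after the isolate exited (or never started) returns
            # immediately instead of waiting for a notification that will
            # never come.
            return self._monitor.wait_for(lambda: not self._running, timeout)


if __name__ == "__main__":
    state = ServiceIsolateState()
    # Nothing was ever started, so this does not block.
    print(state.shutdown(timeout=1.0))
```

A bare `wait()` with no predicate check would block in the first case, which matches the shape of the main-thread stack in the simarm log above, although the actual cause in the VM may be different.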