Closed by mraleph 9 years ago
Similar failure, this time on debug_simarm, and in a simpler test (but lots of GC activity):
FAILED: none-vm-checked debug_simarm standalone/verified_mem_test Expected: Pass Actual: Crash CommandOutput[vm]:
stderr: Verifying before marking... done. Verifying before sweeping... done. Verifying before marking... done. Verifying before sweeping... done. Verifying before marking... done. Verifying before sweeping... done. runtime/vm/raw_object.cc:210: error: expected: (instance_size == SizeTag::decode(tags)) || (SizeTag::decode(tags) == 0)
Command[vm]: out/DebugSIMARM/dart --verified_mem --verify_before_gc --verify_after_gc --old_gen_growth_rate=1 --ignore-unrecognized-flags --enable_asserts --enable_type_checks --package-root=out/DebugSIMARM/packages/ /mnt/data/b/build/slave/vm-arm-sim-debug-be/build/dart/tests/standalone/verified_mem_test.dart Took 0:00:01.104000
Short reproduction command (experimental): python tools/test.py -asimarm --write-debug-log --write-test-outcome-log --copy-coredumps --exclude-suite pkg --checked -t480 standalone/verified_mem_test
cc @iposva-google. Changed the title to: "Heap corruption crashes".
Another one. Seems to affect all platforms and archs.
FAILED: none-vm-checked debug_ia32 language/disassemble_test Expected: Pass Actual: Crash CommandOutput[vm]:
stdout: Code for function 'dart:builtin::__getPrintClosure@221999692' { 01122700 bf79165303 mov edi,0x3531679 'Function '_getPrintClosure@221999692': static.' 01122705 ff472f inc [edi+0x2f] 01122708 817f2f28230000 cmp [edi+0x2f],0x2328 0112270F 0f8d6bf8ffff jnl 0x1121f80 [stub: OptimizeFunction] ... 03F713CC 58 pop eax 03F713CD 8945ec mov [ebp-0x14],eax 03F713D0 50 push eax ;; GuardFieldClass:14(_used@709387912 <not-nullable _Smi@915557746>, t1) 03F713D1 58 pop eax
stderr: E:\b\build\slave\vm-win32-debug-russian-be\build\dart\runtime\vm/class_table.h:132: error: expected: IsValidIndex(index)
Command[vm]: build\DebugIA32\dart.exe --disassemble --ignore-unrecognized-flags --enable_asserts --enable_type_checks --package-root=build/DebugIA32/packages/ E:\b\build\slave\vm-win32-debug-russian-be\build\dart\tests\language\disassemble_test.dart Took 0:00:14.337000
Short reproduction command (experimental): python tools/test.py --write-debug-log --write-test-outcome-log --copy-coredumps --exclude-suite pkg --checked -t120 language/disassemble_test
Just observed this on my Linux desktop when synced at r43479:
$ python tools/test.py --report --time --mode=release --arch=simmips,simarm --compiler=none --runtime=vm --failure-summary --write-debug-log --write-test-outcome-log --copy-coredumps --exclude-suite=pkg --checked
Test configurations: none_vm_release_simmips_checked none_vm_release_simarm_checked
[00:19 | --% | + 1743 | - 0]
Total: 31892 tests
7351 tests will be skipped (7002 skipped by design)
18 tests are expected to be flaky but not crash
2 tests are expected to flaky crash
24217 tests are expected to pass
60 tests are expected to fail that we won't fix
234 tests are expected to fail that we should fix
10 tests are expected to crash that we should fix
0 tests are allowed to timeout
0 tests are skipped on browsers due to compile-time error
0 could not be categorized or are in multiple categories
[05:42 | 91% | +22537 | - 0] FAILED: none-vm-checked release_simarm language/cast_test/10 Expected: Pass Actual: Crash Runtime error expected. CommandOutput[vm]:
Command[vm]: out/ReleaseSIMARM/dart --ignore-unrecognized-flags --enable_asserts --enable_type_checks --package-root=out/ReleaseSIMARM/packages/ /usr/local/google/home/koda/flake/dart/out/ReleaseSIMMIPS/generated_tests/language/cast_test_10.dart Took 0:00:00.246000
Short reproduction command (experimental): python tools/test.py -mrelease -asimarm --write-debug-log --write-test-outcome-log --copy-coredumps --exclude-suite pkg --checked -t240 language/cast_test/10
[07:12 | 100% | +24540 | - 1] === Failure summary:
FAILED: none-vm-checked release_simarm language/cast_test/10 Expected: Pass Actual: Crash Runtime error expected. CommandOutput[vm]:
Command[vm]: out/ReleaseSIMARM/dart --ignore-unrecognized-flags --enable_asserts --enable_type_checks --package-root=out/ReleaseSIMARM/packages/ /usr/local/google/home/koda/flake/dart/out/ReleaseSIMMIPS/generated_tests/language/cast_test_10.dart Took 0:00:00.246000
Short reproduction command (experimental): python tools/test.py -mrelease -asimarm --write-debug-log --write-test-outcome-log --copy-coredumps --exclude-suite pkg --checked -t240 language/cast_test/10
=== === 1 test failed ===
[07:12 | 100% | +24540 | - 1]
--- Total time: 07:12 ---
0:02:15.408000 - vm - none-vm-checked release_simmips/lib/convert/utf85_test
0:01:43.601000 - vm - none-vm-checked release_simarm/corelib/int_parse_radix_test/02
0:01:41.837000 - vm - none-vm-checked release_simarm/corelib/int_parse_radix_test/none
0:01:39.887000 - vm - none-vm-checked release_simarm/corelib/int_parse_radix_test/01
0:01:17.126000 - vm - none-vm-checked release_simarm/corelib/big_integer_parsed_mul_div_vm_test
0:01:14.725000 - vm - none-vm-checked release_simmips/corelib/int_parse_radix_test/02
0:01:13.083000 - vm - none-vm-checked release_simmips/corelib/int_parse_radix_test/01
0:01:12.995000 - vm - none-vm-checked release_simmips/corelib/int_parse_radix_test/none
0:01:07.266000 - vm - none-vm-checked release_simmips/lib/convert/chunked_conversion_utf88_test
0:01:01.411000 - vm - none-vm-checked release_simarm/co19/LibTest/core/Uri/encodeQueryComponent_A01_t02
0:00:59.537000 - vm - none-vm-checked release_simmips/corelib/big_integer_parsed_mul_div_vm_test
0:00:58.780000 - vm - none-vm-checked release_simarm/lib/convert/streamed_conversion_json_utf8_decode_test
0:00:57.485000 - vm - none-vm-checked release_simarm/lib/convert/streamed_conversion_json_utf8_decode_test
0:00:49.642000 - vm - none-vm-checked release_simarm/lib/mirrors/mirrors_reader_test
0:00:46.495000 - vm - none-vm-checked release_simarm/lib/convert/json_utf8_chunk_test
0:00:45.372000 - vm - none-vm-checked release_simmips/co19/LibTest/core/Uri/encodeQueryComponent_A01_t02
0:00:44.588000 - vm - none-vm-checked release_simmips/lib/convert/streamed_conversion_json_utf8_decode_test
0:00:43.513000 - vm - none-vm-checked release_simarm/corelib/collection_length_test
0:00:41.875000 - vm - none-vm-checked release_simmips/lib/convert/streamed_conversion_json_utf8_decode_test
0:00:38.932000 - vm - none-vm-checked release_simarm/isolate/mandel_isolate_test
I produced a core dump and am debugging it now. This particular crash happens during compilation, but it looks like random corruption of new'ed memory (a Redirection object, in this case).
Core was generated by `out/ReleaseSIMMIPS/dart --ignore-unrecognized-flags --enable_asserts --enable_t'. Program terminated with signal SIGSEGV, Segmentation fault.
767 if (current->external_function_ == external_function) return current;
(gdb) bt
at runtime/vm/simulator_mips.cc:817
at runtime/vm/runtime_entry_mips.cc:41
at runtime/vm/compiler.cc:696
at runtime/vm/compiler.cc:973
parameter1=parameter1@entry=-170916512, parameter2=parameter2@entry=-170916520, parameter3=parameter3@entry=0, fp_return=fp_return@entry=false, fp_args=fp_args@entry=false) at runtime/vm/simulator_mips.cc:2407
at runtime/vm/message_handler.cc:160
(gdb) t a a bt
Thread 5 (Thread 0xf6b73b40 (LWP 14785)):
Thread 4 (Thread 0xf69f1b40 (LWP 14803)):
Thread 3 (Thread 0xf7432b40 (LWP 14784)):
Thread 2 (Thread 0xf7434700 (LWP 14780)):
parameter1=parameter1@entry=150487004, parameter2=parameter2@entry=150486860, parameter3=parameter3@entry=0, fp_return=fp_return@entry=false, fp_args=fp_args@entry=false) at runtime/vm/simulator_mips.cc:2407
is_service_isolate=is_service_isolate@entry=false, package_root=package_root@entry=0xffa795ab "out/ReleaseSIMMIPS/packages/") at runtime/bin/dartutils.cc:659
is_service_isolate=is_service_isolate@entry=false, builtin_lib=builtin_lib@entry=0x8f83a28) at runtime/bin/dartutils.cc:734
script_uri=script_uri@entry=0xffa795c8 "/usr/local/google/home/koda/flake/dart/tests/co19/src/Language/12_Expressions/24_Shift_A01_t13.dart", main=main@entry=0x8a37a8a "main", package_root=0xffa795ab "out/ReleaseSIMMIPS/packages/", error=error@entry=0xffa78880, is_compile_error=is_compile_error@entry=0xffa78870) at runtime/bin/main.cc:604
Thread 1 (Thread 0xf653db40 (LWP 14826)):
at runtime/vm/simulator_mips.cc:817
at runtime/vm/runtime_entry_mips.cc:41
at runtime/vm/compiler.cc:696
at runtime/vm/compiler.cc:973
parameter1=parameter1@entry=-170916512, parameter2=parameter2@entry=-170916520, parameter3=parameter3@entry=0, fp_return=fp_return@entry=false, fp_args=fp_args@entry=false) at runtime/vm/simulator_mips.cc:2407
at runtime/vm/message_handler.cc:160
I now have several cases on DebugSIMARM and DebugSIMMIPS where the Redirection linked list is corrupted, with a next_ pointer being either 0xabababab or 0x42424242, the patterns for uninitialized and deleted zone memory, respectively.
In all those cases, only the next_ pointer is clobbered, and the pattern is surrounded by normal-looking values.
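For anyone else triaging core dumps from these bots, here is a minimal sketch of classifying a suspicious 32-bit word against the two fill patterns mentioned above. The pattern values come from this comment; the names `PoisonKind` and `ClassifyWord` are made up for illustration and are not VM APIs.

```cpp
#include <cstdint>

// Hypothetical helper for reading core dumps; not part of the Dart VM.
enum class PoisonKind { kNone, kUninitializedZone, kDeletedZone };

// Fill patterns as reported in this thread for zone memory.
constexpr uint32_t kUninitializedZonePattern = 0xabababab;
constexpr uint32_t kDeletedZonePattern = 0x42424242;

// Returns which fill pattern (if any) a 32-bit word matches exactly.
constexpr PoisonKind ClassifyWord(uint32_t word) {
  return word == kUninitializedZonePattern
             ? PoisonKind::kUninitializedZone
             : (word == kDeletedZonePattern ? PoisonKind::kDeletedZone
                                            : PoisonKind::kNone);
}
```

A pointer field that classifies as anything other than `kNone` while its neighbors look normal is the signature described above.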
The Redirection class is unsynchronized and implements what is meant to be a lock-free linked list. Although there should not be any torn writes (we're on x86 and next_ is aligned), there might be other issues, such as the compiler reordering stores, which I will investigate.
BTW, in the days before the general flakiness was first observed on the build bots, there were several Zone-related changes which may be relevant: https://codereview.chromium.org/851513002/ https://codereview.chromium.org/832713006/ https://codereview.chromium.org/855533002/
The compiler reorders the stores, so the static head "list" is updated before the "next" pointer of the new element, as seen in the disassembly of the (inlined) constructor:
(gdb) print &list
$19 = (dart::Redirection **) 0x8d40b60 <dart::Redirection::list>
(gdb) set disassembly-flavor intel
(gdb) disas dart::Simulator::RedirectExternalReference
Dump of assembler code for function dart::Simulator::RedirectExternalReference(unsigned int, dart::Simulator::CallKind, int):
...
0x084e8a81 <+81>: mov edx,DWORD PTR ds:0x8d40b60
0x084e8a87 <+87>: mov ds:0x8d40b60,eax
0x084e8a8c <+92>: add eax,0xc
0x084e8a8f <+95>: mov DWORD PTR [eax+0x4],edx
...
If the thread is interrupted between these two stores, another thread traversing the list can see the new head while its next pointer has not been written yet, which could lead to the corruption observed.
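To make the ordering requirement concrete, here is a minimal sketch of one way to publish a new list element safely using C++11 atomics. This is not the change under review and not the VM's code: `SketchRedirection` and `SketchRedirectionList` are hypothetical stand-ins, and the sketch assumes entries live for the whole process and are never removed.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a redirection entry; not the VM's class.
struct SketchRedirection {
  SketchRedirection(uintptr_t fn, SketchRedirection* next_entry)
      : external_function(fn), next(next_entry) {}
  const uintptr_t external_function;
  SketchRedirection* const next;  // Written once, before publication.
};

class SketchRedirectionList {
 public:
  // Returns the entry for fn, inserting one at the head if none exists.
  SketchRedirection* GetOrInsert(uintptr_t fn) {
    SketchRedirection* head = head_.load(std::memory_order_acquire);
    for (;;) {
      for (SketchRedirection* cur = head; cur != nullptr; cur = cur->next) {
        if (cur->external_function == fn) return cur;
      }
      SketchRedirection* node = new SketchRedirection(fn, head);
      // The link is fully initialized before the node becomes reachable.
      assert(node->next == head);
      // The release half of this exchange keeps the stores to the node
      // from being reordered after the store that publishes it.
      if (head_.compare_exchange_strong(head, node,
                                        std::memory_order_acq_rel,
                                        std::memory_order_acquire)) {
        return node;
      }
      // Another thread changed the head; head now holds the new value.
      delete node;  // Never published, so no other thread can see it.
    }
  }

 private:
  std::atomic<SketchRedirection*> head_{nullptr};
};
```

Calling `GetOrInsert` twice with the same address returns the same entry, and a reader that observes a new head is guaranteed to also observe its initialized `next`. Guarding the list with a mutex would be a simpler alternative with the same effect.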
Since Redirection is only used on simulator builds, this raises the possibility that this class of crashes is separate from those seen on non-simulator builds.
The simulator failures should be fixed by https://codereview.chromium.org/898123002/
But that does not explain the non-simulator flaky crashes, so I'll keep this issue open to track those. Here's another recent one:
... Stackmaps for function 'dart:convert_::_getLATIN1' { } Variable Descriptors for function 'dart:convert::_getLATIN1' { saved current CTX reg offset -3 } Exception Handlers for function 'dart:convert::_get_LATIN1' { No exception handlers } Static call target functions { } Code for function 'dart:core_Uri__uriDecode@915557746' {
stderr: e:\b\build\slave\vm-win32-debug-be\build\dart\runtime\vm/class_table.h:132: error: expected: IsValidIndex(index)
Command[vm]: build\DebugIA32\dart.exe --disassemble --ignore-unrecognized-flags --enable_asserts --enable_type_checks --package-root=build/DebugIA32/packages/ e:\b\build\slave\vm-win32-debug-be\build\dart\tests\language\disassemble_test.dart Took 0:00:19.688000
Short reproduction command (experimental): python tools/test.py --write-debug-log --write-test-outcome-log --copy-coredumps --exclude-suite pkg --checked -t120 language/disassemble_test
=== === 1 test failed ===
Set owner to @kodandersson.
Another flaky crash, but this time in dart2js.
@@@BUILD_STEP dart2js-jsshell tests --dart2js-batch failures@@@
FAILED: dart2js-jsshell release_ia32 lib/math/point_test Expected: Pass Actual: Crash Unexpected compile-time error. CommandOutput[dart2js]:
Command[dart2js]: out/ReleaseIA32/dart-sdk/bin/dart2js --allow-mock-compilation --categories=all --package-root=out/ReleaseIA32/packages/ /mnt/data/b/build/slave/dart2js-linux-jsshell-release-2-4-be/build/dart/tests/lib/math/point_test.dart --out=/mnt/data/b/build/slave/dart2js-linux-jsshell-release-2-4-be/build/dart/out/ReleaseIA32/generated_compilations/dart2js-sdk/tests_lib_math_point_test/out.js Took 0:00:04.761000
Command[jsshell]: /mnt/data/b/build/slave/dart2js-linux-jsshell-release-2-4-be/build/dart/tools/testing/bin/jsshell -f out/ReleaseIA32/dart-sdk/lib/_internal/compiler/js_lib/preambles/jsshell.js -f /mnt/data/b/build/slave/dart2js-linux-jsshell-release-2-4-be/build/dart/out/ReleaseIA32/generated_compilations/dart2js-sdk/tests_lib_math_point_test/out.js Did not run
Short reproduction command (experimental): python tools/test.py -mrelease -cdart2js -rjsshell --use-sdk --write-debug-log --write-test-outcome-log --clear_browser_cache --dart2js-batch -t60 lib/math/point_test
=== === 1 test failed ===
With new, more detailed assertion failure:
FAILED: none-vm-checked debug_simarm lib/convert/streamed_conversion_json_utf8_decode_test Expected: Pass Actual: Crash CommandOutput[vm]:
stderr: Verifying before marking... done. Verifying before sweeping...Verifying before marking... done. done. Verifying before sweeping... done. Verifying before marking... done. Verifying before sweeping... done. Verifying before marking... done. Verifying before sweeping... done. Verifying before marking... done. Verifying before sweeping... done. Verifying before marking... done. Verifying before sweeping... done. runtime/vm/raw_object.cc:221: error: Size mismatch: 32 from class vs 24 from tags 3e0300
Command[vm]: DART_CONFIGURATION=DebugSIMARM out/DebugSIMARM/dart --verified_mem --verify_before_gc --verify_after_gc --old_gen_growth_rate=1 --ignore-unrecognized-flags --enable_asserts --enable_type_checks --package-root=out/DebugSIMARM/packages/ /mnt/data/b/build/slave/vm-arm-sim-debug-be/build/dart/tests/lib/convert/streamed_conversion_json_utf8_decode_test.dart Took 0:00:01.457000
Short reproduction command (experimental): python tools/test.py -asimarm --write-debug-log --write-test-outcome-log --copy-coredumps --exclude-suite pkg --checked -t480 lib/convert/streamed_conversion_json_utf8_decode_test
=== === 1 test failed ===
Size assertion failures are probably caused by a debug-only race. Fix/workaround is under review here: https://codereview.chromium.org/936393003/
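For reading the "Size mismatch" message, here is a worked sketch of how a size could be recovered from a tags word such as 3e0300. The bit layout below (size stored in bits 8..15, in units of 8 bytes) is an assumption chosen only because it reproduces the "24 from tags" figure in the log; it is not taken from raw_object.h, and all names are hypothetical.

```cpp
#include <cstdint>

// ASSUMED layout, for illustration only: bits 8..15 hold the object size
// in allocation units, and one unit is 8 bytes on a 32-bit target.
constexpr uint32_t kAssumedSizeTagShift = 8;
constexpr uint32_t kAssumedSizeTagMask = 0xff;
constexpr uint32_t kAssumedAllocationUnit = 8;

// Decodes the size in bytes encoded in a tags word under that assumption.
constexpr uint32_t AssumedSizeFromTags(uint32_t tags) {
  return ((tags >> kAssumedSizeTagShift) & kAssumedSizeTagMask) *
         kAssumedAllocationUnit;
}

// Mirrors the shape of the failing assertion: the class's instance size
// must equal the decoded size, unless the decoded size is 0.
constexpr bool SizeMatches(uint32_t instance_size, uint32_t tags) {
  return instance_size == AssumedSizeFromTags(tags) ||
         AssumedSizeFromTags(tags) == 0;
}
```

Under that assumed layout, 0x3e0300 decodes to 3 units, i.e. 24 bytes, which disagrees with the 32 bytes reported for the class.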
Closing this as too broad.
Multiple independent crash issues have been fixed. Occasional crashes still occur, including in release mode and, in particular, in the VM that drives the test harness.
If/when we have any details/pattern, and ideally a core dump/repro, we should file a new issue.
Added TooBroad label.
Can't repro locally but might be worth investigating (GC problem?)
FAILED: none-vm-checked debug_x64 standalone/io/http_proxy_test Expected: Pass Actual: Crash CommandOutput[vm]:
stderr: /Volumes/data/b/build/slave/vm-mac-debug-x64-be/build/dart/runtime/vm/raw_object.cc:210: error: expected: (instance_size == SizeTag::decode(tags)) || (SizeTag::decode(tags) == 0)
Command[vm]: xcodebuild/DebugX64/dart --ignore-unrecognized-flags --enable_asserts --enable_type_checks --package-root=xcodebuild/DebugX64/packages/ /Volumes/data/b/build/slave/vm-mac-debug-x64-be/build/dart/tests/standalone/io/http_proxy_test.dart Took 0:00:07.193000
Short reproduction command (experimental): python tools/test.py -ax64 --write-debug-log --write-test-outcome-log --copy-coredumps --exclude-suite pkg --checked -t120 standalone/io/http_proxy_test