dart-lang / sdk

The Dart SDK, including the VM, JS and Wasm compilers, analysis, core libraries, and more.
https://dart.dev
BSD 3-Clause "New" or "Revised" License
10.22k stars 1.57k forks source link

dartfuzz crash: CSE bypasses bounds check #53574

Open aam opened 1 year ago

aam commented 1 year ago

log

swarming_bot_logs: 2023-09-20 04:50:47.269: run_command(['/b/s/w/ir/out/ReleaseX64/dart', 'runtime/tools/dartfuzz/dartfuzz_test.dart', '--isolates', '8', '--no-show-stats', '--time', '2700', '--shards=100', '--shard=26', '--output-directory=/b/s/w/io4mbv5g1q'], /b/s/w/ir, 3600.0, 30.0, False, Containment<NONE, 0, 0>)
...

Isolate (/b/s/w/it4w3bxri4/dart_fuzzIFDCWT) NO-FP NO-FFI FLAT : JIT-O3-ReleaseSIMARM64C - AOT-DebugX64: !DIVERGENCE! 1.100:2709314189 (0 vs -6)

fail2:
-6
foo2() throws
var80.foo0_Extension0() throws
(var8).foo1_Extension0() throws
var128.foo2_Extension1() throws
var62.foo3_Extension0() throws

===== CRASH =====
si_signo=Segmentation fault(11), si_code=128, si_addr=(nil)
version=3.2.0-edge.88b07ba64e07fb0eef9c777769917c65fdf1832e (be) (Tue Sep 19 21:57:39 2023 +0000) on "linux_x64"
pid=8809, thread=8817, isolate_group=main(0x557569133060), isolate=main(0x5575691414c0)
os=linux, arch=x64, comp=no, sim=no
isolate_instructions=7f6aa4ef9780, vm_instructions=7f6aa4ef2000
fp=7f6a9fcfe0c8, sp=7f6a9fcfe018, pc=7f6aa4fdc561
  pc 0x00007f6aa4fdc561 fp 0x00007f6a9fcfe0c8 _kDartIsolateSnapshotInstructions+0xe2de1
  pc 0x00007f6aa4fd1289 fp 0x00007f6a9fcfe170 _kDartIsolateSnapshotInstructions+0xd7b09
  pc 0x00007f6aa4fdeb6f fp 0x00007f6a9fcfe180 _kDartIsolateSnapshotInstructions+0xe53ef
  pc 0x00007f6aa4fe28e6 fp 0x00007f6a9fcfe1f8 _kDartIsolateSnapshotInstructions+0xe9166
  pc 0x00007f6aa4f2e033 fp 0x00007f6a9fcfe230 _kDartIsolateSnapshotInstructions+0x348b3
  pc 0x00007f6aa4fe2c04 fp 0x00007f6a9fcfe2b0 _kDartIsolateSnapshotInstructions+0xe9484
  pc 0x00007f6aa4f62565 fp 0x00007f6a9fcfe2d8 _kDartIsolateSnapshotInstructions+0x68de5
  pc 0x00007f6aa4ef6a67 fp 0x00007f6a9fcfe350 _kDartVmSnapshotInstructions+0x4a67
  pc 0x000055756784748e fp 0x00007f6a9fcfe3c0 dart::DartEntry::InvokeFunction+0xce
  pc 0x000055756784a055 fp 0x00007f6a9fcfe420 dart::DartLibraryCalls::HandleMessage+0xf5
  pc 0x000055756786f0ad fp 0x00007f6a9fcfebb0 dart::IsolateMessageHandler::HandleMessage+0x37d
  pc 0x00005575678814e5 fp 0x00007f6a9fcfec40 dart::MessageHandler::HandleMessages+0x1d5
  pc 0x0000557567882187 fp 0x00007f6a9fcfecc0 dart::MessageHandler::TaskCallback+0x307
  pc 0x0000557567a20170 fp 0x00007f6a9fcfed50 dart::ThreadPool::WorkerLoop+0x180
  pc 0x0000557567a20a9b fp 0x00007f6a9fcfeda0 dart::ThreadPool::Worker::Main+0x12b
  pc 0x0000557567988c13 fp 0x00007f6a9fcfeef0 dart::ThreadStart+0x103
-- End of DumpStackTrace
rmacnak-google commented 1 year ago

Reduced to

mport "dart:collection";
import "dart:typed_data";

RuneIterator var76 = RuneIterator.at("va", 28);

@pragma("vm:never-inline")
Uint16List hide() => Uint16List(1);

void foo3_Extension1(List<bool> par2) {
  try {
    for (int loc0 in hide()) {
      par2[-9223372032559808514]!;
    }
  } catch (e, st) {}
}

void foo4_Extension1() {
  if (var76.movePrevious()) {
    for (int loc0 in Uint16List(15)) {
    }
  }
}

main() {
  foo3_Extension1(<bool>[false, false, false, false, false]);
  try {
    foo4_Extension1();
  } catch (e, st) {}
}

Which for some reason incorrectly eliminates the bounds check for par2.

        ;; v33 <- LoadField:42(v16 T{_GrowableList} . GrowableObjectArray.data) T{_List}
0x7feb17819539    4c8b4f17               movq r9,[rdi+0x17]
        ;; ParallelMove r10 <- C
0x7feb1781953d    4d8b971f4c0000         movq r10,[pp+0x4c1f]   -9223372032559808514
        ;; v34 <- LoadIndexed:42(v33, v144 T{_Mint}) T{bool}
0x7feb17819544    4f8b649117             movq r12,[r9+r10*4+0x17]     ***SEGV
alexmarkov commented 1 year ago

@rmacnak-google created a CL to fix this (https://dart-review.googlesource.com/c/sdk/+/327282), but I'd like to discuss the problem and possible solutions in more details.

This bug shows the weakness of representing control dependencies as data dependencies.

In the reduced repro, CSE replaces BoxInt64 with another BoxInt64, bypassing intermediate GenericCheckBound. As a result, this breaks dependency between LoadIndexed and GenericCheckBound and enables subsequent LICM of LoadIndexed instruction above GenericCheckBound:

    v129 <- BoxInt64(v60)
    CheckNull:30(v129 T{int}, ArgumentError) T{int}
    v118 <- GenericCheckBound:30(v130 T{_Smi}, v60) T{int}
    v131 <- BoxInt64(v118)
    v119 <- LoadIndexed(v17, v131 T{int}) T{_Smi}

CSE =>

    v125 <- BoxInt64(v122) T{int}
    CheckNull:38(v125 T{_Mint}, ArgumentError) T{_Mint}
    v32 <- GenericCheckBound:38(v132 T{_Smi}, v122) T{_Mint}
    v34 <- LoadIndexed(v33, v125 T{_Mint}) T{bool}

Currently CSE checks inputs in the following way when comparing two candidate instructions for CSE:

The problem is that CSE could occur for any redefinition-like instruction such as box/unbox which participates in the dependency chain between null/bound/type check and a dependent load (LoadField/LoadIndexed).

One way to fix this would be to drop OriginalDefinition() entirely and only do CSE if inputs are the same for all instructions. This would let us keep control-as-data dependencies, but could be too conservative and regress performance. Can we measure this option on Golem to see how much it affects performance?

We can also have a list of instructions which may participate in the control-as-data dependency chains, and drop OriginalDefinition() from CSE only for those instructions. They would probably include at least LoadField, LoadIndexed, all boxing and unboxing instructions, and maybe redefinitions. This would more more efficient but more fragile and I'm not sure it would cover all possible current and future cases where CSE would break control-as-data dependencies.

Another option is to introduce separate explicit control dependencies between instructions and take them into account in LICM. This way we can freely do CSE while still keeping dependencies between checks (such as GenericCheckBound) and dependent loads (such as LoadIndexed). In addition, we might be able to replace certain Redefinition instructions with more explicit control dependencies. The downside is that this solution is much more involved and would make our IL more complex and probably harder to work with.

Any other ideas?

@mraleph @mkustermann WDYT?

rmacnak-google commented 1 year ago

CSE comparing inputs without OriginalDefinition:

https://dart-review.googlesource.com/c/sdk/+/327780 <golem>/Revision?repository=dart&revision=106249&patch=18546