android / ndk

The Android Native Development Kit
1.94k stars 254 forks source link

[Bug]: Android NDK r27 rc2 miscompiles code with inderect gotos #2040

Open SanjaLV opened 1 month ago

SanjaLV commented 1 month ago

Description

Android NDK r27 rc 2 produces invalid code for any target architecture when compiling with any nonzero optimization level.

Bellow is attached minimized/striped sample that shows the problem (when targeting x86_64 with -O1)

extern int printf(const char *fmt, ...);

int main() {
  void* bytecode[2];
  bytecode[0] = &&VM__OP_1;
  bytecode[1] = &&VM__TERMINATE;

  int state = 0;
  int index = 0;

  while (1) {
    switch (state) {
    case 0:
      goto *bytecode[index];
    case 1:
      // NOTE: THIS IS ONLY REACHABLE VIA INDIRECT GOTOS
      VM__OP_1:
      state = 2;
      break;
    case 2:
      printf("OP_1:(instruction=%d)\n", index);
      index++;
      goto *bytecode[index];
    }
  }

VM__TERMINATE:
  printf("TERMINATE:(instruction=%d)\n", index);
  return 0;
}

Link to github project: https://github.com/SanjaLV/ndk-bug-reports/tree/main/r27_rc2

Prerequisites:

  1. Linux/macOS machine
  2. ANDROID_HOME env variable that will point to Android SDK root.
  3. ndk;26.3.11579264 / ndk;27.0.11902837 installed with SDK manager.

How to reproduce (invalid code):

  1. Run make local and observe correct behavior with system compiler
  2. Run make r26 and observe correct behavior when compiling with NDK r26d
  3. Run make r27 and observe incorrect program behavior.
  4. Run optnone and observe correct behavior with O0 optimization level.

Correct execution should yield the following output:

OP_1:(instruction=0)
TERMINATE:(instruction=1)

Incorrect NDK r27 execution results in the following output:

TERMINATE:(instruction=0)

Context:

Originally discovered that upgrading NDK from r26d to r27 r1/rc2 broke state-machine like bytecode interpreter. After some investigation, we found out that bug appears if and only if we enable INDIRECT GOTO optimizations.

Feel free to ask for more information.

Many thanks, Aleksandrs

Upstream bug

No response

Commit to cherry-pick

No response

Affected versions

r27

Canary version

No response

Host OS

Linux

Host OS version

Ubuntu 22.04

Affected ABIs

armeabi-v7a, arm64-v8a, x86, x86_64

SanjaLV commented 1 month ago

miscompilation is obvious if you inspect generated llvm bitcode:

27.0.11902837/toolchains/llvm/prebuilt/linux-x86_64/bin/clang -O1 test.c -S -emit-llvm & cat test.ll

...

; Function Attrs: nofree nounwind uwtable
define dso_local i32 @main() local_unnamed_addr #0 {
  %1 = tail call i32 (ptr, ...) @printf(ptr noundef nonnull dereferenceable(1) @.str.1, i32 noundef 0)
  ret i32 0
}

...
appujee commented 1 month ago

Can't repro on clang upstream https://godbolt.org/z/d9bGnhhv6 so the fix is likely to be in upstream.

appujee commented 2 weeks ago

The bug is in SimplifyCFGPass

Before

define dso_local noundef i32 @main() #0 {
  %1 = alloca [2 x ptr], align 16
  store ptr blockaddress(@main, %7), ptr %1, align 16, !tbaa !5
  %2 = getelementptr inbounds [2 x ptr], ptr %1, i64 0, i64 1
  store ptr blockaddress(@main, %15), ptr %2, align 8, !tbaa !5
  br label %3

3:                                                ; preds = %0, %12
  %4 = phi i32 [ 0, %0 ], [ %13, %12 ]
  %5 = phi i32 [ 0, %0 ], [ %14, %12 ]
  switch i32 %4, label %12 [
    i32 0, label %6
    i32 1, label %7
    i32 2, label %9
  ]

6:                                                ; preds = %3
  br label %17

7:                                                ; preds = %3, %17
  %8 = phi i32 [ %18, %17 ], [ %5, %3 ]
  %8 = phi i32 [ %18, %17 ], [ %5, %3 ]
  br label %12

9:                                                ; preds = %3
  %10 = call i32 (ptr, ...) @printf(ptr noundef nonnull dereferenceable(1) @.str, i32 noundef %5)
  %11 = add nsw i32 %5, 1
  br label %17

12:                                               ; preds = %3, %7
  %13 = phi i32 [ %4, %3 ], [ 2, %7 ]
  %14 = phi i32 [ %5, %3 ], [ %8, %7 ]
  br label %3, !llvm.loop !9

15:                                               ; preds = %17
  %16 = call i32 (ptr, ...) @printf(ptr noundef nonnull dereferenceable(1) @.str.1, i32 noundef %
  ret i32 0

17:                                               ; preds = %9, %6
  %18 = phi i32 [ %11, %9 ], [ %5, %6 ]
  %19 = sext i32 %18 to i64
  %20 = getelementptr inbounds [2 x ptr], ptr %1, i64 0, i64 %19
  %21 = load ptr, ptr %20, align 8, !tbaa !5
  indirectbr ptr %21, [label %7, label %15]
}

After

; *** IR Dump After SimplifyCFGPass on main ***
; Function Attrs: mustprogress norecurse uwtable
define dso_local noundef i32 @main() #0 {
  %1 = alloca [2 x ptr], align 16
  store ptr blockaddress(@main, %6), ptr %1, align 16, !tbaa !5
  %2 = getelementptr inbounds [2 x ptr], ptr %1, i64 0, i64 1
  store ptr blockaddress(@main, %12), ptr %2, align 8, !tbaa !5
  br label %3

3:                                                ; preds = %0, %10
  %4 = phi i32 [ 0, %0 ], [ %11, %10 ]
  %5 = phi i32 [ 0, %0 ], [ %5, %10 ]
  switch i32 %4, label %10 [
    i32 0, label %12
    i32 1, label %6
    i32 2, label %7
  ]

6:                                                ; preds = %3
  br label %10

7:                                                ; preds = %3
  %8 = call i32 (ptr, ...) @printf(ptr noundef nonnull dereferenceable(1) @.str, i32 noundef %5)
  %9 = add nsw i32 %5, 1
  br label %12

10:                                               ; preds = %3, %6
  %11 = phi i32 [ %4, %3 ], [ 2, %6 ]
  br label %3, !llvm.loop !9
12:                                               ; preds = %7, %3
  %13 = phi i32 [ %9, %7 ], [ %5, %3 ]
  %14 = call i32 (ptr, ...) @printf(ptr noundef nonnull dereferenceable(1) @.str.1, i32 noundef %
  ret i32 0
}
appujee commented 1 week ago

https://r.android.com/3211575

DanAlbert commented 1 week ago

Do we need a separate fix for r28? That's currently on clang-r530567, and I think the plan was to keep on that one.

pirama-arumuga-nainar commented 1 week ago

The fix is already included in r28 (revision number for the fix is r523414).

DanAlbert commented 1 week ago

Great, thanks for confirming.

appujee commented 5 days ago

by the way, the patch i have uploaded 'hides' the issue. the real issue is still in llvm trunk. It could take a while to land a fix but I'm preparing a patch for review.

DanAlbert commented 5 days ago

I see. Hidden well enough that it's not something a user would encounter any more, or just way less common?

appujee commented 5 days ago

So the change in https://r.android.com/3211575 'updates the CFG' in a way that the later (buggy) pass can't see the problematic pattern anymore. Most likely the issue can't be reproduced in the new compiler but the current situation isn't ideal.

DanAlbert commented 4 days ago

Right. So, fixed as far as any NDK developer would be able to tell, but there's room for improvement upstream. I'll mark it fixed here once we merge the fix into the NDK then.

appujee commented 4 days ago

I believe https://github.com/llvm/llvm-project/commit/fc6bdb8549842613da51b9d570b29e27cc709f69 is the patch causing the bug. I'm preparing a fix but reverting this should also work.

appujee commented 3 days ago

https://r.android.com/3218272 reverts the above mentioned patch.