llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.66k stars 11.85k forks source link

Difference in alloca naming/ordering between legacy and new pass manager #50003

Open DavidSpickett opened 3 years ago

DavidSpickett commented 3 years ago
Bugzilla Link 50659
Version trunk
OS Linux
Attachments test file
CC @aeubanks,@dwblaikie,@zygoloid

Extended Description

This is following up on https://reviews.llvm.org/rGd69c4372bfbe.

I had a local build that had cmake ENABLE_EXPERIMENTAL_NEW_PASS_MANAGER=OFF and
arm-bf16-convert-intrinsics.c failed for me once the test output was updated.

Disabling the legacy pass manager with -fno-legacy-pass-manager was suggested and that fixed the issue.

Looking more closely, only one of the run lines in that file was failing: // RUN: %clang_cc1 \ // RUN: -triple armv8.6a-arm-none-eabi -target-feature +neon \ // RUN: -target-feature +bf16 -mfloat-abi hard \ // RUN: -disable-O0-optnone -emit-llvm -fno-legacy-pass-manager -o - %s \ // RUN: | opt -S -mem2reg \ // RUN: | FileCheck --check-prefixes=CHECK,CHECK-A32-HARDFP %s

By taking just that run and reducing the input down to just:

include

float32x4_t test_vcvtq_low_f32_bf16(bfloat16x8_t a) { return vcvtq_low_f32_bf16(a); }

I saw that the order of the intial alloca, and the names assigned to them was different. (the name change probably effects the output order somehow)

E.g. Legacy PM: %__p0_150.i.i = alloca <4 x bfloat>, align 8 New PM: %__p0_150.i = alloca <4 x bfloat>, align 8

With the two on different lines.

You can reproduce with the attached C file and the following command: $ ./bin/clang -cc1 -internal-isystem ./lib/clang/13.0.0/include -nostdsysteminc -triple armv8.6a-arm-none-eabi -target-feature +neon -target-feature +bf16 -mfloat-abi softfp -disable-O0-optnone -emit-llvm -o - -fno-legacy-pass-manager /tmp/test_pm.c | ./bin/opt -S -mem2reg

Is this expected or indicative or something that should be the same across both?

aeubanks commented 3 years ago

That's a bit of a fuzzy area. I think ideally we'd keep all tests passing with the legacy PM, but it's a bit tricky since we don't have buildbots.

But in any case, all tests should be fine with values being renamed. As mentioned, update_test_checks.py already takes care of this. If a test doesn't use that, it should use regex (like update_test_checks.py does).

DavidSpickett commented 3 years ago

Ok so we're fine with some tests failing with the legacy PM then?

Here's what I get:


Failed Tests (5): Clang :: CodeGenCUDA/host-used-device-var.cu Clang :: CodeGenCUDA/unused-global-var.cu Clang :: OpenMP/remarks_parallel_in_multiple_target_state_machines.c Clang :: OpenMP/remarks_parallel_in_target_state_machine.c LLVM :: Transforms/SimpleLoopUnswitch/implicit-null-checks.ll

(bf16 test had the flag added)

Hardly a big issue, no bots use the legacy PM and I had a stale cmake cache that's why I spotted it.

aeubanks commented 3 years ago

Instruction names are never meant to be stable, just helpful for debugging. Passes can change the names of the instructions they generate however they please.

Things like update_test_checks.py will take this into account.

DavidSpickett commented 3 years ago

...and if it is a bug/inefficiency in the legacy PM are we ok to stick with disabling it for these tests?