Open DavidSpickett opened 3 years ago
That's a bit of a fuzzy area. I think ideally we'd keep all tests passing with the legacy PM, but it's a bit tricky since we don't have buildbots.
But in any case, all tests should be fine with values being renamed. As mentioned, update_test_checks.py already takes care of this. If a test doesn't use that, it should use regex (like update_test_checks.py does).
Ok so we're fine with some tests failing with the legacy PM then?
Here's what I get:
Failed Tests (5): Clang :: CodeGenCUDA/host-used-device-var.cu Clang :: CodeGenCUDA/unused-global-var.cu Clang :: OpenMP/remarks_parallel_in_multiple_target_state_machines.c Clang :: OpenMP/remarks_parallel_in_target_state_machine.c LLVM :: Transforms/SimpleLoopUnswitch/implicit-null-checks.ll
(bf16 test had the flag added)
Hardly a big issue, no bots use the legacy PM and I had a stale cmake cache that's why I spotted it.
Instruction names are never meant to be stable, just helpful for debugging. Passes can change the names of the instructions they generate however they please.
Things like update_test_checks.py will take this into account.
...and if it is a bug/inefficiency in the legacy PM are we ok to stick with disabling it for these tests?
Extended Description
This is following up on https://reviews.llvm.org/rGd69c4372bfbe.
I had a local build that had cmake ENABLE_EXPERIMENTAL_NEW_PASS_MANAGER=OFF and
arm-bf16-convert-intrinsics.c failed for me once the test output was updated.
Disabling the legacy pass manager with -fno-legacy-pass-manager was suggested and that fixed the issue.
Looking more closely, only one of the run lines in that file was failing: // RUN: %clang_cc1 \ // RUN: -triple armv8.6a-arm-none-eabi -target-feature +neon \ // RUN: -target-feature +bf16 -mfloat-abi hard \ // RUN: -disable-O0-optnone -emit-llvm -fno-legacy-pass-manager -o - %s \ // RUN: | opt -S -mem2reg \ // RUN: | FileCheck --check-prefixes=CHECK,CHECK-A32-HARDFP %s
By taking just that run and reducing the input down to just:
include
float32x4_t test_vcvtq_low_f32_bf16(bfloat16x8_t a) { return vcvtq_low_f32_bf16(a); }
I saw that the order of the intial alloca, and the names assigned to them was different. (the name change probably effects the output order somehow)
E.g. Legacy PM: %__p0_150.i.i = alloca <4 x bfloat>, align 8 New PM: %__p0_150.i = alloca <4 x bfloat>, align 8
With the two on different lines.
You can reproduce with the attached C file and the following command: $ ./bin/clang -cc1 -internal-isystem ./lib/clang/13.0.0/include -nostdsysteminc -triple armv8.6a-arm-none-eabi -target-feature +neon -target-feature +bf16 -mfloat-abi softfp -disable-O0-optnone -emit-llvm -o - -fno-legacy-pass-manager /tmp/test_pm.c | ./bin/opt -S -mem2reg
Is this expected or indicative or something that should be the same across both?