odin-lang / Odin

Odin Programming Language
https://odin-lang.org
BSD 3-Clause "New" or "Revised" License

Codegen error with AVX-512BW #4377

Open Barinzaya opened 5 days ago

Barinzaya commented 5 days ago

Context

    Odin:    dev-2024-10-nightly
    OS:      Arch Linux, Linux 6.11.2-zen1-1-zen
    CPU:     AMD Ryzen 9 9950X 16-Core Processor
    RAM:     61886 MiB
    Backend: LLVM 18.1.6

Expected Behavior

The code in the snippet below should compile without issues, and should execute without issues if AVX-512BW is available on the machine.

Current Behavior

When building the code in the snippet below (and other similarly constructed code involving masks), LLVM reports an error (see below) and the Odin compiler aborts. This only happens when relevant parts of the AVX-512 instruction set are enabled (in this case avx512bw), either via an attribute or via the command line. When enabling other SIMD instruction sets (e.g. avx2), the code builds without issue.

In the sample code below, this also occurs when swapping main for a test procedure with the same body and attempting to run tests (odin test).

Failure Information (for bugs)

Example error:

LLVM ERROR: Cannot select: 0x740fd819bd00: v16i1 = setcc 0x740fd819b830, 0x740fd819c320, setgt:ch
  0x740fd819b830: v16i16 = sub 0x740fd819b6e0, 0x740fd819b7c0
    0x740fd819b6e0: v16i16,ch = load<(load (s256) from %ir.0 + 32, basealign 64)> 0x740fd819b590, 0x740fd819bfa0, undef:i64
      0x740fd819bfa0: i64 = add 0x740fd819b8a0, Constant:i64<32>
        0x740fd819b8a0: i64,ch = CopyFromReg 0x740fd8ae4360, Register:i64 %1
          0x740fd819b9f0: i64 = Register %1
        0x740fd819c080: i64 = Constant<32>
      0x740fd819c0f0: i64 = undef
    0x740fd819b7c0: v16i16,ch = load<(load (s256) from constant-pool)> 0x740fd8ae4360, 0x740fd819b520, undef:i64
      0x740fd819b520: i64 = X86ISD::Wrapper TargetConstantPool:i64<<16 x i16> <i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384>> 0
        0x740fd819bf30: i64 = TargetConstantPool<<16 x i16> <i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384>> 0
      0x740fd819c0f0: i64 = undef
  0x740fd819c320: v16i16 = bitcast 0x740fd819c160
    0x740fd819c160: v8i32 = BUILD_VECTOR Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>
      0x740fd819c240: i32 = Constant<-1>
      0x740fd819c240: i32 = Constant<-1>
      0x740fd819c240: i32 = Constant<-1>
      0x740fd819c240: i32 = Constant<-1>
      0x740fd819c240: i32 = Constant<-1>
      0x740fd819c240: i32 = Constant<-1>
      0x740fd819c240: i32 = Constant<-1>
      0x740fd819c240: i32 = Constant<-1>
In function: mre.foo
fish: Job 1, '~/Downloads/odin-linux-amd64-de…' terminated by signal SIGABRT (Abort)

Pointer values change with each build.

Steps to Reproduce

  1. Create an Odin source file mre.odin with the following code:
package mre

import "core:simd"

@(enable_target_feature = "avx512bw")
foo :: proc(src: #simd[32]u16, dst: ^[32]u16) {
    simd.masked_store(dst, src, simd.lanes_lt(src - auto_cast 16384, auto_cast 32768))
}

main :: proc() {
    a : [32]u16
    foo({}, &a)
}
  2. Attempt to build this file (odin build mre.odin -file).

This error also occurs if the enable_target_feature attribute is removed and the target feature is enabled via the command line (-target-features:avx512bw). This error seems to be highly dependent on compiler flags; it does not occur if -o:size, -o:speed, or -o:aggressive are given, and also only seems to occur with some microarchitectures (e.g. the default, x86-64-v2, and x86-64-v3 fail; x86-64 and x86-64-v4 work).

laytan commented 5 days ago

Looks like you also need avx512vl enabled to make codegen happy.
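
For reference, a minimal sketch of that suggestion applied to the reproduction from the issue. The only change is the attribute string; it assumes enable_target_feature accepts a comma-separated feature list, and it has not been verified against this compiler build.

```odin
package mre

import "core:simd"

// Assumption: avx512vl is enabled alongside avx512bw, as suggested above.
// Everything else is unchanged from the original reproduction.
@(enable_target_feature = "avx512bw,avx512vl")
foo :: proc(src: #simd[32]u16, dst: ^[32]u16) {
    simd.masked_store(dst, src, simd.lanes_lt(src - auto_cast 16384, auto_cast 32768))
}

main :: proc() {
    a : [32]u16
    foo({}, &a)
}
```

The same thing can presumably be done from the command line by passing both features to -target-features instead of using the attribute.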

laytan commented 5 days ago

Could be this, which they say may be fixed in LLVM 19: https://github.com/llvm/llvm-project/issues/111380

Barinzaya commented 5 days ago

> Looks like you also need avx512vl enabled to make codegen happy.

There are a lot of different ways to make the codegen happy, too. Setting optimization flags sometimes does it, changing the microarch sometimes does it (even if it's one that doesn't support AVX-512)... probably others too. This is extremely sensitive to compiler flags.

laytan commented 5 days ago

Optimization modes affecting it makes sense: the optimizer probably just removes the entire function, because the program doesn't have any side effects. It is very hard to get LLVM (with optimizations) to behave in a bug reproduction because it can just remove things.

And even if you get it to not remove your function it could be optimizing it to very different instructions.
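
One way to make it harder for the optimizer to discard foo in a reproduction is to feed it input that is only known at run time and to print the result. The following is a rough sketch under that assumption. The use of os.args as a run-time value and the fill pattern are illustrative choices, not part of the original report, and whether this still triggers the error at -o:speed is untested.

```odin
package mre

import "core:fmt"
import "core:os"
import "core:simd"

@(enable_target_feature = "avx512bw")
foo :: proc(src: #simd[32]u16, dst: ^[32]u16) {
    simd.masked_store(dst, src, simd.lanes_lt(src - auto_cast 16384, auto_cast 32768))
}

main :: proc() {
    // Hypothetical input: values depend on the argument count, so they
    // are not compile-time constants.
    arr: [32]u16
    for &v, i in arr {
        v = u16(i * 1000 * len(os.args))
    }

    out: [32]u16
    foo(simd.from_array(arr), &out)

    // Printing the output gives the program an observable side effect.
    fmt.println(out)
}
```

Even with this in place, the optimizer may still inline foo or lower it to different instructions, so the failing setcc node from the error output is not guaranteed to appear.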

The microarch affecting it is a little weird, especially if it doesn't enable the avx512 features.