rust-lang / packed_simd

Portable Packed SIMD Vectors for Rust standard library
https://rust-lang.github.io/packed_simd/packed_simd_2/
Apache License 2.0
590 stars 74 forks source link

Enable MSA on MIPS64 CI #18

Closed gnzlbg closed 1 year ago

gnzlbg commented 6 years ago

Enabling MSA for the mips64 targets produces the following error:

export RUSTFLAGS= -C target-feature=+msa -C target-cpu=mips64r5
cargo test --target=mips64-unknown-linux-gnuabi64 --release
   Compiling cfg-if v0.1.4
   Compiling nodrop v0.1.12
   Compiling packed_simd v0.1.0 (file:///checkout)
   Compiling arrayvec v0.4.7
LLVM ERROR: Cannot select: 0x7fdaf4d96b60: v2i64 = setcc 0x7fdaf47759c0, 0x7fdaf4d960d0, setlt:ch
  0x7fdaf47759c0: v2f64 = vselect 0x7fdaf4775138, 0x7fdaf479dd00, 0x7fdaf479dc98
    0x7fdaf4775138: v2i64 = setcc 0x7fdaf479dd00, 0x7fdaf479dc98, setlt:ch
      0x7fdaf479dd00: v2f64,ch = load<(load 16 from %stack.29, align 32)> 0x7fdaf46ce6e8, FrameIndex:i64<29>, undef:i64
        0x7fdaf4775000: i64 = FrameIndex<29>
        0x7fdaf479d680: i64 = undef
      0x7fdaf479dc98: v2f64,ch = load<(load 16 from %stack.29 + 16)> 0x7fdaf46ce6e8, 0x7fdaf4775618, undef:i64
        0x7fdaf4775618: i64 = or FrameIndex:i64<29>, Constant:i64<16>
          0x7fdaf4775000: i64 = FrameIndex<29>
          0x7fdaf4d96e38: i64 = Constant<16>
        0x7fdaf479d680: i64 = undef
    0x7fdaf479dd00: v2f64,ch = load<(load 16 from %stack.29, align 32)> 0x7fdaf46ce6e8, FrameIndex:i64<29>, undef:i64
      0x7fdaf4775000: i64 = FrameIndex<29>
      0x7fdaf479d680: i64 = undef
    0x7fdaf479dc98: v2f64,ch = load<(load 16 from %stack.29 + 16)> 0x7fdaf46ce6e8, 0x7fdaf4775618, undef:i64
      0x7fdaf4775618: i64 = or FrameIndex:i64<29>, Constant:i64<16>
        0x7fdaf4775000: i64 = FrameIndex<29>
        0x7fdaf4d96e38: i64 = Constant<16>
      0x7fdaf479d680: i64 = undef
  0x7fdaf4d960d0: v2f64,ch = load<(load 16 from %stack.30)> 0x7fdaf4d96138, FrameIndex:i64<30>, undef:i64
    0x7fdaf46ceaf8: i64 = FrameIndex<30>
    0x7fdaf479d680: i64 = undef
In function: _ZN4core3ops8function6FnOnce9call_once17h22e71f119a5a9c96E
error: Could not compile `packed_simd`.

cc @jcowgill

jcowgill commented 6 years ago

Clearly an LLVM bug of some kind. Unfortunately I don't work on MIPS anymore and I have no time to investigate it. I'll try and contact someone who may be able to help...

draganmladjenovic commented 6 years ago

Hi @gnzlbg . One question. Which version of rust were you using?

gnzlbg commented 6 years ago

@draganmladjenovic rust nightly

draganmladjenovic commented 6 years ago

This boils down to reduce.fmin/fmax being broken on Mips since https://reviews.llvm.org/D46854 https://godbolt.org/g/fBkcai. Even before that reduce.fmin gets expanded to sequence that uses fcmp fast olt irrespective of fast flag being present during the intrinsic call or not (Might be a bug by itself). The second problem is that since D46854 fcmp fast olt gets treated as fcmp lt which Mips backend can't select on floating point vector types. I don't know right now what would be a proper solution to this problem. If nothing I will submit the llvm bug report shortly. @gnzlbg In the meantime, is it ok to work around this in this crate by expanding the reduction manually instead of using intrinsic?

gnzlbg commented 6 years ago

@draganmladjenovic

In the meantime, is it ok to work around this in this crate by expanding the reduction manually instead of using intrinsic?

Yes, that is ok (we already do this for other targets so it should only be a matter of adding some more cfg() macros here and there), I'll try to send a PR that fixes the CI builds.

draganmladjenovic commented 6 years ago

@gnzlbg

I'll try to send a PR that fixes the CI builds.

Ok, thank you for that . I will submit the llvm bug report shortly so you can reference it.

gnzlbg commented 6 years ago

I will submit the llvm bug report shortly so you can reference it.

Thanks, just post that here, and I'll recycle/rename this issue to keep on tracking that LLVM bug.

draganmladjenovic commented 6 years ago

Bug submitted at https://bugs.llvm.org/show_bug.cgi?id=38287.

gnzlbg commented 6 years ago

Thanks! So I think I've made some progress in that things now compile, but they fail to link when using mips64r6: https://travis-ci.org/rust-lang-nursery/packed_simd/jobs/407564460#L1318

error: linking with `mips64-linux-gnuabi64-gcc` failed: exit code: 1
  |
  = note: "mips64-linux-gnuabi64-gcc" "-Wl,--as-needed" "-Wl,-z,noexecstack" "-L" "/rust/lib/rustlib/mips64-unknown-linux-gnuabi64/lib" "/checkout/target/mips64-unknown-linux-gnuabi64/release/deps/packed_simd-bc52a852fd902da6.packed_simd0.rcgu.o" "-o" "/checkout/target/mips64-unknown-linux-gnuabi64/release/deps/packed_simd-bc52a852fd902da6" "/checkout/target/mips64-unknown-linux-gnuabi64/release/deps/packed_simd-bc52a852fd902da6.crate.allocator.rcgu.o" "-Wl,--gc-sections" "-pie" "-Wl,-z,relro,-z,now" "-Wl,-O1" "-nodefaultlibs" "-L" "/checkout/target/mips64-unknown-linux-gnuabi64/release/deps" "-L" "/checkout/target/release/deps" "-L" "/rust/lib/rustlib/mips64-unknown-linux-gnuabi64/lib" "-Wl,-Bstatic" "/rust/lib/rustlib/mips64-unknown-linux-gnuabi64/lib/libtest-cd2e845f4c70927d.rlib" "/rust/lib/rustlib/mips64-unknown-linux-gnuabi64/lib/libterm-9eb611101cf4cecf.rlib" "/rust/lib/rustlib/mips64-unknown-linux-gnuabi64/lib/libgetopts-4874943d5732d1b7.rlib" "-Wl,--start-group" "/rust/lib/rustlib/mips64-unknown-linux-gnuabi64/lib/libstd-3f6ab34df8ffd83e.rlib" "/rust/lib/rustlib/mips64-unknown-linux-gnuabi64/lib/libpanic_unwind-a5a7585073f51dde.rlib" "/rust/lib/rustlib/mips64-unknown-linux-gnuabi64/lib/libunwind-003f74338435e0fd.rlib" "/rust/lib/rustlib/mips64-unknown-linux-gnuabi64/lib/liballoc_system-2755128d19d935f8.rlib" "/rust/lib/rustlib/mips64-unknown-linux-gnuabi64/lib/liblibc-b75d74899d2adc7f.rlib" "/rust/lib/rustlib/mips64-unknown-linux-gnuabi64/lib/liballoc-8b879cd22eb4890a.rlib" "/checkout/target/mips64-unknown-linux-gnuabi64/release/deps/libarrayvec-eb8868cfdf72a318.rlib" "/checkout/target/mips64-unknown-linux-gnuabi64/release/deps/libnodrop-f44f02a725192a9d.rlib" "/checkout/target/mips64-unknown-linux-gnuabi64/release/deps/libcfg_if-f697f12200a30b1b.rlib" "/rust/lib/rustlib/mips64-unknown-linux-gnuabi64/lib/libcore-f30fde47d896be92.rlib" "-Wl,--end-group" "/rust/lib/rustlib/mips64-unknown-linux-gnuabi64/lib/libcompiler_builtins-ddc5be726ac41261.rlib" "-Wl,-Bdynamic" "-l" "dl" "-l" "rt" "-l" "pthread" "-l" "gcc_s" "-l" "c" "-l" "m" "-l" "rt" "-l" "pthread" "-l" "util"
  = note: /usr/lib/gcc-cross/mips64-linux-gnuabi64/7/../../../../mips64-linux-gnuabi64/bin/ld: /checkout/target/mips64-unknown-linux-gnuabi64/release/deps/packed_simd-bc52a852fd902da6.packed_simd0.rcgu.o: linking mips:isa64r6 module with previous mips:isa64r2 modules
          /usr/lib/gcc-cross/mips64-linux-gnuabi64/7/../../../../mips64-linux-gnuabi64/bin/ld: /checkout/target/mips64-unknown-linux-gnuabi64/release/deps/packed_simd-bc52a852fd902da6.packed_simd0.rcgu.o: linking -mnan=2008 module with previous -mnan=legacy modules
          /usr/lib/gcc-cross/mips64-linux-gnuabi64/7/../../../../mips64-linux-gnuabi64/bin/ld: failed to merge target specific data of file /checkout/target/mips64-unknown-linux-gnuabi64/release/deps/packed_simd-bc52a852fd902da6.packed_simd0.rcgu.o
          /usr/lib/gcc-cross/mips64-linux-gnuabi64/7/../../../../mips64-linux-gnuabi64/bin/ld: /checkout/target/mips64-unknown-linux-gnuabi64/release/deps/packed_simd-bc52a852fd902da6.crate.allocator.rcgu.o: linking mips:isa64r6 module with previous mips:isa64r2 modules
          /usr/lib/gcc-cross/mips64-linux-gnuabi64/7/../../../../mips64-linux-gnuabi64/bin/ld: /checkout/target/mips64-unknown-linux-gnuabi64/release/deps/packed_simd-bc52a852fd902da6.crate.allocator.rcgu.o: linking -mnan=2008 module with previous -mnan=legacy modules
          /usr/lib/gcc-cross/mips64-linux-gnuabi64/7/../../../../mips64-linux-gnuabi64/bin/ld: failed to merge target specific data of file /checkout/target/mips64-unknown-linux-gnuabi64/release/deps/packed_simd-bc52a852fd902da6.crate.allocator.rcgu.o
          collect2: error: ld returned 1 exit status

error: aborting due to previous error
gnzlbg commented 6 years ago

@draganmladjenovic one thing I forgot mentioned is that as #41 shows, the LLVM error is only triggered if MSA is activated. Without MSA, the experimental reduction min/max intrinsics are lowered correctly.

draganmladjenovic commented 6 years ago

Huh this is messy. I'm guessing that you are targeting mips64r6 because qemu-mips64 supports MSA only for I6400. The problem is that mips64r6 is not compatible with previous ISA iterations an cannot be linked together. I guess you would need the something akin to mips64dsp2 target, named mips64msa or something. The nan2008 error is because mips64r6 is nan2008 by default. Speaking of nan2008, if remember correctly the qemu-mips for cpu P5600 won't accept non nan2008 binaries, so probably there will be issues with running mips32r5 MSA. I have a vague memory that kernel will allow legacy nan binaries to be run on P5600 under some compatibility option, so mips32r2, MSA binary should be a valid usecase.

Not sure what to do here. I guess that I could fork the qemu and provide changes needed. Is using prebuilt binaries a option. It will take some time before this changes end up in debian qemu package.

Without MSA, the experimental reduction min/max intrinsics are lowered correctly Yes it gets lowered to scalar instructions. I will update the llvm issue title to reflect that.

gnzlbg commented 6 years ago

I'm guessing that you are targeting mips64r6 because qemu-mips64 supports MSA only for I6400.

IIRC qemu-mips should support MSA with mips32r5, but in qemu-mips64 MSA is only available for mips64r6.

The problem is that mips64r6 is not compatible with previous ISA iterations an cannot be linked together.

I think that the problem here is that I am trying to link mips64r6 with mips64r2 modules. I thought that the solution would be something like passing some option to the linker to link against mips64r6 modules, or choosing a different cross compilation toolchain that provides a linker for mips64r6.

But from the rest of your comment, I think that I misunderstood the issue.

I guess that I could fork the qemu and provide changes needed

Which changes do you mean? I'd guess that if qemu supported MSA for mips64r5, then we wouldn't have the linker issues. But at some point we might want to test mips64r6 anyways, so resolving the linker issues will need to be done anyways?

draganmladjenovic commented 6 years ago

I think that the problem here is that I am trying to link mips64r6 with mips64r2 modules. I thought that the solution would be something like passing some option to the linker to link against mips64r6 modules, or choosing a different cross compilation toolchain that provides a linker for mips64r6.

No no that's it. I think you can install *-mipsisa64r6el-linux-gnuabi64, libc6-dev-mips64r6-cross etc. But in the end for it to work the whole std crate needs to be recompiled because mips64r2 is binary incompatible with mips64r6.

Which changes do you mean?

I meant adding support for mips64r5 with MSA (at first glance it looks easy).

But at some point we might want to test mips64r6 anyways, so resolving the linker issues will need to be done anyways?

I guess. I think that using mips64r6 requires a new target definition for rustc [mipsisa64r6{el}-linux-gnuabi64].

gnzlbg commented 6 years ago

But in the end for it it to work the whole std crate needs to be recompiled because mips64r2 is binary incompatible with mips64r6.

Ah! Crap. I see what you mean now.

I think that using mips64r6 requires a new target definition for rustc [mipsisa64r6{el}-linux-gnuabi64].

We should do this anyways. Do you want to open an issue about this in rust-lang/rust and cross-reference it from here, or shall I do that?

I meant adding support for mips64r5 with MSA (at first glance it looks easy).

That would be cool. I'd prefer for your patch to be merged in qemu upstream. I can then fetch qemu from source/nightly and use that in the Docker containers, and then we can switch to the latest ubuntu version in those when the changes make it into a release.

draganmladjenovic commented 6 years ago

We should do this anyways. Do you want to open an issue about this in rust-lang/rust and cross-reference it from here, or shall I do that?

I would be very grateful if you do that.

I'd prefer for your patch to be merged in qemu upstream.

Is using a github fork a valid stopgap option until changes get to qemu upstream?

gnzlbg commented 6 years ago

Is using a github fork a valid stopgap option until changes get to qemu upstream?

Yes, that would be fine. Ideally we would just upload a Docker container with the fork pre-compiled somewhere, so that we don't have to fetch it and compile it on every travis run, but that can be done.

I would be very grateful if you do that.

Done: https://github.com/rust-lang/rust/issues/52666

draganmladjenovic commented 6 years ago

Hi I've talked to some people about this issue.

Short story: There won't be any mips64r5 + MSA hardware. So for mips64 + MSA the only reasonable thing to test is mips64r6 + MSA.

For mips32 the story is a bit complicated. It turns out that NAN2008 (target-feature="+nan2008") is hard requirement for MSA hardware along with FR=1 (target-feature="+fp64"), albeit not enforced by the compiler. When object is build with nan2008, it the taged as such and it is link incompatible with non-nan2008 objects. (There where some plans to make this more lax, but nothing is accepted into upstream binutils so far).

The Debian doesn't ship the nan2008 mips cross toolchain nor it provides nan2008 package builds. I was able to test a mips32r5 + MSA board and they seem to usually run with kernel option which disables checking of nan2008 flag allowing prebuilt non-nan2008 RFS to be run on them.

So defing mips32r5 + MSA - NAN2008 target for QEMU is out of option, but the providing the option for qemu-usermode to disable nan2008 flag checking might be accepted upstream.

Otherwise we would need use custom mips32 target with rebuilt sysroot for nan2008.

gnzlbg commented 6 years ago

What would need to be done to support a mips64r6 target in rustc upstream, and to be able to test the resulting binaries in qemu ? Is there maybe a cross-compilation toolchain that we could use for that?

draganmladjenovic commented 6 years ago

Hi, It seems that libraries and qemu are already included in Debian. Unfortunately I wasn't able to find a package build for gcc-mipsisa64r6-linux-gnuabi64. Will IMG LINUX toolchain form https://www.mips.com/develop/tools/codescape-mips-sdk/download-codescape-mips-sdk-essentials/ do work for you? For rustc I'm hoping we would need no more than new target description.

gnzlbg commented 6 years ago

Yes, that toolchain should work. Is there a Dockerimage containing the toolchain somewhere? If not, we can probably just download it in the mips64 Dockerimages and use it.

draganmladjenovic commented 6 years ago

Hi @gnzlbg, No, I don't believe so.