Lokathor / wide

A crate to help you go wide. By which I mean use SIMD stuff.
https://docs.rs/wide
zlib License

NEON instructions on `aarch64`? #115

Closed torokati44 closed 1 year ago

torokati44 commented 1 year ago

I know the README says this:

> ... and on other architectures this is done by carefully writing functions so that LLVM hopefully does the right thing

But what magic incantation do I have to yell at it to make it actually vectorize my code written using wide? :/ Is NEON never emitted by rustc unless directly using the arch intrinsics?
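For reference, the "directly using the arch intrinsics" route could look roughly like the sketch below. This is not code from `wide`, and the function name `add4` is made up for illustration. It assumes the stable `std::arch::aarch64` intrinsics `vld1q_f32`, `vaddq_f32`, and `vst1q_f32`, and falls back to plain scalar code on every other target.

```rust
// Hypothetical helper: adds two 4-lane f32 vectors.
// On aarch64 with NEON enabled it uses explicit NEON intrinsics.
#[cfg(all(target_arch = "aarch64", target_feature = "neon"))]
pub fn add4(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    use std::arch::aarch64::{vaddq_f32, vld1q_f32, vst1q_f32};

    let mut out = [0.0f32; 4];
    // SAFETY: each pointer refers to exactly four valid, initialized f32
    // values, and this function only exists when NEON is enabled.
    unsafe {
        let va = vld1q_f32(a.as_ptr());
        let vb = vld1q_f32(b.as_ptr());
        vst1q_f32(out.as_mut_ptr(), vaddq_f32(va, vb));
    }
    out
}

// Scalar fallback for every other target. Whether the compiler turns
// this into SIMD instructions is entirely up to LLVM.
#[cfg(not(all(target_arch = "aarch64", target_feature = "neon")))]
pub fn add4(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    [a[0] + b[0], a[1] + b[1], a[2] + b[2], a[3] + b[3]]
}
```

Both versions have the same signature, so callers don't need to care which one was compiled in.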

Lokathor commented 1 year ago

https://rust-lang.github.io/packed_simd/perf-guide/target-feature/rustflags.html#target-feature

"+neon" should do it

torokati44 commented 1 year ago

Yeah, thanks, I knew about that! And sorry, I should have mentioned that right at the beginning... The thing is, the neon target feature is actually enabled by default on aarch64-linux-android:

$ rustc --target aarch64-linux-android --print cfg
[...]
target_feature="neon"
target_feature="pmuv3"
[...]

So adding that flag won't do anything.
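The same information can be checked from inside a program, which may be handy when it's unclear what a given build was actually compiled with. A minimal sketch, assuming only the standard `cfg!` macro and `std::env::consts::ARCH`; the function names are hypothetical:

```rust
/// True when NEON was enabled for this build, i.e. the same thing
/// `rustc --print cfg` reports as `target_feature="neon"`.
pub const fn neon_enabled_at_compile_time() -> bool {
    cfg!(target_feature = "neon")
}

/// Human-readable summary, e.g. for logging at startup.
pub fn describe_neon() -> String {
    format!(
        "arch = {}, neon enabled at compile time = {}",
        std::env::consts::ARCH,
        neon_enabled_at_compile_time()
    )
}
```

Note that this only says whether the compiler is allowed to emit NEON, not whether it actually vectorized any particular loop.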

But even if I do add it, I get autovectorization on x86_64, but not on aarch64: https://godbolt.org/z/91TrexsvP

Same thing with C++, with either Clang or GCC: https://godbolt.org/z/8qqfjWd7f

Lokathor commented 1 year ago

I'm not sure much else can be done right now unfortunately.

Other than, like, joining the portable SIMD working group and helping them get that API to stable faster.

torokati44 commented 1 year ago

Yeah, that's one way, I guess... :/

But really, I shared your hope that LLVM would take care of it on its own, and I find it really weird that it does not, given how popular and important aarch64 is becoming...

mcroomp commented 1 year ago

aarch64 neon support is now in version 0.7.7

Lokathor commented 1 year ago

Actually, the main branch's Cargo.toml version doesn't get bumped until the end of development, so the version with NEON support is 0.7.8, not 0.7.7.

But I released wide-0.7.8 just now, so it should be available by the time anyone reads this.