arduano / simdeez

easy simd
MIT License
332 stars 25 forks source link

Compile only scalar fallbacks on unsupported targets #29

Open FlorianUekermann opened 4 years ago

FlorianUekermann commented 4 years ago

It is a bit hard to rely on this crate (and by extension the noise crate) in cross-platform libraries, because while scalar fallbacks exist, it still tries to compile the simd implementations for non-x86/amd64 targets. An example error is:

/xxx/.cargo/registry/src/github.com-1ecc6299db9ec823/simdeez-1.0.0/src/sse41.rs:644:26
    |
644 |                 I32x4_41(_mm_srli_epi32(a.0, $amt_const))
    |                          ^^^^^^^^^^^^^^ not found in this scope
...
647 |         constify_imm8!(amt_const, call)
    |         ------------------------------- in this macro invocation
    |
jackmott commented 4 years ago

can you try the current master branch and see if this is still an issue?

FlorianUekermann commented 4 years ago

I tried d66564f0f3 for a few popular targets and some worked, some didn't. Try rustup toolchain install nightly -t $T && cargo build --target $T with T=i686-unknown-linux-gnu or T=wasm32-unknown-unknown and you'll get different build errors. Maybe it would be good to implement/document a way to get scalar only builds by specifying some feature, or using default-features = false, so people can continue using the library even if some platform is broken or not yet supported.

jackmott commented 4 years ago

Ok I fixed up the i686 case, and it should fix most cases now since I'm doing the conditional includes based on target_feature now. wasm is broken because it just isn't done yet.

jackmott commented 4 years ago

Hmm, actually what I did screws up other things. I think your are right it may be necessary to manually turn features on and off.

FlorianUekermann commented 4 years ago

Does it? I was just about to close this. In any case, I guess adding

default = ["scalar","sse2", "sse41", "avx2"]
scalar = []
sse2 = []
sse41 = []
avx2 = []

to the features section and gating the implementations with something like #[cfg(all(feature="sse2", target_feature="sse2"))] can't hurt.

Looks like it allows reducing the compiled size quite a bit as well. Not sure if that is interesting.

Thanks for the great work btw. Looking forward to the next release of simdeez and simdnoise (especially if it includes #20).

jackmott commented 4 years ago

Yeah the problems comes up with runtime selection. Say you want a build that will run on any x86 machine, but will use AVX2 if available. Then then AVX2 feature won't exist, but you need to build that AVX2 code. If you build for an arch that has AVX2, then it won't run on any x86 machine any more, because llvm will use some avx2 instructions all over, not just in the functions we tag.

FlorianUekermann commented 4 years ago

I see, yes makes sense. So the "correct" solution is a slightly more complex architecture whitelist per extension? What was the underlying reason for the i686 build issue?

jackmott commented 4 years ago

Right, I think the problem is that the architecture flags (x86, x86_64 etc) aren't enough to tell you if SSE2 vs SSE4.1 etc are available. So probably need to make our own feature options.