rust-lang / portable-simd

The testing ground for the future of portable SIMD in Rust
Apache License 2.0

Fast round #421

Open NamorNiradnug opened 5 months ago

NamorNiradnug commented 5 months ago

The .round() function is very slow compared to the platform-native intrinsic on AVX (https://godbolt.org/z/3sdd9jrvW) because it provides platform-agnostic behavior. However, there are many use cases where the exact behavior on halfway values, infinities, or NaNs doesn't matter.

I think adding something like a round_fast function would be reasonable.
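
To make the halfway-value behavior concrete, here is a minimal scalar sketch comparing the two tie-breaking rules in question. The helper name round_both is hypothetical and used only for illustration. It assumes Rust 1.77 or later, where f32::round_ties_even is available on scalar floats.

```rust
/// Hypothetical helper for illustration only: returns the input rounded
/// with `round` (ties away from zero) and with `round_ties_even`
/// (ties to the nearest even integer).
pub fn round_both(x: f32) -> (f32, f32) {
    (x.round(), x.round_ties_even())
}
```

The two results differ only on exact halfway values such as 0.5 or 2.5, which is why code that does not care about ties can pick whichever rule is cheaper on the target.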

programmerjake commented 5 months ago

what you want is usually round_ties_even (not yet available on Simd), since that usually compiles to a single instruction

e.g.: https://godbolt.org/z/Tb8xvzqo7
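
As a rough sketch of the idea (not necessarily what the linked example does), the scalar round_ties_even can be applied lane by lane to a plain array. The function name round_ties_even_lanes is hypothetical and not part of portable-simd, and plain arrays are used here so the code builds on stable Rust 1.77 or later. Whether the loop is lowered to a single vector instruction depends on the compiler and the enabled target features, which this sketch does not verify.

```rust
/// Hypothetical helper (not part of portable-simd): rounds every lane to the
/// nearest integer, breaking ties toward the even neighbor.
pub fn round_ties_even_lanes<const N: usize>(v: [f32; N]) -> [f32; N] {
    // `f32::round_ties_even` is the scalar method from the standard library;
    // any vectorization of this per-lane map is left to the optimizer.
    v.map(f32::round_ties_even)
}
```

For example, calling it on [0.5, 1.5, 2.5, -2.5] yields [0.0, 2.0, 2.0, -2.0].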

NamorNiradnug commented 5 months ago

> what you want is usually round_ties_even (not yet available on Simd), since that usually compiles to a single instruction

Still, there may be platforms where that's not the case, or where such an instruction is slower than another rounding instruction.

That said, at least NEON, AVX, and SSE (from SSE4.1) all have round_ties_even instructions.

NamorNiradnug commented 5 months ago

> e.g.: https://godbolt.org/z/Tb8xvzqo7

Thanks for the workaround!