Open jackmott opened 4 years ago
It looks like this crate does exactly what I am picturing: https://github.com/gnzlbg/sleef-sys
But it still needs AVX2 support added, which apparently cannot proceed until Rust adds long double to its FFI support.
I have a sleef branch of simdeez that uses it to expose a sin function
as a stop gap for sleef avx2 support we can just do sse4 operations twice
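To make the stopgap concrete, here is a minimal sketch of building an 8-lane operation from two 4-lane calls. It uses plain arrays so it runs anywhere. `sin_x4` and `sin_x8` are hypothetical names, and `sin_x4` is only a stand-in that calls the scalar `f32::sin` per lane. It is not the sleef-sys or simdeez API.

```rust
/// Hypothetical stand-in for a 4-lane (SSE-width) sine, such as one a
/// sleef binding might provide. Assumption: here it simply calls the
/// scalar `f32::sin` on each lane.
fn sin_x4(v: [f32; 4]) -> [f32; 4] {
    [v[0].sin(), v[1].sin(), v[2].sin(), v[3].sin()]
}

/// 8-lane (AVX2-width) sine built by running the 4-lane version twice,
/// once on the low half and once on the high half.
fn sin_x8(v: [f32; 8]) -> [f32; 8] {
    let lo = sin_x4([v[0], v[1], v[2], v[3]]);
    let hi = sin_x4([v[4], v[5], v[6], v[7]]);
    // Stitch the two 4-lane results back into one 8-lane result.
    [lo[0], lo[1], lo[2], lo[3], hi[0], hi[1], hi[2], hi[3]]
}
```

With real intrinsics the split and join would be register extract/insert operations rather than array indexing, but the shape of the workaround is the same.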
Partial start here: https://github.com/Lokathor/wide/pull/13
> so that other can easily just have vectorized maths functions available, without using simd.
I don't know what "vectorized without SIMD" is supposed to mean...
@Lokathor typo! fixed. I meant without using simdeez, my lib.
ah! makes more sense.
Anyway, there's much more work to be done (only sin/cos/tan right now). What I focus on in the lib is mostly driven by whatever the ultraviolet dev says they need, so f32x4 is where all the energy goes at the moment.
I think I'll probably release 0.3 as is once I've given it a second look over. If you'd like to use wide as a dependency, that's cool. If you'd like to contribute more converted functions, that's appreciated; just mention which ones you're starting on if you start doing a whole group of them, so we don't accidentally duplicate a lot of work.
> as a stop gap for sleef avx2 support we can just do sse4 operations twice
I actually got it working, see https://github.com/gnzlbg/sleef-sys/issues/16#issuecomment-590086460.
I would like to recreate these functions in simdeez: https://github.com/vectorclass/version2/blob/master/vectormath_trig.h https://github.com/vectorclass/version2/blob/master/vectormath_hyp.h https://github.com/vectorclass/version2/blob/master/vectormath_exp.h https://github.com/vectorclass/version2/blob/master/vectormath_common.h
Such a porting effort could also be moved into a separate crate, or into other SIMD crates, so that others can easily have vectorized maths functions available without using this simdeez lib. This would be a really valuable thing for the community.
I think trig.h would be the most useful and make the most sense to port first. Someone with C++ experience helping with this would be good, since Agner uses lots of C++ features I'm not familiar with.
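For a rough idea of the shape such a ported function might take, here is a minimal sketch of a 4-lane sine using range reduction plus a polynomial. This is an assumption-heavy illustration: it uses a plain Taylor polynomial and a single-constant reduction by pi, not the coefficients or the reduction scheme from vectormath_trig.h, and `sin_x4_poly` is a hypothetical name. It is written lane by lane on arrays and avoids data-dependent branches, which is how a real SIMD version would need to be structured.

```rust
use std::f32::consts::PI;

/// Hypothetical 4-lane sine: reduce x to r in [-pi/2, pi/2] with
/// x = k*pi + r, then use sin(x) = (-1)^k * sin(r).
/// Assumption: a degree-11 Taylor polynomial stands in for the tuned
/// coefficients a real port would use. Accuracy degrades for large |x|.
fn sin_x4_poly(x: [f32; 4]) -> [f32; 4] {
    let mut out = [0.0f32; 4];
    for i in 0..4 {
        // Nearest multiple of pi.
        let k = (x[i] / PI).round();
        let r = x[i] - k * PI;
        let r2 = r * r;
        // Horner evaluation of r - r^3/3! + r^5/5! - ... - r^11/11!.
        let p = r
            * (1.0
                + r2 * (-1.0 / 6.0
                    + r2 * (1.0 / 120.0
                        + r2 * (-1.0 / 5040.0
                            + r2 * (1.0 / 362_880.0 + r2 * (-1.0 / 39_916_800.0))))));
        // Branch-free sign flip: +1 for even k, -1 for odd k.
        let sign = 1.0 - 2.0 * (((k as i32) & 1) as f32);
        out[i] = sign * p;
    }
    out
}
```

In a real port the per-lane loop body would become vector operations (multiply, add, round, and a sign-bit mask), with each arithmetic step mapping onto one SIMD instruction.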