arduano / simdeez

easy simd
MIT License

Create SIMD sin/cos/log etc functions as in agner fog's vector libraries #17

Open jackmott opened 4 years ago

jackmott commented 4 years ago

I would like to recreate these functions in simdeez:
https://github.com/vectorclass/version2/blob/master/vectormath_trig.h
https://github.com/vectorclass/version2/blob/master/vectormath_hyp.h
https://github.com/vectorclass/version2/blob/master/vectormath_exp.h
https://github.com/vectorclass/version2/blob/master/vectormath_common.h

Such a porting effort could also be moved into a separate crate, or into other SIMD crates, so that others can easily have vectorized math functions available without using simdeez. This would be really valuable for the community.

I think vectormath_trig.h would be the most useful and make the most sense to port first. Help from someone with C++ experience would be good, since Agner uses lots of C++ features I'm not familiar with.
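
To make the general shape of such a port concrete, here is a minimal sketch of a lane-wise, branch-free sine. Everything in it is an assumption for illustration: the name `sin_x4` is hypothetical, plain `[f32; 4]` arrays stand in for a real SIMD register, and the polynomial is a plain Taylor series rather than the coefficients or range-reduction scheme used in vectormath_trig.h. It is only reasonable for inputs of moderate magnitude.

```rust
/// Hypothetical 4-lane sine; a plain array stands in for a SIMD register.
/// Not the vectorclass algorithm: a simple Taylor-series sketch.
fn sin_x4(x: [f32; 4]) -> [f32; 4] {
    use std::f32::consts::PI;
    let mut out = [0.0f32; 4];
    for i in 0..4 {
        // Range reduction: write x = n*pi + r with r in [-pi/2, pi/2].
        let n = (x[i] / PI).round();
        let r = x[i] - n * PI;
        // sin(n*pi + r) = (-1)^n * sin(r); compute the sign without branching.
        let sign = 1.0 - 2.0 * (n.abs() % 2.0);
        // Odd Taylor polynomial up to r^9, evaluated in Horner form.
        let r2 = r * r;
        let p = r
            * (1.0
                + r2 * (-1.0 / 6.0
                    + r2 * (1.0 / 120.0
                        + r2 * (-1.0 / 5040.0 + r2 * (1.0 / 362880.0)))));
        out[i] = sign * p;
    }
    out
}
```

Every lane runs the same instruction sequence with no data-dependent branches, which is the property that lets each step map onto a single SIMD instruction in a real port.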

jackmott commented 4 years ago

It looks like this crate does exactly what I am picturing: https://github.com/gnzlbg/sleef-sys

But it still needs AVX2 support added, which apparently cannot proceed until Rust adds long double to its FFI support.

I have a sleef branch of simdeez that uses it to expose a sin function

jackmott commented 4 years ago

As a stopgap for sleef AVX2 support, we can just do SSE4 operations twice.
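
A rough sketch of that stopgap, with assumptions stated up front: `sin_4lanes` is a hypothetical stand-in for whatever 4-lane (SSE-width) routine is available, implemented here with scalar `f32::sin` so the example is self-contained, and plain arrays stand in for `__m128`/`__m256` registers. The 8-lane version splits its input into two halves and runs the 4-lane routine on each.

```rust
/// Hypothetical stand-in for a 4-lane (SSE-width) sine routine.
fn sin_4lanes(x: [f32; 4]) -> [f32; 4] {
    [x[0].sin(), x[1].sin(), x[2].sin(), x[3].sin()]
}

/// 8-lane (AVX2-width) sine built by running the 4-lane routine twice.
fn sin_8lanes(x: [f32; 8]) -> [f32; 8] {
    // Split the wide vector into its low and high halves.
    let mut lo = [0.0f32; 4];
    let mut hi = [0.0f32; 4];
    lo.copy_from_slice(&x[..4]);
    hi.copy_from_slice(&x[4..]);

    // Apply the narrow operation once per half.
    let (lo, hi) = (sin_4lanes(lo), sin_4lanes(hi));

    // Join the two results back into one wide vector.
    let mut out = [0.0f32; 8];
    out[..4].copy_from_slice(&lo);
    out[4..].copy_from_slice(&hi);
    out
}
```

This keeps the 8-lane interface stable for callers, so a native 8-lane implementation could replace the body later without changing call sites.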

Lokathor commented 4 years ago

Partial start here: https://github.com/Lokathor/wide/pull/13

> so that other can easily just have vectorized maths functions available, without using simd.

I don't know what "vectorized without SIMD" is supposed to mean...

jackmott commented 4 years ago

@Lokathor typo! fixed. I meant without using simdeez, my lib.

Lokathor commented 4 years ago

ah! makes more sense.

Anyway, there's much more work to be done (only sin/cos/tan right now), and what I focus on in the lib is mostly driven by whatever the ultraviolet dev says they need, so f32x4 is where all the energy goes at the moment.

I think I'll probably release 0.3 as is once I've given it a second look over. If you'd like to use wide as a dependency, that's cool. If you'd like to contribute more converted functions, that's appreciated; just mention which ones you're starting on if you take on a whole group of them, so we don't accidentally duplicate a lot of work.

greatest-ape commented 4 years ago

> as a stop gap for sleef avx2 support we can just do sse4 operations twice

I actually got it working, see https://github.com/gnzlbg/sleef-sys/issues/16#issuecomment-590086460.