simd-everywhere / simde

Implementations of SIMD instruction sets for systems which don't natively support them.
https://simd-everywhere.github.io/blog/
MIT License

AltiVec implementations for SSE #74

Open nemequ opened 4 years ago

nemequ commented 4 years ago

I'm trying to get access to a VPS for developing AltiVec implementations, and SSE is first on the list. Even if I don't get access to the VPS, it should be possible (though somewhat painful) to do it using QEMU, since we can now verify the implementations on Travis CI.

I'm hoping it will also provide a better feel for the AltiVec API and whether it is reasonable to implement a portable variant. Unfortunately, the API requires some language-level support (i.e., the vector keyword), but if people are willing to tweak their code a bit it may be possible to provide something.

nemequ commented 4 years ago

I've added the necessary infrastructure and implemented the first function (simde_mm_add_ps).

Implementing additional functions should be pretty straightforward. For example, here are the changes which were necessary to simde_mm_add_ps:

diff --git a/simde/x86/sse.h b/simde/x86/sse.h
index f532373..b1fbbc4 100644
--- a/simde/x86/sse.h
+++ b/simde/x86/sse.h
@@ -222,6 +222,8 @@ simde_mm_add_ps (simde__m128 a, simde__m128 b) {
   r_.neon_f32 = vaddq_f32(a_.neon_f32, b_.neon_f32);
 #elif defined(SIMDE_SSE_WASM_SIMD128)
   r_.wasm_v128 = wasm_f32x4_add(a_.wasm_v128, b_.wasm_v128);
+#elif defined(SIMDE_SSE_POWER_ALTIVEC)
+  r_.altivec_f32 = vec_add(a_.altivec_f32, b_.altivec_f32);
 #elif defined(SIMDE_VECTOR_SUBSCRIPT_OPS)
   r_.f32 = a_.f32 + b_.f32;
 #else
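For anyone who wants to try the pattern outside the SIMDe tree, here is a minimal standalone sketch of the same idea: a union that exposes an AltiVec view of the data when the compiler targets AltiVec, plus a scalar fallback otherwise. The names example_m128 and example_add_ps are hypothetical stand-ins, not SIMDe's actual types, and the check on __ALTIVEC__ is an assumption standing in for SIMDE_SSE_POWER_ALTIVEC; the real header has more branches (NEON, WASM, vector subscript ops) than shown here.

```c
#include <stddef.h>

#if defined(__ALTIVEC__)
#  include <altivec.h>
#endif

/* Hypothetical stand-in for SIMDe's private union; not the real type. */
typedef union {
  float f32[4];
#if defined(__ALTIVEC__)
  /* __vector avoids relying on the context-sensitive `vector` keyword. */
  __vector float altivec_f32;
#endif
} example_m128;

/* Adds four packed floats, using vec_add when AltiVec is available. */
static example_m128 example_add_ps(example_m128 a, example_m128 b) {
  example_m128 r;
#if defined(__ALTIVEC__)
  r.altivec_f32 = vec_add(a.altivec_f32, b.altivec_f32);
#else
  /* Portable fallback: plain element-wise addition. */
  for (size_t i = 0; i < 4; i++) {
    r.f32[i] = a.f32[i] + b.f32[i];
  }
#endif
  return r;
}
```

On a non-POWER machine this compiles to the scalar loop, so the same source can be checked locally before running it under QEMU or on real hardware.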