jratcliff63367 / sse2neon

Automatically exported from code.google.com/p/sse2neon

Consider merging efforts with SIMDe project #9

Open nemequ opened 7 years ago

nemequ commented 7 years ago

I've been working on an eerily similar project, SIMDe, which is also MIT licensed, and is also an attempt to allow code written for one set of SIMD instructions to run on machines without them.

We're both working on implementing x86/x86_64 ISA extensions right now, but SIMDe is using portable fallbacks (with hints to encourage the compiler to vectorize what it can) instead of NEON intrinsics. I have been planning to create a NEON backend for SIMDe eventually, but so far I've been focusing on getting the portable version in place. Eventually I also intend to go in the other direction with SIMDe: NEON (and others) to SSE (and everything else).
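To make the difference between the two approaches concrete, here is a minimal sketch of what an `_mm_add_ps`-style function could look like with a NEON path and a portable fallback. The names `my_m128`, `my_mm_add_ps`, and `MY_VECTORIZE` are hypothetical and chosen for illustration only; this is not the actual code or API of either SIMDe or sse2neon.

```c
#include <stddef.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical hint macro: asks the compiler to vectorize the next loop
 * when OpenMP SIMD pragmas are enabled, and expands to nothing otherwise. */
#if defined(_OPENMP)
#define MY_VECTORIZE _Pragma("omp simd")
#else
#define MY_VECTORIZE
#endif

/* Hypothetical stand-in for __m128: four packed single-precision floats. */
typedef struct { float f32[4]; } my_m128;

/* Lane-wise addition, analogous in spirit to _mm_add_ps. */
static my_m128 my_mm_add_ps(my_m128 a, my_m128 b) {
  my_m128 r;
#if defined(__ARM_NEON)
  /* NEON path: load both operands, add, store the result. */
  vst1q_f32(r.f32, vaddq_f32(vld1q_f32(a.f32), vld1q_f32(b.f32)));
#else
  /* Portable path: a plain loop the compiler can often auto-vectorize. */
  MY_VECTORIZE
  for (size_t i = 0; i < 4; i++) {
    r.f32[i] = a.f32[i] + b.f32[i];
  }
#endif
  return r;
}
```

The portable branch compiles anywhere, which is why it makes sense to get it in place first; a target-specific branch like the NEON one can then be added per function without changing callers.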

I'm wondering if you would be interested in merging the two projects.

jratcliff63367 commented 7 years ago

> I'm wondering if you would be interested in merging the two projects.

Maybe? Not sure at this time. For now I'm using this code in several production projects and don't have bandwidth to make many, if any, changes right now. My main outstanding task has been to finish writing all of the remaining unit tests. Unfortunately this is a 'low priority' task, so it stays pushed way back in the backlog because I always have higher priority stuff to work on.

nemequ commented 7 years ago

> Maybe? Not sure at this time. For now I'm using this code in several production projects and don't have bandwidth to make many, if any, changes right now.

I've already finished merging sse2neon into SIMDe. The only functions I didn't merge are the _mm_shuffle* functions, but I don't think that's really an issue since SIMDe has implementations based on __builtin_shufflevector (clang) and __builtin_shuffle (GCC), and sse2neon's implementations depend on a GNU C extension anyways, so no portability is lost.
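For readers unfamiliar with those builtins, here is a rough sketch of how an `_mm_shuffle_ps`-style operation could be expressed with them. The names `v4f`, `v4i`, and `MY_SHUFFLE_PS` are hypothetical, and this is an illustration under those assumptions rather than SIMDe's actual implementation. The immediate selects two lanes from `a` (low four bits) and two lanes from `b` (high four bits), following the documented semantics of `_mm_shuffle_ps`.

```c
#include <stdint.h>

/* GNU C vector extension types (supported by both GCC and clang). */
typedef float   v4f __attribute__((vector_size(16)));
typedef int32_t v4i __attribute__((vector_size(16)));

#if defined(__clang__)
/* clang: indices must be integer constant expressions; 0..3 pick from a,
 * 4..7 pick from b. */
#define MY_SHUFFLE_PS(a, b, imm)                         \
  __builtin_shufflevector((a), (b),                      \
                          (imm) & 3,                     \
                          ((imm) >> 2) & 3,              \
                          (((imm) >> 4) & 3) + 4,        \
                          (((imm) >> 6) & 3) + 4)
#else
/* GCC: the selector is an integer vector; 0..3 pick from a, 4..7 from b. */
#define MY_SHUFFLE_PS(a, b, imm)                         \
  __builtin_shuffle((a), (b),                            \
                    (v4i){ (imm) & 3,                    \
                           ((imm) >> 2) & 3,             \
                           (((imm) >> 4) & 3) + 4,       \
                           (((imm) >> 6) & 3) + 4 })
#endif
```

For example, with `a = {0, 1, 2, 3}`, `b = {4, 5, 6, 7}`, and `imm = 0x1B`, the result is `{a[3], a[2], b[1], b[0]}`, that is `{3, 2, 5, 4}`. Both variants rely on compiler extensions, which is the portability trade-off mentioned above.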

I was also able to add NEON implementations of quite a few other functions after looking at the implementations of similar functions in sse2neon, and of course many other functions are available with portable implementations which can often be automatically vectorized by compilers. Currently MMX, SSE, SSE2, and SSE3 have complete implementations, plus a few functions from SSSE3 and SSE4.1.

> My main outstanding task has been to finish writing all of the remaining unit tests necessary. Unfortunately this is a 'low priority' task, so it stays pushed way back in the backlog because I always have higher priority stuff to work on.

SIMDe already has tests for almost everything. Some of the _mm_set* functions don't have them, nor do the fence instructions, but overall coverage is pretty good. I'm pretty sure all the functions with implementations merged from sse2neon have tests.

Basically, if you're not interested in (or don't have time for) continuing development of the concept, there isn't really anything to do other than perhaps pointing people to SIMDe as the successor to sse2neon. If, OTOH, you are interested, I hope you'll consider working on SIMDe instead; I'd be happy to give you write access to the repo.

If you'd rather keep maintaining sse2neon as a separate project, feel free to grab any code from SIMDe you're interested in. 91c5eb384a51d2490e454d39392f3d6365743de0 and 09799f86ae37e77c1c988754f13da5138ffa94ad are probably a good place to start, but that list will probably grow quite a bit as time goes on.