simd-everywhere / simde

Implementations of SIMD instruction sets for systems which don't natively support them.
https://simd-everywhere.github.io/blog/
MIT License

_mm_malloc and _mm_free (feature request) #940

Open jeffhammond opened 2 years ago

jeffhammond commented 2 years ago

Many codes that use SSE and friends also use _mm_{malloc,realloc,free}.

SSE2Neon supports these (https://github.com/DLTcollab/sse2neon/pull/25/files).

The implementation is simple and I will try to contribute if no one else does it first.

Thanks.

thomasdwu commented 1 year ago

I was wondering whether there has been any progress on this issue. I am trying to port a package from Intel to Arm; it can compile with either SSE, AVX2, or AVX512 instructions, depending on the machine. Can I simply replace _mm_malloc() with posix_memalign()?

jeffhammond commented 1 year ago

Yes. https://stackoverflow.com/q/32612881/2189128 has some context.

jeffhammond commented 1 year ago

You'll have to replace _mm_free as well, but just with free().
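
A minimal sketch of that replacement, assuming a POSIX system. The names compat_mm_malloc and compat_mm_free are hypothetical and are not part of SIMDe or SSE2Neon. Note that _mm_malloc() takes (size, alignment) while posix_memalign() takes (&ptr, alignment, size), and that small alignments are rounded up here because posix_memalign() rejects values below sizeof(void *):

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for _mm_malloc(size, align); align is in bytes. */
static void *compat_mm_malloc(size_t size, size_t align)
{
    void *ptr = NULL;

    /* Like _mm_malloc, this expects a power-of-two alignment. */
    assert(align != 0 && (align & (align - 1)) == 0);

    /* posix_memalign needs a power-of-two multiple of sizeof(void *),
       so round small alignments (1, 2, 4) up to that minimum. */
    if (align < sizeof(void *))
        align = sizeof(void *);

    /* posix_memalign returns 0 on success and an error number otherwise. */
    if (posix_memalign(&ptr, align, size) != 0)
        return NULL;
    return ptr;
}

/* Hypothetical stand-in for _mm_free: posix_memalign memory is released with free(). */
static void compat_mm_free(void *ptr)
{
    free(ptr);
}
```

A call such as compat_mm_malloc(n * sizeof(float), 64) would then take the place of _mm_malloc(n * sizeof(float), 64), with compat_mm_free() replacing _mm_free().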

thomasdwu commented 1 year ago

Got it, thanks. I also see that posix_memalign() takes its argument in terms of bits, whereas _mm_malloc() takes it in terms of bytes.

jeffhammond commented 1 year ago

The alignment argument is in bytes from what I can tell.

"The value of alignment shall be a power of two multiple of sizeof(void *)."

https://pubs.opengroup.org/onlinepubs/9699919799/functions/posix_memalign.html
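
To make the bytes-versus-bits point concrete, here is a small sketch of the alignment rule quoted above. The helper names are hypothetical, and the check assumes sizeof(void *) is itself a power of two, which holds on common platforms:

```c
#include <stddef.h>

/* Hypothetical helper: convert a vector register width in bits to bytes,
   e.g. 128-bit SSE -> 16, 256-bit AVX2 -> 32, 512-bit AVX-512 -> 64. */
static size_t vector_bits_to_bytes(size_t bits)
{
    return bits / 8;
}

/* Hypothetical helper: nonzero if align (in bytes) satisfies the
   posix_memalign requirement of a power-of-two multiple of sizeof(void *). */
static int is_valid_posix_alignment(size_t align)
{
    int is_power_of_two = align != 0 && (align & (align - 1)) == 0;
    return is_power_of_two && (align % sizeof(void *)) == 0;
}
```

Under these assumptions, a 512-bit vector type would be allocated with an alignment of 64, not 512, and a value such as 24 is rejected even though it is a multiple of sizeof(void *) on 64-bit systems, because it is not a power of two.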

thomasdwu commented 1 year ago

Yes, you're right. Somehow, I was mistakenly thinking that addresses were in terms of bits.