Lokathor / wide

A crate to help you go wide. By which I mean use SIMD stuff.
https://docs.rs/wide
zlib License

Add aarch64 support along with emulator build as action #124

Closed mcroomp closed 1 year ago

mcroomp commented 1 year ago

Added neon intrinsics support for aarch64, along with emulator for testing.

armv7 is not supported at this point, since neon on that target is still unstable.

Lokathor commented 1 year ago

This looks good so far. My main comment would be that newly added functions/methods should generally be marked as #[inline] and also #[must_use].

mcroomp commented 1 year ago

You mean the Default and PartialEq trait impls? Other than that, I don't think I added any functions.

Lokathor commented 1 year ago

Yep, even trait methods need to be specifically marked with #[inline] to make them eligible for inlining across codegen units when there are no generics.
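
As a rough illustration of the point above, here is a minimal sketch of what the attributes could look like. `I16x8Sketch` and its `lanes` field are hypothetical stand-ins, not the crate's actual type. The trait methods only get #[inline], because the Rust reference says #[must_use] does nothing on a function in a trait implementation; the inherent methods get both.

```rust
// Hypothetical stand-in type for illustration only; not the crate's definition.
#[derive(Clone, Copy, Debug)]
pub struct I16x8Sketch {
    lanes: [i16; 8],
}

impl I16x8Sketch {
    // Inherent methods take both attributes.
    #[inline]
    #[must_use]
    pub fn splat(value: i16) -> Self {
        Self { lanes: [value; 8] }
    }

    #[inline]
    #[must_use]
    pub fn to_array(self) -> [i16; 8] {
        self.lanes
    }
}

impl Default for I16x8Sketch {
    // Non-generic trait methods need an explicit #[inline] to be
    // candidates for inlining into other crates.
    #[inline]
    fn default() -> Self {
        Self { lanes: [0; 8] }
    }
}

impl PartialEq for I16x8Sketch {
    #[inline]
    fn eq(&self, other: &Self) -> bool {
        self.lanes == other.lanes
    }
}
```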

mcroomp commented 1 year ago

> Yep, even trait methods need to be specifically marked with #[inline] to make them eligible for inlining across codegen units when there are no generics.

I wonder about implementing the 256-bit versions... do you think it might make more sense for the default implementation of the 256-bit version (e.g. i16x16) to just delegate to two i16x8? The only case where it makes sense to specialize these would be AVX2; SSE2 and scalar compile to exactly the same code.

Lokathor commented 1 year ago

Yeah that makes sense.
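
For readers following along, the delegation idea could look roughly like the sketch below. The types `I16x8Sketch` and `I16x16Sketch`, the field names `a` and `b`, and the scalar lane loop are all assumptions made for illustration; the crate's real layout and backends may differ.

```rust
use core::ops::Add;

// Hypothetical 128-bit type; a scalar loop stands in for the
// SSE2 / neon / scalar backends.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
pub struct I16x8Sketch {
    lanes: [i16; 8],
}

impl Add for I16x8Sketch {
    type Output = Self;
    #[inline]
    fn add(self, rhs: Self) -> Self {
        let mut out = [0i16; 8];
        for i in 0..8 {
            // Wrapping add, matching what integer SIMD adds do on overflow.
            out[i] = self.lanes[i].wrapping_add(rhs.lanes[i]);
        }
        Self { lanes: out }
    }
}

// Hypothetical 256-bit type built from two 128-bit halves.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
pub struct I16x16Sketch {
    a: I16x8Sketch,
    b: I16x8Sketch,
}

impl Add for I16x16Sketch {
    type Output = Self;
    #[inline]
    fn add(self, rhs: Self) -> Self {
        // Default path: delegate to the two halves. Only an AVX2
        // build would want a specialized version of this.
        Self { a: self.a + rhs.a, b: self.b + rhs.b }
    }
}

impl I16x16Sketch {
    #[inline]
    #[must_use]
    pub fn new(lanes: [i16; 16]) -> Self {
        let mut lo = [0i16; 8];
        let mut hi = [0i16; 8];
        lo.copy_from_slice(&lanes[..8]);
        hi.copy_from_slice(&lanes[8..]);
        Self { a: I16x8Sketch { lanes: lo }, b: I16x8Sketch { lanes: hi } }
    }

    #[inline]
    #[must_use]
    pub fn to_array(self) -> [i16; 16] {
        let mut out = [0i16; 16];
        out[..8].copy_from_slice(&self.a.lanes);
        out[8..].copy_from_slice(&self.b.lanes);
        out
    }
}
```

With this shape, every operation on the wide type is written once in terms of the narrow type, so a new 128-bit backend would be picked up by the 256-bit type without extra code.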

mcroomp commented 1 year ago

I think this is good to go now. I merged some conflicts with the previous change and also made it only use neon on aarch64. armv7 also supports neon, but enabling it there causes compile issues since it is unstable.
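
A minimal sketch of gating the neon path to aarch64 only is shown below, with a scalar fallback for every other target (including armv7). The module name `imp`, the `BACKEND` constant, and `add_i16x8` are hypothetical; the exact cfg conditions used in the PR may differ. The intrinsics `vld1q_s16`, `vaddq_s16`, and `vst1q_s16` come from `core::arch::aarch64`.

```rust
// neon path: compiled only on aarch64 with neon statically enabled.
#[cfg(all(target_arch = "aarch64", target_feature = "neon"))]
mod imp {
    use core::arch::aarch64::{vaddq_s16, vld1q_s16, vst1q_s16};

    pub const BACKEND: &str = "neon";

    #[inline]
    pub fn add(a: [i16; 8], b: [i16; 8]) -> [i16; 8] {
        let mut out = [0i16; 8];
        // SAFETY: neon is enabled by the cfg above, and each pointer
        // refers to eight valid, initialized i16 values.
        unsafe {
            let sum = vaddq_s16(vld1q_s16(a.as_ptr()), vld1q_s16(b.as_ptr()));
            vst1q_s16(out.as_mut_ptr(), sum);
        }
        out
    }
}

// Fallback path: every other target, including armv7.
#[cfg(not(all(target_arch = "aarch64", target_feature = "neon")))]
mod imp {
    pub const BACKEND: &str = "scalar";

    #[inline]
    pub fn add(a: [i16; 8], b: [i16; 8]) -> [i16; 8] {
        let mut out = [0i16; 8];
        for i in 0..8 {
            out[i] = a[i].wrapping_add(b[i]);
        }
        out
    }
}

/// Lane-wise wrapping add using whichever backend was compiled in.
#[inline]
#[must_use]
pub fn add_i16x8(a: [i16; 8], b: [i16; 8]) -> [i16; 8] {
    imp::add(a, b)
}

/// Name of the backend selected at compile time (hypothetical helper).
#[inline]
#[must_use]
pub fn backend() -> &'static str {
    imp::BACKEND
}
```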

mcroomp commented 1 year ago

Thank you! Let me know when you publish the crate... can't wait to start using the new features :)

Lokathor commented 1 year ago

I think we'll wait a little longer to sort out https://github.com/Lokathor/wide/pull/127 and then things should be fine to publish.