bluss / matrixmultiply

General matrix multiplication of f32 and f64 matrices in Rust. Supports matrices with general strides.
https://docs.rs/matrixmultiply/
Apache License 2.0

Align mask buffer pointer manually #56

Closed bluss closed 3 years ago

bluss commented 3 years ago

New in 0.3.0: the masking buffer lives in thread-local storage (TLS), and we use #[repr(align(32))] to align it to 32 bytes. It seems one of those choices didn't work out.

So we now manually align the pointer into the buffer anyway, because of bugs seen on certain platforms (macOS) where it looks like we don't get aligned allocations out of TLS (?).
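For readers unfamiliar with the technique, here is a minimal sketch of manually aligning a pointer into a thread-local buffer. It is not the code from this PR: the names MaskBuffer, MASK_BUF, MASK_BUF_LEN, align_up and with_aligned_mask_buf, as well as the buffer size, are hypothetical and chosen only for illustration. The idea is to over-allocate by ALIGN - 1 bytes and round the base pointer up, so the result is aligned even if the platform ignores the requested alignment.

```rust
use std::cell::UnsafeCell;

// Required alignment in bytes (the PR mentions 32-byte alignment).
const ALIGN: usize = 32;
// Hypothetical usable size of the buffer in bytes (an assumption, not from the PR).
const MASK_BUF_LEN: usize = 256;

// The attribute requests alignment, but the sketch does not rely on it.
#[repr(align(32))]
struct MaskBuffer {
    // Over-allocate by ALIGN - 1 bytes so that an aligned window of
    // MASK_BUF_LEN bytes always fits, wherever the storage ends up.
    bytes: [u8; MASK_BUF_LEN + ALIGN - 1],
}

thread_local! {
    static MASK_BUF: UnsafeCell<MaskBuffer> =
        UnsafeCell::new(MaskBuffer { bytes: [0; MASK_BUF_LEN + ALIGN - 1] });
}

/// Round `ptr` up to the next multiple of `align`, which must be a power of two.
fn align_up(ptr: *mut u8, align: usize) -> *mut u8 {
    debug_assert!(align.is_power_of_two());
    let addr = ptr as usize;
    // Number of bytes to the next aligned address (0 if already aligned).
    let offset = addr.wrapping_neg() & (align - 1);
    ptr.wrapping_add(offset)
}

/// Call `f` with an ALIGN-aligned pointer to MASK_BUF_LEN bytes of
/// thread-local scratch space. The pointer must not escape the closure.
fn with_aligned_mask_buf<R>(f: impl FnOnce(*mut u8, usize) -> R) -> R {
    MASK_BUF.with(|cell| {
        let base = cell.get() as *mut u8;
        let aligned = align_up(base, ALIGN);
        f(aligned, MASK_BUF_LEN)
    })
}
```

A caller would use it as with_aligned_mask_buf(|ptr, len| { ... }), and could keep a debug_assert_eq!(ptr as usize % ALIGN, 0) inside the closure as a cheap sanity check. The cost of this approach is at most ALIGN - 1 wasted bytes per thread.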

Fixes #55 (Hopefully)

This removes the debug assertion we are hitting in #55, because the alignment is now done explicitly here.

Partially reverts the previous commit 2ddd0ba04b8ff0f2f82fd3cf6459955fdd536be5

bluss commented 3 years ago

Thanks to @oracleofnj for confirming that the fix works there! https://github.com/rust-ndarray/ndarray/issues/960#issuecomment-816223287