Closed groscoe2 closed 1 year ago
I left some comments. Overall I think this is great. Thanks for putting in the work to make this change. It's elegant, it's clearly more sound than the previous implementation, and it doesn't impose any extra boilerplate on butterfly implementations. I only wish I had thought of doing it this way.
ooooo looks like we fail the MSRV 1.37 test.
I don't think that's an issue, our policy allows us to upgrade as long as it's not an upgrade to a very recent version. But we will need to decide what to upgrade to, and update the CI config, before we can mrge this.
Looks like the initial rust const generics release was 1.51, so we oculd upgrade it to that. Alternatively, we could say this is a good time to unify things, and upgrade everything to the rust version that neon requires, which is iirc rustc 1.63.
I left some comments. Overall I think this is great. Thanks for putting in the work to make this change. It's elegant, it's clearly more sound than the previous implementation, and it doesn't impose any extra boilerplate on butterfly implementations. I only wish I had thought of doing it this way.
Thank you for the comments! I believe I have addressed them all with the latest commits.
Looks like the initial rust const generics release was 1.51, so we oculd upgrade it to that. Alternatively, we could say this is a good time to unify things, and upgrade everything to the rust version that neon requires, which is iirc rustc 1.63.
I'm good with either of those options! We could also try to work around the const generics usage since there should be a reasonably straightforward way to coerce the arrays into mutable slices instead.
Looks like the initial rust const generics release was 1.51, so we oculd upgrade it to that. Alternatively, we could say this is a good time to unify things, and upgrade everything to the rust version that neon requires, which is iirc rustc 1.63.
I'm good with either of those options! We could also try to work around the const generics usage since there should be a reasonably straightforward way to coerce the arrays into mutable slices instead.
Before modifying the CI, could we get this merged? https://github.com/ejmahler/RustFFT/pull/101 It's using the unmaintained actions-rs now, which use the deprecated node 12. Don't know exactly when this will finally stop working, but it will happen.
Why is RawSlice
still used? I see it's used in the avx32_butterflies
module for example.
Gave a quick stab at checking if miri also detects UB for those tests but run looks like miri doesn't support avx instructions:
thread 'avx::avx64_butterflies::unit_tests::test_avx_butterfly5_f64' panicked at 'Can't run test because this machine doesn't have the required instruction sets: ()', src/avx/avx64_butterflies.rs:1565:5
Gave a quick stab at checking if miri also detects UB for those tests but run looks like miri doesn't support avx instructions:
thread 'avx::avx64_butterflies::unit_tests::test_avx_butterfly5_f64' panicked at 'Can't run test because this machine doesn't have the required instruction sets: ()', src/avx/avx64_butterflies.rs:1565:5
This error message isn't too useful heh
I upgraded the CI to test for 1.61
I'm working locally to switch avx and sse etc over to use the DoubleBuff struct, to hopefully completely eliminate the unsoundness. Once that's in I'll submit that to this PR and merge.
AVX, SSE, and Neon have now been converted, and I deleted the RawSlice struct.
There's a little too much copy/pasting of the SseArray and NeonArray trait imps for my taste. I tried making a macro for them, but it seems very hard to make a macro that borrows a different field of a struct, per macro invocation(Ie the impls for &mut [Complex<T>]
just want to call as_ptr()
on self
, but the impls for &mut DoubleBuff<'_, T>
want to call as_ptr()
on self.input
, and it's hard to write a macr othat lets you select one or the other per invocation).
But it's not worth holding up the PR to fix that.
I'm good with either of those options! We could also try to work around the const generics usage since there should be a reasonably straightforward way to coerce the arrays into mutable slices instead.
Separately from this PR, I've been waiting for const generics to replace the array_utils::iter_chunks
with the standard library's array_chunks::<N>
, I'm hoping it will result in the same codegen with less unsafe.
AVX, SSE, and Neon have now been converted, and I deleted the RawSlice struct.
There's a little too much copy/pasting of the SseArray and NeonArray trait imps for my taste. I tried making a macro for them, but it seems very hard to make a macro that borrows a different field of a struct, per macro invocation(Ie the impls for
&mut [Complex<T>]
just want to callas_ptr()
onself
, but the impls for&mut DoubleBuff<'_, T>
want to callas_ptr()
onself.input
, and it's hard to write a macr othat lets you select one or the other per invocation).But it's not worth holding up the PR to fix that.
I'm good with either of those options! We could also try to work around the const generics usage since there should be a reasonably straightforward way to coerce the arrays into mutable slices instead.
Separately from this PR, I've been waiting for const generics to replace the
array_utils::iter_chunks
with the standard library'sarray_chunks::<N>
, I'm hoping it will result in the same codegen with less unsafe.
Thank you for converting AVX, SSE, and Neon and thank you for merging the PR!
Regarding the SseArray and NeonArray trait impls, I put together PR #115 which does function pass-through implementation for DoubleBuf to remove some of the duplication. It also renames DoubleBuff to DoubleBuf to match typical Rust nomenclature and adds Deref/DerefMut to the AVX, SSE, and Neon traits to ensure that the buffers are modified in-place. This did require changing "buffer: &mut [Complex]" to "mut buffer: &mut [Complex]" in a couple of places. I am not super happy about this, but it does match the syntax of "mut buffer: impl LoadStore
Addresses issue #98 by eliminating the usage of RawSlice and RawSliceMut in /algorithm without introducing code duplication.
I created a new internal trait LoadStore and a new internal struct DoubleBuff. LoadStore contains functions load and store, mirroring the behavior of RawSlice and RawSliceMut. DoubleBuff contains an input and output slice, mirroring structure of RawSlice and RawSliceMut being combined into one object.
All the perform_fft_contiguous functions take a single buffer value that implement LoadStore.
Normal slices, arrays, and DoubleBuff all implement LoadStore.