Mixed-radix coset NTT seems to work incorrectly when ordering is MN (switching to other orderings, non-coset or radix2 makes the error disappear). Specifically, non fast twiddle version fails on sizes 9, 11, 19, 21 and 23. Fast twiddle version additionally fails for 13, 14 and 16.
To reproduce, checkout the branch I linked and run cargo test test_ntt_coset_from_subgroup from any of the curve crates.
Additional context
It's pretty clear that the issue is in reordering and not ntt itself, since all orderings except one work fine. Sizes I mentioned seem to be related to stage sizes in mixed radix ntt: failures in non fast twiddle version happen exactly where stage sizes are not symmetrical (i.e. "mixing" permutation has order bigger than 2). And additional failures for fast twiddles correspond to differences between fast twiddle and regular stage sizes. Maybe the problem is that the code assumes that dig_rev function must bring permutation to identity after two iterations, and in places non-fast twiddles stages are used instead of fast twiddles stages but I couldn't figure out where exactly.
Description
Mixed-radix coset NTT seems to work incorrectly when ordering is
MN
(switching to other orderings, non-coset or radix2 makes the error disappear). Specifically, non fast twiddle version fails on sizes 9, 11, 19, 21 and 23. Fast twiddle version additionally fails for 13, 14 and 16.Reproduce
I customised the coset ntt test to showcase the issue, pls see https://github.com/ingonyama-zk/icicle/blob/a9a2f07c61bca76fe7245b9a59ef767ec99c2099/wrappers/rust/icicle-core/src/ntt/tests.rs#L104
To reproduce, checkout the branch I linked and run
cargo test test_ntt_coset_from_subgroup
from any of the curve crates.Additional context
It's pretty clear that the issue is in reordering and not ntt itself, since all orderings except one work fine. Sizes I mentioned seem to be related to stage sizes in mixed radix ntt: failures in non fast twiddle version happen exactly where stage sizes are not symmetrical (i.e. "mixing" permutation has order bigger than 2). And additional failures for fast twiddles correspond to differences between fast twiddle and regular stage sizes. Maybe the problem is that the code assumes that
dig_rev
function must bring permutation to identity after two iterations, and in places non-fast twiddles stages are used instead of fast twiddles stages but I couldn't figure out where exactly.