Is your feature request related to a problem? Please describe.
PR #107 drastically optimizes collapsed for loops of dimension 2 and 3. Loops of dimension 4 or higher remain unoptimized. Extending the same optimization to them should be a straightforward and substantial performance gain.
Describe the solution you'd like
Implement manual iteration in the ForAction.ComputeIndicesN method, similar to ComputeIndices2 and ComputeIndices3. Some benchmarks highlighting the performance gains would be excellent.
Describe alternatives you've considered
N/A
Additional context
Optimizing collapse(2) yielded well over a 2x speedup just by removing a single DivRem per iteration. For collapse(4), four DivRems must be executed each iteration. Replacing all of them with cheap increment and assignment operations should yield an even larger improvement.
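To make the idea concrete, here is a minimal Python sketch of the two strategies. It is not the actual ForAction code: the function names (`indices_divrem`, `advance`, `iterate_chunk`) are hypothetical, and it assumes zero-based loops described only by their extents, with the last dimension varying fastest. The division-based path runs once per chunk to find the starting indices; every later iteration only increments and resets counters.

```python
def indices_divrem(flat, extents):
    """Recover per-dimension indices from a flattened index.

    Performs one divmod per dimension, which is the per-iteration cost
    the issue proposes to remove.
    """
    idx = [0] * len(extents)
    for d in range(len(extents) - 1, -1, -1):
        flat, idx[d] = divmod(flat, extents[d])
    return idx


def advance(idx, extents):
    """Step idx to the next iteration in place, odometer style (no division)."""
    for d in range(len(extents) - 1, -1, -1):
        idx[d] += 1
        if idx[d] < extents[d]:
            return
        idx[d] = 0  # this dimension wrapped; carry into the next outer one


def iterate_chunk(start, end, extents):
    """Yield index tuples for flattened iterations in [start, end)."""
    idx = indices_divrem(start, extents)  # divisions happen once per chunk
    for _ in range(start, end):
        yield tuple(idx)
        advance(idx, extents)


if __name__ == "__main__":
    extents = (2, 3, 4, 5)  # a collapse(4) loop nest with 120 iterations
    for i, j, k, l in iterate_chunk(37, 42, extents):
        print(i, j, k, l)
```

In the common case `advance` touches only the innermost counter, so the cost per iteration is one increment and one comparison regardless of how many dimensions are collapsed. A benchmark comparing the two paths would be needed to confirm the actual gain.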