rust-ndarray / ndarray-stats

Statistical routines for ndarray
https://docs.rs/ndarray-stats
Apache License 2.0
197 stars 24 forks source link

map_axis_skipnan_mut invalid result #89

Closed JacekCzupyt closed 2 years ago

JacekCzupyt commented 2 years ago

Description The map_axis_skipnan_mut method returns invalid results for any non-maximal axis.

Version Information

To Reproduce

let mut a = arr2(&[[1., 2., f64::NAN, 4.], [5., 6., 7., 8.], [9., 10., 11., f64::NAN]]);
println!("Initial array:\n{}", &a);

println!("\nLanes:");
let b = a.map_axis_skipnan_mut(Axis(0), |x| { println!("{}", x); x});
println!("\nResults:\n{}", b);

Expected behavior As far as I understand, the described behavior, consistent with the map_axis_mut method behavior, would give the following result:

Initial array:
[[1, 2, NaN, 4],
 [5, 6, 7, 8],
 [9, 10, 11, NaN]]

Lanes:
[1, 5, 9]
[2, 6, 10]
[11, 7]
[4, 8]

Results:
[[1, 5, 9], [2, 6, 10], [11, 7], [4, 8]]

Actual behavior

Initial array:
[[1, 2, NaN, 4],
 [5, 6, 7, 8],
 [9, 10, 11, NaN]]

Lanes:
[1, 2, NaN]
[2, NaN, 4]
[11, 4]
[4, 5]

Results:
[[1, 2, 11], [2, 11, 4], [11, 4], [4, 5]]

This behavior appears to be clearly invalid. It does not skip nan values, and the elements present in each lane are not consistent with the selected axis. Applying some methods to the lanes (such as x.sum()) can cause a panic: thread 'main' panicked at 'unexpected NaN', .../.cargo/registry/src/github.com-1ecc6299db9ec823/noisy_float-0.2.0/src/checkers.rs:30:9, as the lanes include NotNan types with a value of nan.

Additional context When no nan values are present in the initial array, the result should be identical (besides the NotNan type) to the map_axis_mut function. However it instead returns slices of what seems to be a flattened version of the initial array.

let mut a = arr2(&[[1., 2., 3., 4.], [5., 6., 7., 8.], [9., 10., 11., 12.]]);

Result:

Lanes:
[1, 2, 3]
[2, 3, 4]
[3, 4, 5]
[4, 5, 6]

The function works as expected, when the axis argument is the largest valid axis (a.ndim()-1).

It appears that while the content of the lanes is invalid, the number of elements in each lane is always correct. My best guess is that the error originates in the remove_nan_mut function, specifically the slice created in line 63 probably ignores the axis of the ArrayView.

jturner314 commented 2 years ago

Thanks for reporting this and writing such a nice bug report! I've fixed the bug and released ndarray-stats 0.5.1.