rust-ndarray / ndarray-stats

Statistical routines for ndarray
https://docs.rs/ndarray-stats
Apache License 2.0
201 stars 25 forks source link

quantile_mut: fatal runtime error: stack overflow #86

Open sjackman opened 2 years ago

sjackman commented 2 years ago

Description quantile_mut can fail with the error message:

thread 'main' has overflowed its stack
fatal runtime error: stack overflow

Version Information

To Reproduce

use ndarray::Array1;
use ndarray_stats::{interpolate::Linear, Quantile1dExt};
use noisy_float::types::{n64, N64};

fn main() {
    {
        let mut array: Array1<N64> = Array1::ones(15300);
        println!("One {}", array.quantile_mut(n64(0.5), &Linear).unwrap());
    }

    {
        let mut array: Array1<N64> = Array1::ones(15600);
        println!("Two {}", array.quantile_mut(n64(0.5), &Linear).unwrap());
    }

    {
        let mut array: Array1<N64> = Array1::ones(100000);
        println!("Three {}", array.quantile_mut(n64(0.5), &Linear).unwrap());
    }
}

Observed behavior

$ cargo run --profile=dev
One 1

thread 'main' has overflowed its stack
fatal runtime error: stack overflow
$ cargo run --profile=release
One 1
Two 1

thread 'main' has overflowed its stack
fatal runtime error: stack overflow

Expected behavior

One 1
Two 1
Three 1

Additional context

sjackman commented 2 years ago

Exceeding the stack limit suggests to me either infinite recursion or a large data structure is being placed on the heap. I don't see an obvious source of either of those causes in the code. It's possible that more generalized memory corruption could cause this behaviour. https://github.com/rust-ndarray/ndarray-stats/blob/9ed937302682b1883128d920e0157615b830778b/src/quantile/mod.rs#L437-L498

sjackman commented 2 years ago

https://github.com/rust-ndarray/ndarray-stats/blob/b6628c6a1c143532673213c56d46da5fda80cbe8/src/sort.rs#L293 Looks like this recursive function call _get_many_from_sorted_mut_unchecked is recursing probably not infinitely but once per size of the vector, and blowing through the stack, perhaps because all the elements in the array are equal. (thanks to @evolvedmicrobe for discovering)

jturner314 commented 2 years ago

I think this issue may be fixed by #80; I'll take another look at that PR this weekend to see if we can get it merged.

jturner314 commented 2 years ago

I checked, and #80 doesn't fix this issue as-is. It's a bit tricky to avoid stack overflow with the recursive algorithm for large arrays with all equal elements. At first glance, I suspect that the best fix would be for partitioning to handle == pivot and > pivot elements separately, instead of grouping them together. Please feel free to contribute a fix. Otherwise, I'll work on it when I get a chance (which may be a while).

sjackman commented 2 years ago

Thank you for checking, Jim! I have a work around for now using slice:sort_unstable. I'd like to get to fixing this issue, but it similarly may be a while. I'll post my workaround below, in case anyone else stumbles on this issue.

/// Return the median. Sorts its argument in place.
pub fn median_mut<T>(xs: &mut Array1<T>) -> Result<T, QuantileError>
where
    T: Clone + Copy + Ord + FromPrimitive,
    T: Add<Output = T> + Sub<Output = T> + Mul<Output = T> + Div<Output = T> + Rem<Output = T>,
{
    if false {
        // quantile_mut may fail with the error: fatal runtime error: stack overflow
        // See https://github.com/rust-ndarray/ndarray-stats/issues/86
        xs.quantile_mut(n64(0.5), &Midpoint)
    } else {
        if xs.is_empty() {
            return Err(QuantileError::EmptyInput);
        }
        xs.as_slice_mut().unwrap().sort_unstable();
        Ok(if xs.len() % 2 == 0 {
            (xs[xs.len() / 2] + xs[xs.len() / 2 - 1]) / (T::from_u64(2).unwrap())
        } else {
            xs[xs.len() / 2]
        })
    }
}