Closed LukeMathWalker closed 5 years ago
I started reviewing this PR. This is great work. I created LukeMathWalker/ndarray-stats#1 with a few changes.
I had a couple of thoughts:
I think we should be able to improve on the computational complexity of get_many_from_sorted_mut_unchecked
(which AFAICT currently has an average case complexity of O(q*n)
where q
is the number of quantiles and n
is the size of the array).
The primary observation is that the current implementation throws away information from within get_from_sorted_mut
. In other words, when we compute the first quantile with get_from_sorted_mut
, we're calling partition_mut
at randomly generated pivot_index
es and shrinking the remaining size each time. When we go to compute the next quantile, the only information we're using from the first quantile's computation is its index; we're discarding all of the previous partition_mut
calls.
We should be able to avoid discarding this information with the following algorithm:
Pick a random pivot_index
and partition the array with let partition_index = self.partition_mut(pivot_index)
.
Partition the search indices by partition_index
.
Now, we've split the array and search indices into two pieces: (1) the portion of the array before partition_index
and the search indices less than partition_index
and (2) the portion of the array after partition_index
and the search indices above partition_index
.
Any indices that are equal to partition_index
are finished and should be removed from further recursion.
For each piece, we have a recursive call (go to (i) for each piece).
(Continue until we run out of indices for which we haven't found the value.)
I think this algorithm has an average case computational complexity of O((n + q)*log(q))
, which is better than O(q*n)
assuming log(q) < n
. AFAICT, the primary disadvantage is that it would be more complex to implement.
Does this make sense? Do you think this would be better?
The IndexMap
return types seem inconvenient to use. IMO, get_many_from_sorted_mut
should return Vec<A>
(where the order matches indexes
), quantiles_mut
should return Option<Vec<A>>
(where the order matches qs
), and quantiles_axis_mut
should return Option<Array<A, D>>
(where the order along the axis matches qs
).
Edit: It would be good to rebase this branch off the latest master to resolve merge conflicts and incorporate #28.
I have merged your PR - all useful additions/style changes!
I can't confirm the average complexity you estimated for your alternative implementation, but it intuitively looks faster and log(q) < n
is true for all relevant cases I'd say. I'll give it a go in a separate branch and then we can run some benchmarks :+1:
Re:IndexMap - I think it's a matter of preference, I usually find it error-prone to match input/output indexes, the way NumPy forces you to work sometimes, an IndexMap looks more ergonomic to me . The solid pro I can see in returning an Option<Array<A, D>>
is that you can do cross-quantile computation more easily, that is significant.
@jturner314 I have implemented the algorithm you have described - it's definitely more efficient than the one I came up with before. Please give it a look and let me know what you think.
I haven't finished reviewing this PR, but I've created LukeMathWalker/ndarray-stats#5 with some changes. I've added a couple of benchmarks and worked on improving performance of get_many_from_sorted_mut_unchecked
(plus a few additional things). Eliminating allocations from _get_many_from_sorted_mut_unchecked
improved performance on small arrays by 30-45%. (For larger arrays, the difference is less noticeable.)
I finished reviewing this PR and added some more changes to LukeMathWalker/ndarray-stats#5. The primary additional changes are:
The bulk quantiles methods now return an Array
instead of an IndexMap
. Option<Array<A, D>>
is easier to work with than Option<IndexMap<N64, Array<A, D::Smaller>>>
. It also needs only one heap allocation and should have better performance for most operations.
I've added the interpolation strategy as an explicit parameter to the quantile methods. See the commit message for a list of the advantages.
I've changed get_many_from_sorted_mut
and the bulk quantiles methods to take the indices as an array instead of a slice. This is a little more versatile because arrays can have arbitrary strides. It also seems more consistent with the rest of the API.
With change (1), we can now change qs
back to f64
instead of N64
if desired. I think this would probably be a good idea just because most things work with f64
instead of N64
.
What do you think?
I think it makes sense to take interpolate
as a parameter - in a recent conversation with @xd009642 it turned out that it would be nice to expose EquiSpaced
as a strategy, but given that it requires some array-independent parameters (e.g. number of bins), it was troublesome to get it to work with the existing arrangement. Nothing to say on the other two changes.
I'd keep N64
- I think that we should leverage the expressiveness of the type system to communicate constraints, if it doesn't add complexity or hinders readability of our API/the code using it.
I have merged master and aligned return types (Result
instead of Option
).
I have used InvalidQuantile
in the end - let me know what you think @jturner314.
Yay! :tada: Great job on these PRs @LukeMathWalker!
Thanks for your help @jturner314 - you always manage to make them much better :pray:
Using the ordering guarantees we have on the output of
quantile_mut
/sorted_get_mut
, it provides a method optimized to compute multiple quantiles at once (by scanning an increasingly smaller subset of the original array, thanks to the computation of the previous quantile).Breaking changes:
f64
toN64
- floats are not hashable, hence they cannot be used as keys inIndexMap
and the function panics anyway if that argument isNaN
. This change propagates to theInterpolate
types;sorted_get_mut
toget_from_sorted
. It plays better with the bulk version,get_many_from_sorted
, and I think it's clearer;quantile_axis_mut
andquantile_axis_skipnan_mut
now return anOption
instead of panicking if the axis length is 0.