JuliaStats / Statistics.jl

The Statistics stdlib that ships with Julia.
https://juliastats.org/Statistics.jl/dev/

Improve numerical stability of `cov` #85

Closed nalimilan closed 3 years ago

nalimilan commented 3 years ago

`cov(::AbstractVector, ::AbstractVector)` was less stable than the other `cov` methods, and than `var` on arrays, because it called `sum` on a generator, which does not use pairwise summation. Use a `Broadcasted` object instead to benefit from pairwise summation.
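
To illustrate the difference, here is a minimal sketch rather than the actual diff of this PR. The helper names `dot_generator` and `dot_broadcasted` are hypothetical and exist only for this example. It assumes a Julia version in which `sum` accepts an instantiated `Broadcasted` object and reduces it pairwise.

```julia
using Base.Broadcast: broadcasted, instantiate

# Hypothetical helpers, for illustration only (not functions from Statistics.jl).

# Sum of products over a generator: the reduction runs left to right,
# so rounding errors can accumulate as the running total grows.
dot_generator(x, y) = sum(x[i] * y[i] for i in eachindex(x, y))

# Same products wrapped in a lazy Broadcasted object: no temporary array
# is allocated, and `sum` can use its pairwise reduction on it.
dot_broadcasted(x, y) = sum(instantiate(broadcasted(*, x, y)))

x = fill(0.1f0, 1_000_000)

# Float64 reference value computed from the same Float32 inputs.
reference = sum(abs2, Float64.(x))

err_generator   = abs(dot_generator(x, x) - reference)
err_broadcasted = abs(dot_broadcasted(x, x) - reference)

println("generator error:   ", err_generator)
println("broadcasted error: ", err_broadcasted)
```

On `Float32` input like this, the generator version would be expected to drift further from the `Float64` reference than the `Broadcasted` version, which is the kind of gap the change to `cov` is meant to close.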

Fixes https://github.com/JuliaLang/Statistics.jl/issues/83.

Performance also seems to increase a bit:

```julia
julia> using BenchmarkTools, Statistics

# Before
julia> A = 20*randn(Float32, 10_000_000) .+ 100;

julia> @btime cov(A, A);
  42.300 ms (5 allocations: 76.29 MiB)

# After
julia> @btime cov(A, A);
  33.832 ms (5 allocations: 76.29 MiB)
```

codecov[bot] commented 3 years ago

Codecov Report

Merging #85 (e4a068d) into master (54f9b0d) will decrease coverage by 1.28%. The diff coverage is 100.00%.

```diff
@@            Coverage Diff             @@
##           master      #85      +/-   ##
==========================================
- Coverage   98.18%   96.89%   -1.29%
==========================================
  Files           1        1
  Lines         386      419      +33
==========================================
+ Hits          379      406      +27
- Misses          7       13       +6
```

| Impacted Files | Coverage Δ | |
|---|---|---|
| src/Statistics.jl | 96.89% <100.00%> (-1.29%) | :arrow_down: |
