Closed yurivish closed 11 months ago
For example, using quantile to compute the maximum of the array [-128, 127] incorrectly returns -129 if the array consists of Int8 values:
quantile
[-128, 127]
-129
Int8
julia> using Statistics julia> quantile([-128, 127], 1) # right 127 julia> quantile(Int8[-128, 127], 1) # wrong -129
This happens because of the following code in _quantile:
_quantile
https://github.com/JuliaStats/Statistics.jl/blob/0588f2cf9e43f9f72af5802feaf0af4b652c3257/src/Statistics.jl#L1010
For the above array, that line computes the following expression, causing the incorrect result:
julia> Int8(-128) + Int64(Int8(127) - Int8(-128)) -129
This behavior occurs with arrays of Int8, Int16, and Int32:
Int16
Int32
julia> function test(type) A = [typemin(type), typemax(type)] result = quantile(A, 1) correct_result = quantile(map(Int, A), 1) (; result, correct_result) end; julia> test(Int8) (result = -129, correct_result = 127) julia> test(Int16) (result = -32769, correct_result = 32767) julia> test(Int32) (result = -2147483649, correct_result = 2147483647)
Using quantile to compute the median also returns an answer that is not correct:
julia> quantile(Int8[-128, 127], .5) -128.5
The equivalent function in NumPy, np.quantile, returns the correct result in all cases:
julia> using PyCall julia> np = pyimport("numpy"); julia> a = np.array(Int8[-128, 127], dtype=np.int8); julia> np.quantile(a, 1) 127 julia> function test_python() quantiles = (1, 0.5) types = ( Int8 => np.int8, Int16 => np.int16, Int32 => np.int32 ) for q in quantiles, (type, dtype) in types A = np.array([typemin(type), typemax(type)]; dtype) result = np.quantile(A, q) correct_result = quantile(map(Int, A), q) println((; q, type, result, correct_result)) end end; julia> test_python() (q = 1, type = Int8, result = 127, correct_result = 127) (q = 1, type = Int16, result = 32767, correct_result = 32767) (q = 1, type = Int32, result = 2147483647, correct_result = 2147483647) (q = 0.5, type = Int8, result = -0.5, correct_result = -0.5) (q = 0.5, type = Int16, result = -0.5, correct_result = -0.5) (q = 0.5, type = Int32, result = -0.5, correct_result = -0.5)
Fixed by https://github.com/JuliaStats/Statistics.jl/pull/145.
For example, using
quantile
to compute the maximum of the array[-128, 127]
incorrectly returns-129
if the array consists ofInt8
values:This happens because of the following code in
_quantile
:https://github.com/JuliaStats/Statistics.jl/blob/0588f2cf9e43f9f72af5802feaf0af4b652c3257/src/Statistics.jl#L1010
For the above array, that line computes the following expression, causing the incorrect result:
This behavior occurs with arrays of
Int8
,Int16
, andInt32
:Using
quantile
to compute the median also returns an answer that is not correct:The equivalent function in NumPy, np.quantile, returns the correct result in all cases: