JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.45k stars 5.46k forks source link

round to digits incorrectly rounded #33103

Open Keno opened 5 years ago

Keno commented 5 years ago

There was some discussion today that rounding in python (but not numpy) used the float printing code rather than numerical manipulation. One of the HN comments mentioned the following example as something numpy got wrong, and it appears we do too:

julia> round(56294995342131.5, digits=3)
5.629499534213151e13

Note that 56294995342131.5 is exact, so there aren't any lower order digits that are leading to an unintuitive result here.

StefanKarpinski commented 5 years ago

I'm not sure how to fix this, but separating x into it's integer and fractional parts might help:

julia> x = 5.62949953421315e13
5.62949953421315e13

julia> y = round(x)
5.6294995342132e13

julia> f = x - y
-0.5

julia> round(f*10^n)/10^n
-0.5

julia> y + f
5.62949953421315e13

Also probably helps avoid overflow.

simonbyrne commented 4 years ago

The first challenge is to define "correct": the closest I can come up with (and what Python appears to use) is "round to the correct number of digits, then convert to the nearest float". Note that this is still going to cause strange behavior, especially with floor, e.g.

julia> floor(0.295, digits=2)
0.29

julia> floor(floor(0.295, digits=2), digits=2)
0.28

As to actually implementing it: the easiest option would be to print then parse it back (unfortunately, we don't have a convenient way to "print to n digits"). It may be possible to skip the explicit character conversions and reuse Ryu directly. @quinnj may be able to provide advice here (since i'm not that familiar with the code).

JeffreySarnoff commented 4 years ago

(topic adjacent) Rounding for display requires round(x, sigdigits=n) provide values that read appropriately. It is very hard to get it right in all cases. Here may be an approach at 7x time. The number of sigdigits to which one may round is a few less than the maximum number of significant digits for any supported by a type. It is limited to the minimax (as it were) number of significant digits: e.g. rounding of a Float64 value is limited to 15 sigdigits.

Following this is another, 3x slower approach that uses string construction.

sigdigitsmax(::Type{Float64}) = 15
sigdigitsmax(::Type{Float32}) =  7
sigdigitsmax(::Type{Float16}) =  3

function round10(x::T; sigdigits::Int) where {T<:AbstractFloat}
      if !(0 < sigdigits <= sigdigitsmax(T))
          throw(DomainError("sigdigits=$sigdigits"))
      end
      absx = abs(x)
      res = round(absx, sigdigits=sigdigits)
      alt = prevfloat(res)
      reslen = length(string(res))
      altlen = length(string(alt))
      altlen < reslen && return copysign(alt, x)
      alt = nextfloat(res)
      altlen = length(string(alt))
      altlen < reslen && return copysign(alt, x)
      return copysign(res, x)
end

Here may be an approach [at least for normal values] at 20x time (2μs/100ns) that constructs the rounded value as a string then parses it. The number of sigdigits to which one may round is a few less than the maximum number of significant digits for any supported by a type. It is limited to the minimax (as it were) number of significant digits: e.g. rounding of a Float64 value is limited to 15 sigdigits.

# rounding to show significant decimal digits exactly

#=
example

julia> x = inv(ldexp(float(pi), 100))
2.5110222495574236e-31

julia> [round(x, sigdigits=i) for i=1:15]
15-element Array{Float64,1}:
 3.0000000000000003e-31
 2.4999999999999998e-31
 2.51e-31
 2.511e-31
 2.5110000000000003e-31
 2.51102e-31
 2.511022e-31
 2.5110222e-31
 2.5110222499999997e-31
 2.51102225e-31
 2.5110222496e-31
 2.5110222495600003e-31
 2.511022249557e-31
 2.5110222495574e-31
 2.51102224955742e-31

 julia> [round10(x, sigdigits=i) for i=1:15]
 15-element Array{Float64,1}:
  3.0e-31
  2.5e-31
  2.51e-31
  2.511e-31
  2.511e-31
  2.51102e-31
  2.511022e-31
  2.5110222e-31
  2.51102225e-31
  2.51102225e-31
  2.5110222496e-31
  2.51102224956e-31
  2.511022249557e-31
  2.5110222495574e-31
  2.51102224955742e-31
=#
#=
    sigdigitsmax(T) = trunc(Int, log10(2.0^sigbits(T)))
    sigbits(T) = trailining_ones(Base.significand_mask(T)) + 1 (hidden bit)
=#
sigdigitsmax(::Type{Float64}) = 15
sigdigitsmax(::Type{Float32}) =  7
sigdigitsmax(::Type{Float16}) =  3

function round10(x::T; sigdigits::Int) where {T<:AbstractFloat}
    if !(0 < sigdigits <= sigdigitsmax(T))
        throw(DomainError("sigdigits=$sigdigits"))
    end
    absx = abs(x)
    powten = round10exponent(absx)
    sigten = round10significand(absx, powten)
    sigten = round(sigten, sigdigits=sigdigits)
    sigten = copysign(sigten, x)
    str = string(sigten, "e", powten)
    return parse(Float64, str)
end

function round10exponent(x::T) where {T<:AbstractFloat}
    absx = abs(x)
    iszero(absx) && return 0
    powten = floor(log10(absx))
    delta = absx - T(10.0)^powten
    powten -= signbit(delta)      # normalize pow10 exponent
    return Int(powten)
end

# result is assured only to sigdigitsmax(T) significant digits
function round10significand(x::T) where {T<:AbstractFloat}
    powten = expon10(x)
    return signif10(x, powten)
end

function round10significand(x::T, powten::Int) where {T<:AbstractFloat}
    iszero(x) && return x
    absx = abs(x)
    sigten = absx / T(10.0)^powten
    sigten = flipsign(sigten, x)
    return sigten
end