JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.88k stars 5.49k forks source link

We need to provide these IEEE754-2019 functions #32877

Open JeffreySarnoff opened 5 years ago

JeffreySarnoff commented 5 years ago

from IEEE 754-2019: sec 9.2 Additional mathematical operations

"Language standards should define, to be implemented according to this subclause, as many of the operations in Table 9.1 as is appropriate to the language. As with other operations of this standard, the names of the operations in Table 9.1 do not necessarily correspond to the names that any particular programming language would use. In this subclause the domain of an operation is that subset of the affinely extended reals for which the corresponding mathematical function is well defined. A conforming operation shall return results correctly rounded for the applicable rounding direction for all operands in its domain."

these functions include

rsqrt(x) ≝ 1 / sqrt(x)
compound(x, n) ≝ (1 + x)^n

rootn(x, n) ≝ x^(1/n)
pown(x,n) ≝ x^n
pow(x, y) ≝ x^y [-Inf, +Inf] x [-Inf, +Inf] 
powr(x,y) ≝ x^y [0,+Inf] x [-Inf, +Inf]

exp2m1(x)  ≝ 2^x - 1
exp10m1(x) ≝ 10^x - 1

log2p1(x)  ≝ llog2(1+x)
log10p1(x) ≝ log10(1+x)

sinpi(x) ≝ sin(π * x)
cospi(x) ≝ cos(π * x)
tanpi(x) ≝ tan(π * x)

asinpi(x) ≝ asin(x) / π
acospi(x) ≝ acos(x) / π
atanpi(x) ≝ atan(x) / π

atan2pi(y, x) is the angle subtended at the origin by the point (x, y)
    and the positive x-axis. The range of atan2pi is [−1, +1].

there are short tables of required y = f(x) where x in {±0, ±∞} for many of these functions

getpayload(x)
    if the source operand is a NaN, the result is the payload
        as a non-negative floating-point integer.
    if the source operand is not a NaN, the result is −1.

setpayload(x)
    If the source operand is a non-negative floating-point integer whose value
        is one of an implementation-defined set of admissible payloads for
        the operation, the result is a quiet NaN with that payload. 
    Otherwise, the result is +0, with a preferred exponent of 0.

setpayloadsignaling(x)
    like setpayload, the result is a signaling NaN
JeffBezanson commented 5 years ago

See also #6148

mschauer commented 5 years ago

Is there a corresponding section for constants which should defined?

JeffreySarnoff commented 5 years ago

There is no such section in the current standard; however some constants should be available at their most accurate values for given representations and under the various rounding mode to comply best. These come to mind (and there are others which would be advantageous):

π, sqrt(2), exp(1), log(2), log(10), log2(10), log10(2), and their inverses.

A finite IEEE754 floating point value is constructible using ldexp(significand, exponent) where significand, exponent = frexp(value). It would be very helpful to return the nearest binary representation to a decimal significand with a radix 10 exponent e.g. frexp(value, base=10) .. and similarly with ldexp.

JeffreySarnoff commented 5 years ago

also missing is negate(x)

negate(x) copies a floating-point operand x to a destination in the same format, reversing the sign bit. negate(x) is not the same as subtraction(0, x). [negate(NaN) forces msb, 0.0 - NaN leaves msb] -- IEEE754-2019 (also in 2008 rev, maybe in original)

mschauer commented 5 years ago

Should -x call 0 - x or negate?

JeffreySarnoff commented 5 years ago

In what follows, Float16 is used as an example. The same holds for Float32 and Float64.

-x where x::Float16 should be processed as negative(x): corrected

negate(x::Float16) = reinterpret(Float16, reinterpret(UInt16, x) | 0x8000)
changesign(x::Float16) = reinterpret(Float16, reinterpret(UInt16, x) ⊻ 0x8000) 
negative(x::Float16) = isnan(x) ? x : changesign(x)

with the above this (which is important for proper propagation of NaNs) would hold

msbit(x::Float16) = reinterpret(UInt16, x) & 0x8000
isnan(x) && (msbit(x) === msbit(-x))

[it does not in Julia v1.2]. The relationship is important. The standard speaks to it:

When either an input or result is NaN, this standard does not interpret the sign of a NaN. Note, however, that operations on bit strings—copy, negate, abs, copySign—specify the sign bit of a NaN result, sometimes based upon the sign bit of a NaN operand.

immediateNaN = NaN
arithmeticNaN = 0 / 0

signbit(immediateNaN) !== signbit(arithmeticNaN)  # otherwise the bits are identical

[this signbit distinction holds in Julia 1.2]

signbit(immediateNaN) === false
signbit(arithmeticNaN) === true

with that, proper NaN propagation over arithmetic expressions needs two things (i) NaN1 op NaN2 --> consistently either NaN1, NaN2, or the NaN of larger payload (Julia 1.2 has NaN1 op NaN2 --> NaN1 for +, -, *, / and others)

(ii) see following (Julia 1.2 does have this property)

msbit(x::Float16) = reinterpret(UInt16, x) & 0x8000
isnan(x) && (msbit(x) === msbit(-x))

If this hold, one may determine if a calculation which results in a value that isnan has encountered an indeterminate expression (0.0/0.0, Inf/Inf, 0.0*Inf, Inf*0.0 and those variously signed) or has encountered a sentinel NaN or perhaps used an uninitialized value preset to NaN.

mschauer commented 5 years ago

So we have to fix Float16 and the others are fine? I am confused that that -x destroys information about what went wrong during computations

julia> reinterpret(Int, (0.0/0.0)) - reinterpret(Int, NaN)
-9223372036854775808

julia> reinterpret(Int, -(0.0/0.0)) - reinterpret(Int, NaN)
0

julia> reinterpret(Int, 0-(0.0/0.0)) - reinterpret(Int, NaN)
-9223372036854775808
JeffreySarnoff commented 5 years ago

The use of Float16 was as an example -- Float32 and Float64 need the same treatment.

The difficulty with -x is only solved by changing things as I describe in my previous comment.

JeffreySarnoff commented 5 years ago

also [Julia 1.2] does not do this, although the standard specifies the following behavior abs(x::T) where {T<:IEEEFloat} = isnan(x) ? x : flipsign(x, x) Along with the correction to -x described above, to get usable behavior, this is needed too.

What happens with copysign(NaN, x) is not specified in the standard. To provide propagated support that preserves each of these sorts of non-numbers,

immediateNaN = NaN
arithmeticNaN = 0 / 0

in addition to the handling of -x given above

abs(x::T) where {T<:IEEEFloat} = isnan(x) ? x : flipsign(x, x)
copysign(x::T, y::T) where {T<:IEEEFloat} = 
     (signbit(x) === signbit(y) || isnan(x)) ? x : flipsign(x, x)
JeffreySarnoff commented 5 years ago

"Implementations shall provide the following non-computational operations for all supported [floating point] arithmetic formats "

another missing mandated function is class(x):

class(x) tells which of the following ten classes x falls into: signalingNaN quietNaN negativeInfinity negativeNormal negativeSubnormal negativeZero positiveZero positiveSubnormal positiveNormal positiveInfinity

and we need totalorder(x::T, y::T) where {T<:IEEEFloat}

totalOrder(x, y) imposes a total ordering on canonical members of the format of x and y: a) If x<y, totalOrder(x, y) is true. b) If x>y, totalOrder(x, y) is false. c) If x=y: 1) totalOrder(−0, +0) is true. 2) totalOrder(+0, −0) is false. 3) If x and y represent the same floating-point datum: i) If x and y have negative sign, totalOrder(x, y) is true if and only if the exponent of x ≥ the exponent of y ii) otherwise, totalOrder(x, y) is true if and only if the exponent of x ≤ the exponent of y. d) If x and y are unordered numerically because x or y is a NaN: 1) totalOrder(−NaN, y) is true where −NaN represents a NaN with negative sign bit and y is a floating-point number. 2) totalOrder(x, −NaN) is false where −NaN represents a NaN with negative sign bit and x is a floating-point number. 3) totalOrder(x, +NaN) is true where +NaN represents a NaN with positive sign bit and x is a floating-point number. 4) totalOrder(+NaN, y) is false where +NaN represents a NaN with positive sign bit and y is a floating-point number. 5) If x and y are both NaNs, then totalOrder reflects a total ordering based on: i) negative sign orders below positive sign ii) signaling orders below quiet for +NaN, reverse for −NaN iii) otherwise, the order of NaNs is implementation-defined. Neither signaling NaNs nor quiet NaNs signal an exception. For canonical x and y, totalOrder(x, y) and totalOrder( y, x) are both true if x and y are bitwise identical. Unsigned NaNs, as may occur in non-interchange formats, should order like NaNs with positive sign bit

5) If x and y are both NaNs, then totalOrder reflects a total ordering based on: i) negative sign orders below positive sign ii) signaling orders below quiet for +NaN, reverse for −NaN iii) otherwise, the order of NaNs is implementation-defined.

regarding (5.iii), in this case our implementation should order according to the constructed "signed payload" values

nsajko commented 1 month ago

Should all of these go into Base? Would an interface package be more appropriate for some of the functions?

JeffreySarnoff commented 1 month ago

regarding negate(x::AbstractFloat) where AbstractFloat is IEEE 754-2019 conformant

section 5.5.1 Sign bit operations negate(x) copies a floating-point operand x to a destination in the same format, reversing the sign bit. negate(x) is not the same as subtraction(0, x).

note section 6.3 The sign bit

When either an input or result is a NaN, this standard does not interpret the sign of a NaN. However, operations on bit strings — copy, negate, abs, copySign — specify the sign bit of a NaN result, sometimes based upon the sign bit of a NaN operand. The logical predicates totalOrder and isSignMinus are also affected by the sign bit of a NaN operand. For all other operations, this standard does not specify the sign bit of a NaN result, even when there is only one input NaN, or when the NaN is produced from an invalid operation.

[however, I find this to be unhelpful :)]

JeffreySarnoff commented 1 month ago

Should all of these go into Base? Would an interface package be more appropriate for some of the functions?

Do you want to say Julia fully embraces IEEE 754-2019? If so, then they belong in Base. otoh "Language standards should define, to be implemented according to this subclause, as many of the operations in Table 9.1 as is appropriate to the language." [see first comment for a list]. From the perspective of the Standard, the question is not where to put them .. it is whether they are to be offered as standard part of Julia.