Maximum/Minimum of Two Real Numbers

The minimum of two real numbers is currently very elegantly coded as

inline proc min(x: real(?w), y: real(w)) return if (x < y) | isnan(x) then x else y;

The following might be better (and that is MIGHT)

inline proc min(x: real(?w), y: real(w)) return if (x < y) | (x != x) then x else y;

as it is technically more correct and more elegant. The only reason that the isnan function was introduced into the IEEE 754 interface is that some overly aggressive optimizing compilers (mainly in the past) have deluded themselves that the statement

x != x

is always false when it is actually true (by definition) when the variable x is a Not-a-Number or NaN in IEEE 754 speak.

The Chapel compiler is certainly not one of delusional compilers when it comes to floating point. It understands the concept properly and handles the statement correctly.

On some hardware, it may be slightly slower, on some slightly faster. On my limited testing on current Intel, it is much the same. On older Intel it was slightly slower as it demanded an extra memory access. I will try and do some testing on ARM later and might test some equivalernt C on IBM Power 8 unless someone tells me that Chapel will compile on that architecture.

And there is a similar case for the maximum.

I am curious what makes more sense in the long term.

With the 2019 IEEE 754 standard mandating several minimum/maximum routines, some based on magnitude alone, some handling NaN properly and some, in the name of backwards compatability, handling it wrongly. This programming style issue should at least be bounced around. Just a thought.

Given two real(?w) arguments, the 2019 IEEE 754 standard mandates that there be minimum & maximum operations which:

in the event that either (but not both) of these are a NaN, it returns the NaN
in the event that either (but not both) of these are a NaN, it return the argument which is a number
in the event that neither are a NaN, it returns the minimum (or maximum) as appropriate
in the event that bpth are a NaN, it return one of those
return variants of the above which work with the magnitudes of those two arguments

Here is a first cut written in a style which is optimal if this code was written in a particular style.. I was hoping to be able to also include some insightful comments by considering the quality of the assembler from the Chapel compiler (as I have already dug into that in detail with C equivalents). But my attempts at looking at the Chapel-generated assembler are not going too well and this issue is going a bit stagnant. So, I will just put the code up and I will follow up with some comments in a few days.

// IEEE 754 mandated minimum of two real(?w) quantities

inline proc min(x : real(?w), y : real(w))
    return if (x != x) | (x < y) then x else y;

// as per the above but using the isnan() routine

inline proc minNaN(x : real(?w), y : real(w))
    return if isnan(x) | (x < y) then x else y;

// IEEE 754 mandated minimum of the magnitudes

inline proc minMag(x : real(?w), y : real(w))
    return min(abs(x), abs(y));

// IEEE 754 mandated minimum except when one is an NaN, return the Number!

inline proc minNum(x : real(?w), y : real(w))
    return if (y != y) | (x < y) then x else y;

// IEEE 754 mandated minimum of the magnitudes but using the logic of 'minNum'

inline proc minMagNum(x : real(?w), y : real(w))
    return minNum(abs(x), abs(y));

// Quick-and-Dirty minimum (non IEEE 754 compliant)

inline proc minRaw(x : real(?w), y : real(w))
    return if x < y then x else y;

// Quick-and-Dirty minimum of the magnitudes (non IEEE 754 compliant)

inline proc minMagRaw(x : real(?w), y : real(w))
    return minNum(abs(x), abs(y));

proc main
{
    const x = -10.5, y = 100.5, z = (x - x)/(x - x);

    assert(isnan(z));

    writeln("NaN ", minNaN(x, y));
    writeln("Raw ", min(x, y));
    writeln("Mag ", minMag(x, y));
    writeln("Raw-Num ", minNum(x, y));
    writeln("Mag-Num ", minMagNum(x, y));
    writeln("----------------");

    writeln("NaN ", minNaN(x, z));
    writeln("Raw ", min(x, z));
    writeln("Mag ", minMag(x, z));
    writeln("Raw-Num ", minNum(x, z));
    writeln("Mag-Num ", minMagNum(x, z));
    writeln("----------------");

    writeln("NaN ", minNaN(z, y));
    writeln("Raw ", min(z, y));
    writeln("Mag ", minMag(z, y));
    writeln("Raw-Num ", minNum(z, y));
    writeln("Mag-Num ", minMagNum(z, y));
}

Feel free to comment.

All these routines in inline. I cannot see the point of non-inline variants but feel free to tell me why I am wrong.

The original code in Math.chpl has the test for the argument being a Not a Number as the second element of the OR.

Putting it as the first in a C translation resulted in the same or less assembler instructions coming out of GCC and CLANG on an X86-64 and out of GCC on an IBM Power 8.

The ordering did not seem to matter much to Chapel although avoiding isnan() in Chapel is 10% faster.

The performance of C and Chapel using the x != x construct is on a par.

More to follow (when I wrap my head around Chapel assembler output).

Please comment on the below.

// IEEE 754 mandated maximum of two real(?w) arguments, propagating NaNs

inline proc max(x : real(?w), y : real(w))
    return if (x != x) | (x < y) then x else y;

// as immediately above except when one argument is an NaN, return the Number!

inline proc maxNum(x : real(?w), y : real(w))
    return if (y != y) | (x < y) then x else y;

// fast but non IEEE 754 compliant maximum of two real(?w) arguments

inline proc maxRaw(x : real(?w), y : real(w))
    return if x < y then x else y;

// IEEE 754 mandated minimum of two real(?w) arguments, propagating NaNs

inline proc min(x : real(?w), y : real(w))
    return if (x != x) | (x < y) then x else y;

// as immediately above except when one argument is an NaN, return the Number!

inline proc minNum(x : real(?w), y : real(w))
    return if (y != y) | (x < y) then x else y;

// fast but non IEEE 754 compliant minimum of two real(?w) arguments

inline proc minRaw(x : real(?w), y : real(w))
    return if x < y then x else y;

// IEEE 754  maxima/minima of the magnitudes associated with the above

inline proc maxMag(x : real(?w), y : real(w)) return max(abs(x), abs(y));

inline proc maxMagNum(x : real(?w), y : real(w)) return maxNum(abs(x), abs(y));

inline proc maxMagRaw(x : real(?w), y : real(w)) return maxRaw(abs(x), abs(y));

inline proc minMag(x : real(?w), y : real(w)) return min(abs(x), abs(y));

inline proc minMagNum(x : real(?w), y : real(w)) return minNum(abs(x), abs(y));

inline proc minMagRaw(x : real(?w), y : real(w)) return minRaw(abs(x), abs(y));

// These next two routines are NOT proposed to be included

// like 'min' above but using the isnan() routine - only for testing

inline proc minNaN(x : real(?w), y : real(w))
    return if isnan(x) | (x < y) then x else y;

// like 'max' above but using the isnan() routine - only for testing

inline proc maxNaN(x : real(?w), y : real(w))
    return if isnan(x) | (x < y) then x else y;

When I did a direct translation into C on an X86-64, with GCC I saw what I would consider minimalist code which is good for inline routines. With CLANG, it was less so but was branchless (which is a good thing) so having a few more instruction is probably tolerable.

Note that an ARM processor has machine instructions for all of the routines excluding those involving only magnitude. I would be curious at how well the above optimizes in this case. Maybe you need to use embedded C code in the Math.chpl include file, where that C code itself used embedded assembler. On an ARM, the concept of the above 'raw' operation is redundant!!

The above do NOT necessarily propagate the sign of a NaN, but then again, discussions in IEEE 754 about NaN propagation say nothing about the sign. Mind you, IEEE 754 says very little about the payload either.

I have avoided the question of how a min (or max) reduction behaves in the presence of a NaN. I do not know enough to comment there. And then there is the question of what happens under a vector operation, a whole new can of worms that is very hardware dependent.

Lots of food for thought.

As written above, max and min may raise a unnecessary INVALID OPERATION floating point exception, while maxNum and minNum may not raise a necessary INVALID OPERATION, both in the name of speed. Unsure how best to fix that.

chapel-lang / chapel

Maximum/Minimum of Two Real Numbers #20390