It appears to be a problem with assigning out-of-bounds values. For example:
PROGRAM MAIN
    USE rp_emulator
    TYPE(rpe_var) :: x
    x = HUGE(0.0)
    WRITE(*,*) x
    x = HUGE(0.0d0)
    WRITE(*,*) x
END PROGRAM MAIN
Output:
-1 3.4028234663852886E+038
-1 Infinity
I assume that you use the default value RPE_DEFAULT_SBITS = 23 in your emulator. The result of HUGE(x) is therefore represented with a 23-bit significand, and sbits will also be set to 23. I guess that the HUGE value of a double precision input is too big to be represented correctly with a 23-bit significand: its significand has all bits set, so rounding it to 23 bits rounds up and carries into the exponent, which is already at its maximum, and the result is set to infinity.
The HUGE(0.0) value can, however, be represented exactly with a 23-bit significand, so it is not pushed to infinity.
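To make the mechanism concrete, here is a minimal, self-contained sketch (not the emulator's internal code; the program and variable names are made up) that mimics a round-to-nearest reduction of a double's significand to 23 explicit bits by bit manipulation. Under IEEE arithmetic (e.g. with gfortran) it reproduces the overflow to infinity seen above:

PROGRAM round_to_infinity_demo
    ! Illustrative only: round HUGE(0.0d0) to 23 explicit significand bits
    ! with round-to-nearest, by adding half of the last kept bit and then
    ! clearing the discarded bits of the IEEE-754 representation.
    IMPLICIT NONE
    INTEGER, PARAMETER :: sbits = 23
    INTEGER, PARAMETER :: drop = 52 - sbits
    REAL(KIND=8) :: y
    INTEGER(KIND=8) :: bits

    y = HUGE(0.0d0)                                  ! significand is all ones
    bits = TRANSFER(y, bits)
    bits = bits + ISHFT(1_8, drop - 1)               ! round to nearest ...
    bits = IAND(bits, NOT(ISHFT(1_8, drop) - 1_8))   ! ... then discard bits
    y = TRANSFER(bits, y)
    ! The carry from rounding overflows into the exponent field, so this
    ! prints Infinity, matching the behaviour reported above.
    WRITE(*,*) y
END PROGRAM round_to_infinity_demo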
This is the code of the emulator for the HUGE function for reference:
FUNCTION huge_rpe (a) RESULT (x)
    CLASS(rpe_type), INTENT(IN) :: a
    TYPE(rpe_var) :: x
    x%sbits = significand_bits(a)
    x = HUGE(a%get_value())
END FUNCTION huge_rpe
In that sense the emulator is doing what it is supposed to do. However, we may want to think about a smarter implementation of the HUGE function, as the result will in most cases be replaced by infinity.
I admit that I am in a rush, so I hope this does all make sense.
Right, but still: why does rpe_var contain a double precision float, if it is supposed to represent a single precision float with an arbitrary significand?
I'm talking about this piece of code:
PUBLIC :: rpe_var
TYPE :: rpe_var
    INTEGER :: sbits = RPE_SBITS_UNSPECIFIED
    REAL(KIND=RPE_REAL_KIND) :: val
END TYPE
RPE_REAL_KIND is a double, but the default significand is 23 bits, not 52?
True. We use a double precision value for val to leave more flexibility for the user to consider precision levels between single and double precision. So we are actually representing a double precision exponent (except for the half precision case) with a flexible number of bits in the significand.
The default number of bits is a left-over from when we used single precision values to represent reduced types. It has no connection to the bit width of the number held in the rpe_var instance; it is just a parameter you can set to any integer you like between 0 and 52.
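As a quick illustration of that flexibility, here is a usage sketch (the program and variable names are made up, and I'm assuming the usual pattern where the reduction is applied when a value is assigned to an rpe_var whose sbits has been set):

PROGRAM sbits_demo
    USE rp_emulator
    IMPLICIT NONE
    TYPE(rpe_var) :: a, b
    ! sbits can be any value from 0 to 52, independent of the fact that
    ! val is stored as a double precision number.
    a%sbits = 10              ! roughly a half-precision significand
    b%sbits = 45              ! somewhere between single and double
    a = 1.0d0 / 3.0d0         ! assignment applies the reduction
    b = 1.0d0 / 3.0d0
    WRITE(*,*) a%val, b%val   ! the same number kept to different precisions
END PROGRAM sbits_demo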
I agree that huge may need to be smarter. The simple wrapper is doing its job exactly as I expected, but probably not in a way that is useful. I'd be happy to merge a PR that implemented huge (and maybe another for tiny) that returned a value correct for the number of bits in the input.
A quick note: the implementation of huge and tiny should depend upon the exponent size as well as the number of bits in the significand. Since we don't emulate flexible exponents, I'm not sure there is an obvious implementation at the current time. We could just use the double precision exponent size hard-coded into huge and tiny, and account only for variation in significand bits, which may be useful enough to bother doing, I don't know...
We may also need to consider the behaviour of epsilon along with this, perhaps.
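For what it's worth, here is a rough, untested sketch of what those three could look like under the assumption above (double precision exponent range hard-coded, only the significand width varying). The function names are made up; significand_bits and the rpe_type/rpe_var pattern are reused from the existing huge_rpe wrapper:

FUNCTION huge_rpe_sketch (a) RESULT (x)
    ! Largest finite value with t explicit significand bits and the double
    ! precision exponent range: (2 - 2**(-t)) * 2**(MAXEXPONENT-1).
    ! All t stored significand bits are set, so a later reduction to t bits
    ! would leave the value unchanged.
    CLASS(rpe_type), INTENT(IN) :: a
    TYPE(rpe_var) :: x
    INTEGER :: t
    t = significand_bits(a)
    x%sbits = t
    x%val = (2.0d0 - 2.0d0**(-t)) * 2.0d0**(MAXEXPONENT(0.0d0) - 1)
END FUNCTION huge_rpe_sketch

FUNCTION tiny_rpe_sketch (a) RESULT (x)
    ! With the exponent range unchanged, the smallest normal number is the
    ! same as for a double, whatever the significand width.
    CLASS(rpe_type), INTENT(IN) :: a
    TYPE(rpe_var) :: x
    x%sbits = significand_bits(a)
    x%val = TINY(0.0d0)
END FUNCTION tiny_rpe_sketch

FUNCTION epsilon_rpe_sketch (a) RESULT (x)
    ! Spacing between 1.0 and the next larger representable number with
    ! t explicit significand bits is 2**(-t).
    CLASS(rpe_type), INTENT(IN) :: a
    TYPE(rpe_var) :: x
    x%sbits = significand_bits(a)
    x%val = 2.0d0 ** (-significand_bits(a))
END FUNCTION epsilon_rpe_sketch

Assigning to x%val directly (rather than going through the overloaded assignment) is deliberate in this sketch, so the result is not reduced a second time; whether that is the right approach for the library is of course up for discussion.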
Minimal working example:
Output:
I don't know why this happens, because I don't quite follow the truncation procedure. I'm also not sure which value to expect: either the largest double or the largest single (incidentally, rpe_var%val is a double precision float, so why is the default number of significand bits 23, and not 52?).
I also tried TINY, but that appears to be working fine. Am I right in thinking that TINY(rpe_instance) == TINY(double_instance), because the smallest value is determined only by the exponent, which is the same for both data types?
This all came about because I'm writing some reduced precision LAPACK routines. These use TINY and HUGE to find the safest minimum value.
Thanks in advance.
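For reference, the kind of computation I mean is the LAPACK-style "safe minimum" (paraphrased from memory as a sketch, not the actual LAPACK source; the program and variable names are illustrative), which is why HUGE and TINY matter here:

PROGRAM safe_minimum_sketch
    ! Illustrative only: a LAPACK-style safe minimum, i.e. the smallest
    ! number whose reciprocal is still representable without overflow.
    IMPLICIT NONE
    REAL(KIND=8) :: sfmin, small

    sfmin = TINY(0.0d0)
    small = 1.0d0 / HUGE(0.0d0)
    IF (small >= sfmin) THEN
        ! Add a little so that 1/sfmin cannot overflow through rounding.
        sfmin = small * (1.0d0 + EPSILON(0.0d0))
    END IF
    WRITE(*,*) 'safe minimum:', sfmin
END PROGRAM safe_minimum_sketch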