Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/

hexfp may lose 1-3 low order bits (most often, 1) #15033

Closed p5pRT closed 8 years ago

p5pRT commented 8 years ago

Migrated from rt.perl.org#126586 (status was 'open')

Searchable as RT126586$

p5pRT commented 8 years ago

From @jhi

5.22.0, or bleadperl at 0c8adad7:

    $ ./perl -wle 'print 0xf.ffffffffffffp0'
    16
    $ ./perl -wle 'print 0xf.ffffffffffff8p0'
    Hexadecimal float: mantissa overflow at -e line 1.
    16
    $ ./perl -wle 'print 0xf.ffffffffffffcp0'
    Hexadecimal float: mantissa overflow at -e line 1.
    16

The thing is, the second one is not an overflow: in the usual 53-bit mantissa of IEEE doubles, that's exactly 53 bits. The third one has 54, so it correctly is an overflow. The first one is 52 bits, so not an overflow.

I noticed this case while staring at the code for perl #126582. So the last hex digit, whose bits straddle NV_MANT_DIG (often 53, see above), may or may not lose some of its bits.

Note that this may be argued at least two ways:

(1) if one specifies, say, 0x8, as the last hex digit, one may argue that one still specifies all the four bits, not just the top one.

(2) or one may argue that one only specified the top bit.

The code has so far been choosing interpretation (1), but maybe (2) makes more sense, because otherwise one may never specify all the 53 bits using hex digits, which is unfair to the very common case. (One could only specify up to 52 bits, using 13 hex digits.) And hexfp is all about being able to specify the fp down to the last bit.
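The difference between the two readings can be sketched by counting bits directly. The helper below, `hexfp_mantissa_bits`, is a hypothetical illustration and not part of perl or its tokenizer; it only looks at the text of the literal, so it runs on any perl, with or without hexfp support.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption for illustration, not perl's own code):
# count the mantissa bits a hexadecimal float literal specifies under the
# two readings. $literal looks like "0xf.ffffffffffff8p0".
sub hexfp_mantissa_bits {
    my ($literal) = @_;
    my ($int, $frac) =
        $literal =~ /\A0x([0-9a-f]*)(?:\.([0-9a-f]*))?p[-+]?\d+\z/i
        or die "not a hex float literal: $literal";
    $frac = '' unless defined $frac;

    # Expand every hex digit into four '0'/'1' characters.
    my $bits = join '', map { sprintf('%04b', hex($_)) } split //, $int . $frac;
    $bits =~ s/\A0+//;                    # leading zero bits carry no information

    my $reading1 = length $bits;          # (1): all four bits of every digit count
    (my $trimmed = $bits) =~ s/0+\z//;    # (2): stop at the last one-bit
    my $reading2 = length $trimmed;
    return ($reading1, $reading2);
}

# The three literals from the report: 0xf. followed by twelve f's,
# then nothing, 8, or c as the last digit.
for my $last ('', '8', 'c') {
    my $literal = '0xf.' . ('f' x 12) . $last . 'p0';
    my ($r1, $r2) = hexfp_mantissa_bits($literal);
    print "$literal: reading (1) = $r1 bits, reading (2) = $r2 bits\n";
}
# reading (1) gives 52, 56, 56; reading (2) gives 52, 53, 54
```

With a 53-bit mantissa, reading (1) rejects the second literal (56 > 53) while reading (2) accepts it (exactly 53) and still rejects the third (54).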

p5pRT commented 8 years ago

From @jhi

http://perl5.git.perl.org/perl.git/commit/96524c28e76cdc1099eebc2832aef5b789ebbc11

Smoked all over the place, seems okay. (Testing more in double-double places would be nice.)

p5pRT commented 8 years ago

@jhi - Status changed from 'new' to 'resolved'

p5pRT commented 8 years ago

From @ap

* Jarkko Hietaniemi <perlbug-followup@perl.org> [2015-11-07 23:45]:

> Note that this may be argued at least two ways:
>
> (1) if one specifies, say, 0x8, as the last hex digit, one may argue that one still specifies all the four bits, not just the top one.
>
> (2) or one may argue that one only specified the top bit.
>
> The code has so far been choosing interpretation (1), but maybe (2) makes more sense, because otherwise one may never specify all the 53 bits using hex digits, which is unfair to the very common case. (One could only specify up to 52 bits, using 13 hex digits.) And hexfp is all about being able to specify the fp down to the last bit.

To the extent that my mathematical reasoning will carry me, there is no practical loss of precision from losing trailing zero bits *except* for the most significant zero bit. Correct?

If so, then this is an irksome case, since only the most significant bit from the last nibble can be retained in a 53-bit mantissa.

With the less significant bits all being 0, if the most significant bit is also 0 then there is no loss of precision from dropping the trailing bits.

However, if the most significant bit is 1 then losing the zero from the *second*-most significant bit *is* an actual loss of precision! :-(

So you can specify a 13th hex digit with no loss of precision, but only if that digit is 0. That’s not terribly useful. It’s unfortunate that this fact should make it impossible to specify the LSB in a 4x+1 bits mantissa as 1.

These are options I can think of:

1. Downgrade this case to a warning. Namely, merely warn about an extra nibble if mantissa length is not a multiple of 4, but all of the bits being lost are 0. Possibly even accept dropped 0 bits silently if the mantissa LSB is 0.

   I.e. a 53-bit mantissa 0000000000000 would be silent, 0000000000008 would warn, and 000000000000C would error.

   I guess the warning would have to be in a special category of its own (`hexfpmantlsb`?) so that you could turn it off specifically. But it would be annoying to be forced to turn off the warning *every time* you want to specify a float down to the 53rd mantissa bit.

2. Add notation to specify the number of trailing zero bits that should be considered unspecified instead of explicitly zero.

   This would be kinda cryptic and doesn’t seem to have a real upside in explicitness. Intuitively I thought it should, but it doesn’t because the float literal doesn’t change length on different systems. A float literal with 13 hex digits is an error on a system with 52 or fewer bits of mantissa. And on a system with 56 or more bits of mantissa it can be represented exactly. So more precision than you intended can only be inadvertently lost when going from a system with e.g. 55 bits of mantissa to one with 53 bits: both would require hex floats with the same number of digits to specify all their mantissa bits but they differ in how many bits from the last hex digit they can retain. Such minute differences in precision do not exist in practice.

Any others?

Regards,
-- Aristotle Pagaltzis // <http://plasmasturm.org/>

p5pRT commented 8 years ago

From @jhi

Re-opening, since this case is not clear-cut, as pointed out.

So the commit I made makes the straddling case warn *if* it has one-bits beyond the 53rd (speaking of the common IEEE double case). But if there's just the right number of one-bits in the straddling nibble, no warning. (Nothing is erroring.)

But if one then continues to add more nibbles/hex digits, of any content, the warning will again happen.

I think at least for 5.22.1 (*) the commit I made improves the situation, since starting to warn after the 13th nibble is just silly; the change effectively moved the warning one bit further.

FWIW, I think all this complication argues for officially (for 5.23/24) allowing the accidentally working "binfp", which allows just the right number of bits. (The "octfp" is another matter: 3 bits present even worse confusion than 4 bits.)

(*) Steve Hay told me that since these changes are fixing a new feature, they are worthy. (Especially the earlier perl #126582.)

p5pRT commented 8 years ago

@jhi - Status changed from 'resolved' to 'open'

p5pRT commented 8 years ago

From @jhi

> With the less significant bits all being 0, if the most significant bit is also 0 then there is no loss of precision from dropping the trailing bits.
>
> However, if the most significant bit is 1 then losing the zero from the *second*-most significant bit *is* an actual loss of precision! :-(
>
> So you can specify a 13th hex digit with no loss of precision, but only if that digit is 0. That’s not terribly useful. It’s unfortunate that this fact should make it impossible to specify the LSB in a 4x+1 bits mantissa as 1.

Regarding this: isn't this about whether the conversion is truncating or rounding, and rounding which way? Now it's strictly truncating.

p5pRT commented 8 years ago

From @arc

Aristotle Pagaltzis <pagaltzis@gmx.de> wrote:

> So you can specify a 13th hex digit with no loss of precision, but only if that digit is 0. That’s not terribly useful. It’s unfortunate that this fact should make it impossible to specify the LSB in a 4x+1 bits mantissa as 1.
>
> These are options I can think of:
>
> 1. Downgrade this case to a warning. Namely, merely warn about an extra nibble if mantissa length is not a multiple of 4, but all of the bits being lost are 0. Possibly even accept dropped 0 bits silently if the mantissa LSB is 0.
>
> 2. Add notation to specify the number of trailing zero bits that should be considered unspecified instead of explicitly zero.
>
> Any others?

Does this work?

3. Treat all bits of all hex digits at the least significant end as part of the user's intention, including trailing zero bits; but ignore zero bits at the high end. Then specifying all 53 significand bits of an IEEE-ish double could be done by always having a 1 as the leftmost hex digit:

    0x1.123456789abcdP0   # no warning; rightmost bit == 1
    0x11.23456789abcdP-4  # same
    0x2.123456789abcdP0   # "mantissa overflow"

That is: the first two examples specify exactly 53 bits after the leftmost zeros, so they're fine; but the third tries to specify 54 bits after the leftmost zeros, so it overflows.
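A minimal sketch of the rule in option 3, assuming a hypothetical helper `hexfp_overflows` (not part of perl) and a `$mant_dig` parameter standing in for NV_MANT_DIG. It works on the literal's text only: it drops zero bits at the high end, keeps everything else including trailing zero bits, and compares the count with the mantissa width.

```perl
use strict;
use warnings;

# Hypothetical helper (illustration only, not perl's tokenizer):
# returns 1 if the literal specifies more bits than the mantissa holds
# under option 3, and 0 otherwise.
sub hexfp_overflows {
    my ($literal, $mant_dig) = @_;
    $mant_dig = 53 unless defined $mant_dig;    # common IEEE double case

    my ($int, $frac) =
        $literal =~ /\A0x([0-9a-f]*)(?:\.([0-9a-f]*))?p[-+]?\d+\z/i
        or die "not a hex float literal: $literal";
    $frac = '' unless defined $frac;

    # Expand the hex digits to bits, then ignore zero bits at the high end only.
    my $bits = join '', map { sprintf('%04b', hex($_)) } split //, $int . $frac;
    $bits =~ s/\A0+//;

    return length($bits) > $mant_dig ? 1 : 0;
}

for my $literal (qw(0x1.123456789abcdP0 0x11.23456789abcdP-4 0x2.123456789abcdP0)) {
    printf "%-22s %s\n", $literal,
        hexfp_overflows($literal) ? 'mantissa overflow' : 'ok';
}
```

The first two literals come out at 53 bits and the third at 54, matching the three cases listed above.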

This seems consistent with the way we treat high zero bits in octal integer literals:

    $ perl -wE '$_ = "01" . "0" x 20 . "7"; say for $_, sprintf "%#o", eval'
    Octal number > 037777777777 non-portable at (eval 1) line 1.
    01000000000000000000007
    01000000000000000000007
    $ perl -wE '$_ = "02" . "0" x 20 . "7"; say for $_, sprintf "%#o", eval'
    Integer overflow in octal number at (eval 1) line 1.
    Octal number > 037777777777 non-portable at (eval 1) line 1.
    02000000000000000000007
    01777777777777777777777

That is: in the first example there, we have 3 bits in the trailing "7", a further 20*3 == 60 bits in the middle zeros, and one bit after the leftmost zeros in the leading "1", for a total of exactly 64 bits, yielding no overflow on this 64-bit system. In the second example, we have the same 63 bits starting at the zeros, and *two* bits after the leftmost zero in the leading "2", for a total of 65 bits; so we get an overflow.
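The same count can be reproduced without evaluating the literals, so it does not depend on the platform's integer width. `octal_literal_bits` below is a hypothetical helper written for this illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper (illustration only): number of bits an octal integer
# literal needs, counted from its first one-bit down to the last bit of
# its last digit.
sub octal_literal_bits {
    my ($literal) = @_;
    $literal =~ /\A0[0-7]+\z/ or die "not an octal literal: $literal";

    # Three bits per octal digit, then drop the zero bits at the high end.
    my $bits = join '', map { sprintf('%03b', $_) } split //, $literal;
    $bits =~ s/\A0+//;
    return length $bits;
}

for my $lead ('01', '02') {
    my $literal = $lead . '0' x 20 . '7';
    my $count   = octal_literal_bits($literal);
    print "$literal needs $count bits\n";
}
# the leading "1" gives 1 + 60 + 3 = 64 bits; the leading "2" gives 2 + 60 + 3 = 65
```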

(Hex integer literals seem to work the same way too, but they don't have any leftover bits on 32- or 64-bit platforms, so there's no partial-digit problem to deal with there.)

-- Aaron Crane ** http://aaroncrane.co.uk/

p5pRT commented 8 years ago

From @ap

* Jarkko Hietaniemi via RT <perlbug-followup@perl.org> [2015-11-08 15:35]:

> Isn't this about whether the conversion is truncating or rounding, and rounding which way?

It was only about me conflating things. But my confusion eludes me yet, so please just disregard everything I said.