Closed p5pRT closed 9 years ago
[resubmitting since I think the grues ate my first attempt]
Perl could support hexadecimal floats:
* literals: 0xh.hhhp[+-]?NNN, e.g. 0x1.47ae147ae147bp-7 is 0.1
* printf %a %A
* input (PV->NV): "0xh.hhhpnnn" + 3
Lack of %a noted by Dan Kogai: https://groups.google.com/d/msg/perl.perl5.porters/c84JU0olnbQ/YwQczyrqE2YJ Pointer given by Dan: http://en.wikipedia.org/wiki/Printf_format_string#Type
Possibly useful resource: http://www.exploringbinary.com/hexadecimal-floating-point-constants/ found by quick googling.
Ruby does support %a and %A, as noted by Dan, and Python has float.hex() and float.fromhex().
* literals: 0xh.hhhp[+-]?NNN, e.g. 0x1.47ae147ae147bp-7 is 0.1
Oops, 0.01
On Wed, Jul 02, 2014 at 07:49:46PM -0700, Jarkko Hietaniemi wrote:
Perl could support hexadecimal floats:
* literals: 0xh.hhhp[+-]?NNN, e.g. 0x1.47ae147ae147bp-7 is 0.1
Wouldn't that change the meaning of existing legal syntax: e.g.
print 0x1.10;
which currently prints "110", but would change to print "1.0625"
-- No matter how many dust sheets you use, you will get paint on the carpet.
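The two readings of `0x1.10` in the message above can be compared with a small sketch. It is written in Python only because the thread cites Python's float.fromhex(); it is an illustration of the arithmetic, not of how perl tokenizes anything. The `p+0` suffix is added so that the string is an unambiguous hex float.

```python
# Reading 1: what perl does today with `print 0x1.10;`
# hex integer 0x1, the concatenation operator ".", then decimal integer 10.
as_concatenation = str(0x1) + str(10)

# Reading 2: the same digits taken as a hexadecimal fraction, 1 + 1/16.
# float.fromhex is Python's parser, standing in for the proposed literal.
as_hex_float = float.fromhex("0x1.10p+0")

print(as_concatenation)  # 110
print(as_hex_float)      # 1.0625
```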
The RT System itself - Status changed from 'new' to 'open'
Dave Mitchell <davem@iabyn.com> writes:
On Wed, Jul 02, 2014 at 07:49:46PM -0700, Jarkko Hietaniemi wrote:
Perl could support hexadecimal floats:
* literals: 0xh.hhhp[+-]?NNN, e.g. 0x1.47ae147ae147bp-7 is 0.1
^^^^^^^^^ ^^^
Wouldn't that change the meaning of existing legal syntax: e.g.
print 0x1.10;
which currently prints "110", but would change to print "1.0625"
$ perl -e 'print 0x1.10p+0'
Bareword found where operator expected at -e line 1, near "10p"
(Missing operator before p?)
syntax error at -e line 1, near "10p "
Execution of -e aborted due to compilation errors.
-- "A disappointingly low fraction of the human race is, at any given time, on fire." - Stig Sandbeck Mathisen
On Thursday-201407-03, 8:26, Dagfinn Ilmari Mannsåker via RT wrote:
$ perl -e 'print 0x1.10p+0'
Bareword found where operator expected at -e line 1, near "10p"
(Missing operator before p?)
syntax error at -e line 1, near "10p "
Execution of -e aborted due to compilation errors.
Yeah, I think the 'p' (hmm, is that 'P' with %A?) is a mandatory part of the package.
On Thu, Jul 3, 2014 at 2:34 PM, Jarkko Hietaniemi <jhi@iki.fi> wrote:
Yeah, I think the 'p' (hmm, is that 'P' with %A?) is a mandatory part of the package.
sub deadbeefp () {3}
0x1.deadbeefp+0
Personally, I think adding the construct + a deprecation warning for pathological cases is a good enough (tm) tradeoff.
On Thu, Jul 03, 2014 at 01:25:54PM +0100, Dagfinn Ilmari Mannsåker wrote:
$ perl -e 'print 0x1.10p+0'
Bareword found where operator expected at -e line 1, near "10p"
(Missing operator before p?)
syntax error at -e line 1, near "10p "
Execution of -e aborted due to compilation errors.
Ah sorry, didn't spot the p.
-- You're only as old as you look.
Yeah, I think the 'p' (hmm, is that 'P' with %A?) is a mandatory part of the package.
sub deadbeefp () {3}
0x1.deadbeefp+0
You have a twisted mind, and this is a compliment.
Personally, I think adding the construct + a deprecation warning for pathological cases is a good enough (tm) tradeoff.
Based on http://grep.cpan.me/?q=0x%5B0-9a-f%5D%2B%5C.%5B0-9a-f%5D%2Bp%5B%2B-%5D%5Cd%2B (that's /0x[0-9a-f]+\.[0-9a-f]+p[+-]\d+/) I wouldn't bother even with a warning. (All the hits seem to be to modules which already somehow try to handle this currently non-native format.)
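As a rough illustration of what that search pattern does and does not catch, here is a sketch using Python's re module. The pattern is copied from the message above; the sample strings and the helper name find_hexfloats are made up for this example and are not taken from CPAN.

```python
import re

# Pattern copied from the grep.cpan.me query in the thread (case-sensitive as written).
HEXFLOAT = re.compile(r"0x[0-9a-f]+\.[0-9a-f]+p[+-]\d+")

def find_hexfloats(source):
    """Return every substring that looks like a hex float with a signed exponent."""
    return HEXFLOAT.findall(source)

# Hypothetical snippets of Perl source, invented for this illustration.
print(find_hexfloats("my $x = 0x1.47ae147ae147bp-7;"))  # ['0x1.47ae147ae147bp-7']
print(find_hexfloats("print 0x1.10;"))                  # []
print(find_hexfloats("0x1.deadbeefp+0"))                # ['0x1.deadbeefp+0']
print(find_hexfloats("0x1.ap1"))                        # [] (no sign after the p)
```

Note that the pattern requires an explicit sign in the exponent, so an unsigned exponent would slip past this particular search.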
So I did some hacking to get this working for at least *printf and literals, and two patches are attached. I cheated and just punted to using sprintf/strtod.
However: the "hexadecimal floats" support seems to be quite... interesting. As in "interesting times" interesting.
So it's a C99 feature. Output with sprintf %a %A, input with strtod (or strtold). In theory.
The attached patches (and their tests) work with:
* OS X x86
* Linux x86
* Linux x86 -Duselongdouble
(I *think* the output side at least did work in win32, but the win32 smoker must be overwhelmed or something, I seem to get no results)
But cracks start to appear...
* OS X x86 with -Duselongdouble has differences in the *printf output
* Solaris x86 fails completely on input (as if strtod would not parse hexfloats at all, haven't dug into it)
On the output side differences are easy since we are talking about floats: the exponent may float. 0x1.999999999999ap-4 is 0xc.ccccccccccccccdp-7 (Linux "normal" doubles vs "long doubles")
But even what the basic %a means seems to be up to interpretation:
not ok 1420 - '%a' '1' -> '0x1.0000000000000p+0' cf '0x1p+0' (Solaris)
But if strtod is not working, I don't feel like rewriting David Gay's dtoa.c (which is the canonical strtod source that many operating systems, such as the BSDs, and other OSS projects use): http://www.netlib.org/fp/dtoa.c
If output is not working (or needs to be standardized), we need to dig into the fp bits ourselves. I found this in NetBSD: https://github.com/rumpkernel/netbsd-userspace-src/blob/master/lib/libc/gdtoa/hdtoa.c
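The output differences described above are differences in spelling, not in value. A minimal sketch in Python, assuming only its built-in float.fromhex() and float.hex(): Python's own float.hex() happens to pad to 13 hex digits, so it prints 1.0 the same way the Solaris result above does. The helper canonical_hex is a name invented here; it strips the padding to get one spelling to compare against in tests.

```python
def canonical_hex(text):
    """Re-spell a finite hex float with a leading 1 and no trailing zeros.

    Sketch only: inf and nan are not handled, and the value is rounded to a
    64-bit double on the way through, so long double digits are lost.
    """
    value = float.fromhex(text)
    mantissa, exponent = value.hex().split("p")
    if "." in mantissa:
        mantissa = mantissa.rstrip("0").rstrip(".")
    return f"{mantissa}p{exponent}"

print(canonical_hex("0x1.0000000000000p+0"))    # 0x1p+0
print(canonical_hex("0x1p+0"))                  # 0x1p+0
print(canonical_hex("0xc.ccccccccccccccdp-7"))  # 0x1.999999999999ap-4
```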
Jarkko Hietaniemi via RT <perlbug-comment@perl.org> wrote:
So I did some hacking to get this working for at least *printf and literals, and two patches are attached.
Excellent, thanks!
I cheated and just punted to using sprintf/strtod.
(I *think* the output side at least did work in win32, but the win32 smoker must be overwhelmed or something, I seem to get no results)
According to this page:
http://msdn.microsoft.com/en-us/library/hf4y5e3w(v=vs.71).aspx
the compiler in Visual Studio 2003 doesn't support %a formats in printf. AIUI, we aim to support VC6, which I assume also doesn't support %a. So I think punting to sprintf/strtod for hex-float support, while admirably tempting from a laziness point of view, may not be a viable approach, at least on win32.
Corrections welcome from anyone who knows anything about win32.
On the output side differences are easy since we are talking about floats: the exponent may float. 0x1.999999999999ap-4 is 0xc.ccccccccccccccdp-7 (Linux "normal" doubles vs "long doubles")
I think that's not terribly unreasonable. An IEEE double has 53 bits of significand, which can be emitted with a single bit (whose value is 1 except in denormals) before the hexadecimal point, and thirteen hex digits (four bits apiece) after it. An x86 long double, on the other hand, has 63 bits of significand, so emitting 3 bits before the point and 15 nybbles after it seems straightforward.
But I take your point that it's somewhat vexing for these purposes.
But even what the basic %a means seems to be up to interpretation: not ok 1420 - '%a' '1' -> '0x1.0000000000000p+0' cf '0x1p+0' (Solaris)
That's undeniably a fairly cruddy %a implementation (in the sense that if you wanted all those extra digits you'd surely ask for them) but it's not actually *wrong*. Which is, yes, also vexing for our purposes.
But if strtod is not working, I don't feel like rewriting David Gay's dtoa.c (which is the canonical strtod source for many operating systems, like BSD, or other OSS projects use): http://www.netlib.org/fp/dtoa.c
If output is not working (or needs to be standardized), we need to dig into the fp bits ourselves. I found this from the NetBSD: https://github.com/rumpkernel/netbsd-userspace-src/blob/master/lib/libc/gdtoa/hdtoa.c
As far as I know, it's possible to implement hex float I/O without bit-banging as long as you've got ldexp, frexp, isnormal, isnan, and isinf. But I doubt very much whether those can reliably be found on older systems that lack hex-float support in strtod and %a in sprintf. :-(
What would happen if we borrowed one of the other implementations wholesale? Are there any licensing issues getting in the way?
-- Aaron Crane ** http://aaroncrane.co.uk/
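The frexp/ldexp approach mentioned above can be sketched as follows. This is a Python illustration using math.frexp and math.ldexp, which mirror the C functions of the same names; it is not the code that later went into perl, and the function names are invented here. It handles finite doubles only and always emits a leading 1 with no trailing zeros.

```python
import math

def format_hex_float(x):
    """Format a finite double as hex float text using frexp and exact arithmetic."""
    if math.isnan(x) or math.isinf(x):
        raise ValueError("finite values only in this sketch")
    sign = "-" if math.copysign(1.0, x) < 0 else ""
    x = abs(x)
    if x == 0.0:
        return sign + "0x0p+0"
    m, e = math.frexp(x)   # x == m * 2**e with 0.5 <= m < 1
    m, e = m * 2.0, e - 1  # rescale so that 1 <= m < 2
    m -= 1.0               # drop the leading 1; every step here is exact
    digits = []
    while m:
        m *= 16.0          # shift the next four bits above the point
        d = int(m)
        digits.append("0123456789abcdef"[d])
        m -= d
    point = "." + "".join(digits) if digits else ""
    return f"{sign}0x1{point}p{e:+d}"

def parse_hex_float(text):
    """Parse 0xh.hhhp[+-]nnn text with ldexp; assumes at most 53 significant bits."""
    body, _, exp_text = text.lower().partition("p")
    sign = -1.0 if body.startswith("-") else 1.0
    body = body.lstrip("+-")
    if not body.startswith("0x") or not exp_text:
        raise ValueError("expected 0xh.hhhp[+-]nnn")
    whole, _, frac = body[2:].partition(".")
    mantissa = int(whole + frac, 16)          # all hex digits as one integer
    exponent = int(exp_text) - 4 * len(frac)  # each fraction digit is worth 2**-4
    return sign * math.ldexp(mantissa, exponent)

print(format_hex_float(0.1))                    # 0x1.999999999999ap-4
print(parse_hex_float("0x1.47ae147ae147bp-7"))  # 0.01
```

A real implementation would also need to decide what to do with Inf, NaN, and inputs carrying more digits than the target type holds, none of which this sketch attempts.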
0x1.999999999999ap-4 is 0xc.ccccccccccccccdp-7 (Linux "normal" doubles vs "long doubles")
I think that's not terribly unreasonable. An IEEE double has 53 bits of significand, which can be emitted with a single bit (whose value is 1 except in denormals) before the hexadecimal point, and thirteen hex digits (four bits apiece) after it. An x86 long double, on the other hand, has 63 bits of significand, so emitting 3 bits before the point and 15 nybbles after it seems straightforward.
I should have included more examples, I think Solaris provided those... it's not just due to long doubles. I don't have a C99 spec in front of me, but I doubt how well defined the format is...
But even what the basic %a means seems to be up to interpretation: not ok 1420 - '%a' '1' -> '0x1.0000000000000p+0' cf '0x1p+0' (Solaris)
That's undeniably a fairly cruddy %a implementation (in the sense that if you wanted all those extra digits you'd surely ask for them) but it's not actually *wrong*. Which is, yes, also vexing for our purposes.
For example: what is the '%a' supposed to "optimize for"? As few hexdigits before the "." as possible? Maximize the exponent? Minimize it? Steer it towards the closest/lowest/highest exponent divisible by four? By eight?
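To show how much freedom the format leaves, the sketch below spells one double four ways, with one to four significand bits before the point. It is a Python illustration with an invented helper name (hex_variants), assuming a positive, finite, nonzero 64-bit double; all four strings parse back to the same value.

```python
import math

def hex_variants(x):
    """Spell a positive finite double with 1, 2, 3 and 4 bits before the point."""
    m, e = math.frexp(x)              # x == m * 2**e with 0.5 <= m < 1
    n = int(math.ldexp(m, 53))        # the significand as an exact 53-bit integer
    variants = []
    for k in range(1, 5):
        frac_bits = 53 - k
        lead = n >> frac_bits                 # the k bits before the point
        frac = n & ((1 << frac_bits) - 1)     # everything after the point
        pad = -frac_bits % 4                  # left-align to whole hex digits
        ndigits = (frac_bits + pad) // 4
        digits = format(frac << pad, "x").zfill(ndigits).rstrip("0")
        point = "." + digits if digits else ""
        variants.append(f"0x{lead:x}{point}p{e - k:+d}")
    return variants

for text in hex_variants(0.1):
    print(text, float.fromhex(text) == 0.1)
```

For 1.0 this yields 0x1p+0, 0x2p-1, 0x4p-2 and 0x8p-3, which is the kind of choice the question above is asking about.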
As far as I know, it's possible to implement hex float I/O without bit-banging as long as you've got ldexp, frexp, isnormal, isnan, and isinf. But I doubt very much whether those can reliably be found on older systems that lack hex-float support in strtod and %a in sprintf. :-(
Indeed.
(Which reminds me that our inf/nan support is still a bit dubious.)
What would happen if we borrowed one of the other implementations wholesale? Are there any licensing issues getting in the way?
BSD-licensed code is no problem; we have historically borrowed and used that... mergesort, for example, and drand48_r.
For the netlib code, somebody with legal chops would have to take a look for compatibility with Artistic/GPL. Not that I expect any problems, since e.g. Python includes it.
Solaris x86 fails completely on input (as if strtod would not parse hexfloats at all, haven't dug into it)
Now did. Ugh.
In Solaris 10, strtod must be in "c99 mode" for the hexfloats to be recognized. (strtold is always in this mode). The "c99 mode" is achieved by using "c99" as the Solaris Studio compiler (driver), instead of "cc".
In Solaris 9 (or earlier), there is no support for hexfloats. (Not blaming Solaris in particular: I'm pretty certain many older OS releases will be similarly C99-unsupportive.)
If one is not using Solaris Studio cc (something beginning with g, maybe), one can live dangerously and explicitly link in either of /usr/lib/{32,64}/values-xpg6.o and get the "c99 strtod". Dangerous living because probably many other things get "upgraded", too.
Executive summary: using the netlib dtoa.c (*) is starting to sound even more siren-like.
(*) an odd name, given that it's a strtod implementation...
[dtoa.c] an odd name, given that it's a strtod implementation...
Good news, everyone... the netlib dtoa.c contains *both* strtod() and dtoa(), the latter usable for sprintfing.
It is quite widely used: Python, PHP, and *Java*; and Chrome, Firefox, and Safari.
More useful reading: http://www.exploringbinary.com/how-strtod-works-and-sometimes-doesnt/ (note that this article is 2 years old, the bugs referred to have been corrected)
From https://rt-archive.perl.org/perl5/Ticket/Display.html?id=122482:
And since we are not really depending on the system strtods anyway (except for nan/inf), it looks like for the hexadecimal fp "strtod-ing" it would be better just to implement our own. This would not, however, solve the hexadecimal fp output.
On the hexadecimal output the killer wording in C99 seems to be that trailing zeros *may* be printed. And this is what Solaris does, but glibc (Linux), and whatever is used in OS X, do not.
On Thu Jul 03 08:18:01 2014, jhi wrote:
Yeah, I think the 'p' (hmm, is that 'P' with %A?) is a mandatory part of the package.
sub deadbeefp () {3}
0x1.deadbeefp+0
You have a twisted mind, and this is a compliment.
Personally, I think adding the construct + a deprecation warning for pathological cases is a good enough (tm) tradeoff.
Based on http://grep.cpan.me/?q=0x%5B0-9a-f%5D%2B%5C.%5B0-9a-f%5D%2Bp%5B%2B-%5D%5Cd%2B (that's /0x[0-9a-f]+\.[0-9a-f]+p[+-]\d+/) I wouldn't bother even with a warning. (All the hits seem to be to modules which already somehow try to handle this currently non-native format.)
This came up on the list a couple of years ago. At the time I think the consensus was to allow parser plugins to extend the syntax, instead of hard-coding one of them into toke.c.
When we first tried to reserve this syntax (or something similar) by deprecating 0xf00 followed by a dot, several cases showed up in the perl tests themselves. I think they got changed, masking the fact that such syntax already occurs in real life.
Now this is all from memory without actually looking anything up....
--
Father Chrysostomos
This came up on the list a couple of years ago. At the time I think the consensus was to allow parser plugins to extend the syntax, instead of hard-coding one of them into toke.c.
Having looked at toke.c now for a while, I think the plugin plan is wishful thinking unless something drastic happens first.
When we first tried to reserve this syntax (or something similar) by deprecating 0xf00 followed by a dot, several cases showed up in the perl tests themselves. I think they got changed, masking the fact that such syntax already occurs in real life.
I would find that surprising... the "pEXPONENT" part is currently a syntax error.
For better or worse\, I have now submitted
http://perl5.git.perl.org/perl.git/commit/dc91db6 Configure scan for the kind of long double we have
http://perl5.git.perl.org/perl.git/commit/688e39e5 Configure scan for ldexpl
http://perl5.git.perl.org/perl.git/commit/98181445 Perl_ldexp is one of ldexpl, scalbnl, or ldexp
http://perl5.git.perl.org/perl.git/commit/40bca5ae9 Hexadecimal float sprintf
http://perl5.git.perl.org/perl.git/commit/61e61fbc Hexadecimal float literals
which implement hexadecimal floats, without depending on C99 or using system printf/strtod.
The dc91db6 will probably contain many bad guesses for the non-Configure platforms. Only smokes will tell.
On Thu, Aug 14, 2014 at 6:56 AM, Jarkko Hietaniemi via RT <perlbug-comment@perl.org> wrote:
The dc91db6 will probably contain many bad guesses for the non-Configure platforms. Only smokes will tell.
Would there be any advantage in toke.c to using Uquad_t or U64TYPE (where available) rather than UV for the chunk that holds the mantissa? The size chosen for Perl's integers doesn't necessarily reflect what's available on the platform.
On Thursday-201408-14, 8:51, Craig A. Berry wrote:
The dc91db6 will probably contain many bad guesses for the non-Configure platforms. Only smokes will tell.
Would there be any advantage in toke.c to using Uquad_t or U64TYPE (where available) rather than UV for the chunk that holds the mantissa? The size chosen for Perl's integers don't necessarily reflect what's available on the platform?
Ah, good point. As a matter of fact, I use that very fact in sv.c already (look for MANTISSATYPE). I'll take a look in a couple of days once we see how widespread the damage from this first batch is.
(I also need to think more carefully what happens/should happen at floating point "extremities" like Inf and NaN.)
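Craig's point about accumulator width can be illustrated with a toy model. The sketch below is Python, not toke.c code: accumulate is a hypothetical helper that collects hex digits into an integer limited to a given number of bits and counts how many digits no longer fit. The 53-bit significand of an ordinary double needs 14 hex digits, which a 32-bit accumulator cannot hold.

```python
def accumulate(hex_digits, width_bits):
    """Collect hex digits into an integer of at most width_bits bits.

    Returns (accumulated_value, number_of_digits_dropped). Hypothetical model
    of a fixed-width mantissa accumulator, for illustration only.
    """
    acc = 0
    dropped = 0
    for ch in hex_digits:
        if acc >> (width_bits - 4):   # the top nybble is already occupied
            dropped += 1
            continue
        acc = (acc << 4) | int(ch, 16)
    return acc, dropped

digits = "1999999999999a"             # significand digits of 0x1.999999999999ap-4
print(accumulate(digits, 32))         # 8 digits kept, 6 dropped
print(accumulate(digits, 64))         # all 14 digits kept
```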
Jarkko Hietaniemi via RT <perlbug-comment@perl.org> wrote:
Hurrah! Thanks very much for this.
Earlier in this ticket, Brian Fraser pointed out the existence of cases like this:
sub ap1 { 'z' }
is 0x1.ap1, '1z';
Jarkko reports having found no such affected code using grep.cpan.me, and I freely stipulate that any code whose meaning changes in the presence of hex float literals (like this example) would be somewhat pathological. However, I do find myself wondering whether hex float literals should be accepted only in the presence of a suitable feature.
Any thoughts? Am I worrying unnecessarily?
-- Aaron Crane ** http://aaroncrane.co.uk/
On Thursday-201408-14, 9:13, Aaron Crane wrote:
However, I do find myself wondering whether hex float literals should be accepted only in the presence of a suitable feature.
I would wait for Andreas' CPAN smokes.
On Thursday-201408-14, 9:13, Aaron Crane wrote:
Jarkko reports having found no such affected code using grep.cpan.me,
... and off-hand, all the hits from grep.cpan.me seem to be in strings, comments, or alien formats (MATLAB).
On 08/14/2014 05:56 AM, Jarkko Hietaniemi via RT wrote:
Thanks for this.
Looking at the code, one minor thing jumped out at me, and that is we now have in handy.h two macros XDIGIT_VALUE(c) and READ_XDIGIT(s) (originally contributed by Yves IIRC) that I think are both faster and clearer than using PL_hexdigit, and all previous core uses of strchr() and PL_hexdigit had been converted to use these.
On Thursday-201408-14, 13:24, Karl Williamson wrote:
Looking at the code, one minor thing jumped out at me, and that is we now have in handy.h two macros XDIGIT_VALUE(c) and READ_XDIGIT(s)
Thanks, adding to the "followup todo" notes I'm keeping on this.
I now pushed a bunch of cleanups for this (thanks for all the comments), including a fix for the one serious bug found so far: the code was broken on little-endian :-( [with usual 64-bit IEEE 754 double] but H.Merijn's HP-UX (PA) showed me the error of my ways.
I also tried to prepare for weirder combinations like having no quads to extract the mantissa bits to, or the double-double platforms (which currently don't really extract the bits from the double-doubles but instead lossily use the frexp+ldexp path).
On Thursday-201408-14, 9:16, Jarkko Hietaniemi wrote:
I would wait for Andreas' CPAN smokes.
Andreas reports no breakages.
On 14 August 2014 19:24, Karl Williamson <public@khwilliamson.com> wrote:
Looking at the code, one minor thing jumped out at me, and that is we now have in handy.h two macros XDIGIT_VALUE(c) and READ_XDIGIT(s) (originally contributed by Yves IIRC) that I think are both faster and clearer than using PL_hexdigit, and all previous core uses of strchr() and PL_hexdigit had been converted to use these.
Which I see you craftily rewrote to be more efficient. :-)
Nice stuff Karl.
I always learn cool bit-twiddling tricks from your code. It's nice.
Yves
-- perl -Mre=debug -e "/just|another|perl|hacker/"
On Thursday-201408-14, 22:51, Jarkko Hietaniemi wrote:
I now pushed a bunch of cleanups for this (thanks for all the comments), including fix for the one serious bug found so far: the code was broken on little-endian :-( [with usual 64-bit IEEE 754 double] but H.Merijn's HP-UX (PA) showed me the error of my ways.
I also tried to prepare for weirder combinations like having no quads to extract the mantissa bits to, or the double-double platforms (which currently don't really extract the bits from the double-doubles but instead lossily use the frexp+ldexp path).
And another batch of cleanups. I now bravely think that big-endian works, and that the "double-double" (e.g. AIX) also works.
Remaining issues:
- Windows running on Itanium? The canned configs all say that no long double for you, though. But Itanium does have hardware IEEE 754 "quadruples". No compiler support?
- VMS? Runs across three architectures: Itanium or Alpha or VAX. I assumed 128-bit "true" IEEE 754 for all of them (and little-endian).
- the double-double support code was basically a wild guess, and even if it works, sprintf2 doesn't test for it.
On Fri, Aug 15, 2014 at 10:14 AM, Jarkko Hietaniemi <jhi@iki.fi> wrote:
- VMS? Runs across three architectures: Itanium or Alpha or VAX. I assumed 128-bit "true" IEEE 754 for all of them (and little-endian).
On OpenVMS I64 as of v5.21.2-156-gd8bcb4d with -Duse64bitint -Duselongdouble I get:
$ perl -e "$x = sprintf(qq/%A/, 0);"
assert error: expression = vend < vdig + sizeof(vdig), in file D0:[craig.blead]sv.c;1 at line 11759
Dunno what's wrong yet.
On Fri, Aug 15, 2014 at 4:10 PM, Craig A. Berry <craig.a.berry@gmail.com> wrote:
The VMS debugger shows the following:
SV\Perl_sv_vcatpvfn_flags\%LINE 96933\vend: 2060475744
SV\Perl_sv_vcatpvfn_flags\%LINE 96933\vdig[0:31] [0]-[31]: 0 2060475712
DBG> evaluate sizeof(vdig)
32
DBG> evaluate vend < vdig + sizeof(vdig)
%DEBUG-I-SCALEADD, pointer addition: scale factor of 1 applied to right argument
0
So the assertion 2060475744 < 2060475712 + 32 is false because the LHS is actually equal to, not less than, the RHS. I don't understand the code well enough to know what that means.
On Friday, August 15, 2014, Craig A. Berry <craig.a.berry@gmail.com> wrote:
So the assertion 2060475744 < 2060475712 + 32 is false because the LHS is actually equal, not less than, the RHS. I don't understand the code well enough to know what that means.
Neither do I, I just recently wrote it...
That means that for some reason v (the pointer for the hexdigits, really 0-15, not the '0'..'f') has extended all the way to the end of the buffer. I see why, I think... I will push a branch.
-- There is this special biologist word we use for 'stable'. It is 'dead'. -- Jack Cohen
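The debugger numbers above can be checked with plain arithmetic. This Python sketch only restates the values Craig reported; it does not claim to show which fix was eventually applied, since the thread does not say.

```python
# Values copied from the VMS debugger output in the thread.
vdig = 2060475712   # address where the digit buffer starts
vdig_size = 32      # sizeof(vdig)
vend = 2060475744   # where the digit pointer ended up

one_past_end = vdig + vdig_size
strict_holds = vend < one_past_end       # the assertion as written
inclusive_holds = vend <= one_past_end   # what the numbers would satisfy
slots_used = vend - vdig

print(strict_holds, inclusive_holds, slots_used)  # False True 32
```

In other words the pointer sits exactly one past the last of the 32 slots, which matches the description of v running all the way to the end of the buffer.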
-----Original Message----- From: Craig A. Berry
$ perl -e "$x = sprintf(qq/%A/\, 0);" assert error: expression = vend \< vdig + sizeof(vdig)\, in file D0:[craig.blead]sv.c;1 at line 11759
At least that one works correctly for me on (debian wheezy) powerpc64 perl-5.21.3\, built from yesterday's git with -Duselongdouble (double-double).
Here's some values that don't look right\, however:
For 1e-298\, the 2 doubles (most significant first) are 0210be08d0527e1d and 0000000069c4b77f\, both of which are positive values.
If I do 'printf "%A"\, 1e-298;' then I get: 0XB.E08D0527E1D000069C4B77FP-991
Those 4 zeroes in the middle are wrong - they should appear at the end. (This probably just means that the value of the exponent of the least significant double has been miscalculated.) But I think it's also incorrect at the start. The most siginificant 13 bits of the mantissa (including the implied leading '1') are 1000010111110 - which doesn't correlate at all well with 0XB.E0 Data::Float::DoubleDouble gives the following hex value of the double-double 1e-298: +0x1.0be08d0527e1d69c4b77f000000p-990
(In the Data::Float::DoubleDouble representation, I opted to have the first character be the leading 0 or 1 .... which leaves 105 bits .... which needs 27 hex characters, the last of which can only be either 8 or 0 (as the last 3 bits are always zero). I did that to retain some correlation between the representation of the value, and the actual hex-encoding of the double-double. And then, as it turns out, C's "%La" does exactly the same formatting, which is quite fortuitous ... hell, I didn't even know C was capable of hex formatting of double-doubles until just now!)
Another value I looked at was 193e-3. In this case the 2 doubles are 3fc8b4395810624e and bc56872b020c49ba - the first of which is a positive value; the second being *negative*. Therefore the actual value of the double-double is going to be less than the value of the most significant double. However, 'printf "%A", 193e-3;' outputs: 0XB.4395810624E872B020C49BAP-4
Again, the prefix looks wrong - the most significant 13 bits are 1100010110100. Also, if the most significant double ends in "4395810624E" we would expect that, following the subtraction, we would see "4395810624D" (or less), but we still see "4395810624E" in there.
Data::Float::DoubleDouble says +0x1.8b4395810624dd2f1a9fbe76c8cp-3 (and I'll have to investigate how the final hex char came to be something other than "8" or "0" ;-)
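The "most significant 13 bits" quoted for 1e-298 and 193e-3 can be reproduced from the bit patterns of the most significant doubles alone. This is a minimal Python sketch, not anything from try.pl; the helper names decode_double and leading_mantissa_bits are made up for illustration, and it assumes the hex strings are big-endian IEEE 754 binary64 patterns as printed in the message.

```python
import struct

def decode_double(hex_bits):
    """Split a 16-hex-digit IEEE 754 binary64 pattern into (sign, biased exponent, fraction)."""
    raw = int(hex_bits, 16)
    sign = raw >> 63
    biased_exp = (raw >> 52) & 0x7FF
    fraction = raw & ((1 << 52) - 1)
    return sign, biased_exp, fraction

def leading_mantissa_bits(hex_bits, nbits=13):
    """First nbits of the 53-bit significand, implied leading 1 included (normal numbers only)."""
    _sign, biased_exp, fraction = decode_double(hex_bits)
    if biased_exp == 0 or biased_exp == 0x7FF:
        raise ValueError("not a normal number")
    significand = (1 << 52) | fraction
    return format(significand >> (53 - nbits), "0{}b".format(nbits))

hi_1e_298 = "0210be08d0527e1d"
hi_193e_3 = "3fc8b4395810624e"

print(leading_mantissa_bits(hi_1e_298))   # 1000010111110
print(leading_mantissa_bits(hi_193e_3))   # 1100010110100

# Python's own hex formatting of the most significant double of 1e-298.
print(struct.unpack(">d", bytes.fromhex(hi_1e_298))[0].hex())   # 0x1.0be08d0527e1dp-990
```

Both bit strings start with the implied 1 followed by the first hex digits of the fraction (0BE and 8B4), which is why an output beginning 0XB.E0 or 0XB.43 looks like it has lost a leading nybble.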
I also looked at 2 ** 200. That came out as 0X0P+0. I'm guessing it has looked at the mantissa, seen only zeroes, forgotten about the implied leading "1", and decided the value was zero.
The fourth value I looked at was 2 ** 0.5. As with 193e-3, the least significant double is negative - which again seems to have been overlooked. The 2 doubles are 3ff6a09e667f3bcd and bc9bdd3413b26456, and 'printf "%A", 2 ** 0.5;' outputs: 0XA.09E667F3BCDDD3413B26456P-1
Correct value is 0x1.6a09e667f3bcc908b2fb1366ea8p0
The actual script I ran is attached (try.pl), but to run it you'll need to be on a machine whose long double is double-double, and whose perl was built with -Duselongdouble. Also attached is the output of the script (out.txt).
Btw, I've just checked that the above Data::Float::DoubleDouble values agree with C's "%La" output, and they do - except for the final "c" in the second example (which should be 8 ... and I'll have to work out how that 107th bit got set).
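For readers without a double-double machine, the expected strings above can be cross-checked with exact rational arithmetic. The following is only a sketch in Python: it is not the algorithm in sv.c, nor the one in Data::Float::DoubleDouble. It assumes a double-double is simply the exact sum of its two doubles, truncates rather than rounds, and the function name double_double_hex and the 27-digit default are chosen here to mirror the format described in the message.

```python
import struct
from fractions import Fraction

def bits_to_float(hex_bits):
    """Interpret 16 hex digits as a big-endian IEEE 754 binary64 value."""
    return struct.unpack(">d", bytes.fromhex(hex_bits))[0]

def double_double_hex(hi_bits, lo_bits, frac_digits=27):
    """Hex-float string for the exact sum hi + lo, with a leading 1 digit."""
    total = Fraction(bits_to_float(hi_bits)) + Fraction(bits_to_float(lo_bits))
    if total == 0:
        return "0x0p+0"
    sign = "-" if total < 0 else "+"
    total = abs(total)
    # Find e such that 1 <= total / 2**e < 2.
    e = total.numerator.bit_length() - total.denominator.bit_length()
    if Fraction(2) ** e > total:
        e -= 1
    frac = total / Fraction(2) ** e - 1          # fractional part, in [0, 1)
    digits = int(frac * 16 ** frac_digits)       # truncated, not rounded
    return f"{sign}0x1.{digits:0{frac_digits}x}p{e}"

# The four double-doubles from the message, as (most significant, least significant).
print(double_double_hex("0210be08d0527e1d", "0000000069c4b77f"))  # 1e-298
print(double_double_hex("3fc8b4395810624e", "bc56872b020c49ba"))  # 193e-3 as perl stored it
print(double_double_hex("4c70000000000000", "0000000000000000"))  # 2 ** 200
print(double_double_hex("3ff6a09e667f3bcd", "bc9bdd3413b26456"))  # 2 ** 0.5
```

Under those assumptions the four lines come out identical to the Data::Float::DoubleDouble strings quoted above, including the trailing "c" for 193e-3, because a negative least significant double is subtracted exactly rather than having its digits appended.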
Thanks for taking this on, Jarkko. Apologies that I haven't come up with something more constructive than "this is wrong and that ain't right".
Cheers, Rob
1000010111110000010001101000001010010011111100001110101101001110001001011011 10111111100000000000000000000000
The 2 doubles (most siginificant first): (+) 0210be08d0527e1d, (+) 0000000069c4b77f
0210be08d0527e1d0000000069c4b77f
0XB.E08D0527E1D000069C4B77FP-991
+0x1.0be08d0527e1d69c4b77f000000p-990

1100010110100001110010101100000010000011000100100110111010010111100011010100 11111101111100111011011001000110
The 2 doubles (most siginificant first): (+) 3fc8b4395810624e, (-) bc56872b020c49ba
3fc8b4395810624ebc56872b020c49ba
0XB.4395810624E872B020C49BAP-4
+0x1.8b4395810624dd2f1a9fbe76c8cp-3

1000000000000000000000000000000000000000000000000000000000000000000000000000 00000000000000000000000000000000
The 2 doubles (most siginificant first): (+) 4c70000000000000, (+) 0000000000000000
4c700000000000000000000000000000
0X0P+0
+0x1.000000000000000000000000000p200

1011010100000100111100110011001111111001110111100110010010000100010110010111 11011000100110110011011101010100
The 2 doubles (most siginificant first): (+) 3ff6a09e667f3bcd, (-) bc9bdd3413b26456
3ff6a09e667f3bcdbc9bdd3413b26456
0XA.09E667F3BCDDD3413B26456P-1
+0x1.6a09e667f3bcc908b2fb1366ea8p0
-----Original Message----- From: sisyphus1@optusnet.com.au Sent: Sunday, August 17, 2014 8:40 PM
Another value I looked at was 193e-3. [snip] Data::Float::DoubleDouble says +0x1.8b4395810624dd2f1a9fbe76c8cp-3 (and I'll have to investigate how the final hex char came to be something other than "8" or "0" ;-)
I don't think this is central to this thread.
The setting of the last hex char to "c" arises from the (known) perl bug where the value that perl assigns to some NVs is off by one or more ULPs.
As regards 193e-3, instead of assigning the correct doubles (3fc8b4395810624e and bc56872b020c49bc), perl has assigned bc56872b020c49ba as the least significant double. This actually means that perl has assigned an illegitimate value to the double-double. I think 3fc8b4395810624ebc56872b020c49ba is not a valid double-double representation - and this is what throws out the calculations performed by D::F::DD.
We can force perl to assign the correct double-double representation (and this is the only way of doing it that I know of) by doing:
use Math::NV qw(:all);
$nv = nv('193e-3');
If we do that then the correct representation of 3fc8b4395810624ebc56872b020c49bc gets assigned to $nv, and D::F::DD then provides correct results.
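To put a number on the discrepancy, the two candidate least significant doubles can be compared directly. This is a small Python sketch using only the two bit patterns quoted above; it says nothing about why perl picked one over the other, and the variable names are illustrative.

```python
import struct

def bits_to_float(hex_bits):
    """Interpret 16 hex digits as a big-endian IEEE 754 binary64 value."""
    return struct.unpack(">d", bytes.fromhex(hex_bits))[0]

assigned = "bc56872b020c49ba"   # least significant double perl assigned, per the message
expected = "bc56872b020c49bc"   # least significant double reported as correct

# Both patterns share sign and exponent fields, so the integer difference
# of the bit patterns is the distance in ULPs of the least significant double.
ulp_distance = int(expected, 16) - int(assigned, 16)

# The same gap as a value; both doubles are negative, 'assigned' is the smaller in magnitude.
value_gap = bits_to_float(assigned) - bits_to_float(expected)

print(ulp_distance)      # 2
print(value_gap.hex())   # 0x1.0000000000000p-109
```

A gap of two ULPs in the low double is 2**-109 in absolute terms, which is exactly the difference between the trailing "c" and the expected "8" in the 27th hex digit of the 193e-3 output.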
I suppose D::F::DD could strive to detect and correct perl's mistakes, but that is not a high priority for me.
Cheers, Rob
On Sunday-201408-17, 6:40, sisyphus1@optusnet.com.au wrote:
At least that one works correctly for me on (debian wheezy) powerpc64 perl-5.21.3, built from yesterday's git with -Duselongdouble (double-double).
Here are some values that don't look right, however:
For 1e-298, the 2 doubles (most significant first) are 0210be08d0527e1d and 0000000069c4b77f, both of which are positive values
The currently-in-blead version is all sorts of wrong for IEEE 754 128-bit long doubles, and for double-doubles, sorry about that. I'm trying to stop breaking things, with help from Craig.
Thanks for taking this on, Jarkko. Apologies that I haven't come up with something more constructive than "this is wrong and that ain't right".
Get thee the http://perl5.git.perl.org/perl.git and retry.
It's probably still quite wrong for double-doubles, but at least it should be less wrong.
-----Original Message----- From: Jarkko Hietaniemi
It's probably still quite wrong for double-doubles, but at least it should be less wrong.
The value expressed for 2 ** 200 is a big improver ;-) It's now at 0X01P199 (which is off by a power of 2).
Of the other values I looked at last night, they seem to have changed only in the leading digits. What was "0X0A.BCDEF..." has been transformed into "0X01.ABCDEF ...", though the correct form begins "0X01.HABCDEF... " (where H stands for some hex digit).
For example, yesterday's blead presented 1e-298 as: 0XB.E08D0527E1D000069C4B77FP-991
Today's blead presents it as: 0X1.BE08D0527E1D000069C4B77FP-991
And the correct rendition is: 0X1.0BE08D0527E1D69C4B77FP-990
Even for an easily representable float such as 128.625 (where the entire value is held in the most significant double and the least significant double is 0), today's blead presents it as 0X1.14P+6, but the correct rendition is 0X1.014P+7.
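The 128.625 case needs no double-double hardware to check, since the whole value fits in one double. As an independent cross-check (Python here, using its standard float.hex() and float.fromhex() methods rather than anything from perl):

```python
value = 128.625

# Python always prints 13 fraction digits, so trailing zero nybbles appear.
print(value.hex())                    # 0x1.0140000000000p+7

# The rendition described as correct parses back to the original value...
print(float.fromhex("0x1.014p+7"))    # 128.625

# ...whereas the string blead produced denotes a different number entirely.
print(float.fromhex("0x1.14p+6"))     # 69.0
```

Dropping the zero nybble after the point turns the fraction .014 into .14, so the printed string is not merely off by a power of two.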
Anyway - good luck with it. (It would be nice to see this up and running with double-doubles, but it's not something that I'm reliant upon.)
Is it not possible for you to achieve the desired result via C's %La/%LA formatting?
Cheers, Rob
The value expressed for 2 ** 200 is a big improver ;-) It's now at 0X01P199 (which is off by a power of 2).
Of the other values I looked at last night, they seem to have changed only in the leading digits. What was "0X0A.BCDEF..." has been transformed into "0X01.ABCDEF ...", though the correct form begins "0X01.HABCDEF... " (where H stands for some hex digit).
If you could do:
grep longdblkind config.sh
I'll also email you some test code, the output of which would be of interest.
For example, yesterday's blead presented 1e-298 as: 0XB.E08D0527E1D000069C4B77FP-991
Today's blead presents it as: 0X1.BE08D0527E1D000069C4B77FP-991
And the correct rendition is: 0X1.0BE08D0527E1D69C4B77FP-990
Even for an easily representable float such as 128.625 (where the entire value is held in the most significant double and the least significant double is 0), today's blead presents it as 0X1.14P+6, but the correct rendition is 0X1.014P+7.
Anyway - good luck with it. (It would be nice to see this up and running with double-doubles, but it's not something that I'm reliant upon.)
Is it not possible for you to achieve the desired result via C's %La/%LA formatting?
That would leave us dependent on the vendors' implementations of C99. Two problems with this:
(1) C99 - which we do not require, and enabling which often requires various contortions while compiling: different cc wrapper, different flags, different libraries.
(2) there's wiggle room in the spec, which inevitably leads to diverging implementations. One example of wiggle room is whether to print the trailing zero nybbles. Another is the choice of lead xdigit/exponent alignment. Another huge one is what the heck to do with the long doubles... at least with our own implementation we get to make our own mistakes. (Cue in http://xkcd.com/927/)
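The first two kinds of wiggle room are easy to see with any hex-float parser. The sketch below uses Python's float.fromhex() purely as a neutral reader; the four spellings are examples picked here, not output captured from any particular C library.

```python
# Four spellings of the same double, differing only in trailing zero
# nybbles and in how the lead digit and exponent are aligned.
spellings = ["0x1.8p+0", "0x1.8000p+0", "0x3p-1", "0xcp-3"]

values = [float.fromhex(s) for s in spellings]
print(values)   # [1.5, 1.5, 1.5, 1.5]

# Python's own choice: lead digit 1, all 13 fraction nybbles printed.
print((1.5).hex())   # 0x1.8000000000000p+0
```

All four strings round-trip to 1.5, so output that differs between platforms can still be correct; that makes byte-for-byte test expectations fragile unless one implementation is used everywhere.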
Cheers, Rob
This is way, way implemented already.
@jhi - Status changed from 'open' to 'resolved'
Migrated from rt.perl.org#122219 (status was 'resolved')
Searchable as RT122219$