Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/
Other
1.85k stars 527 forks source link

String plus zero inconsistent across platforms #4046

Closed p5pRT closed 20 years ago

p5pRT commented 22 years ago

Migrated from rt.perl.org#7059 (status was 'resolved')

Searchable as RT7059$

p5pRT commented 22 years ago

From kevin@twicom.com

This is a bug report for perl from kpt3@​twicom.com\, generated with the help of perlbug 1.33 running under perl v5.6.1.


The following line is inconsistent across different platforms​:

print "0x1" + 0 . "\n";

Perl on Linux (v5.6.0) gives 1 Perl in cygwin(v5.6.1) gives 0 ActiveState (v5.6.0) gives 0

The Linux Perl came with RedHat 7.1

This seems to me to be a rather fundemental problem.

Kevin Twidle



Flags​:   category=core   severity=high


Site configuration information for perl v5.6.1​:

Configured by Fifer at Thu Dec 21 21​:37​:10 2000.

Summary of my perl5 (revision 5.0 version 6 subversion 1) configuration​:   Platform​:   osname=cygwin\, osvers=1.1.6(0.3032)\, archname=cygwin   uname='cygwin_nt-5.0 fifer 1.1.6(0.3032) 2000-11-21 21​:00 i686 unknown '   config_args='-de'   hint=recommended\, useposix=true\, d_sigaction=define   usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef   useperlio=undef d_sfio=undef uselargefiles=define usesocks=undef   use64bitint=undef use64bitall=undef uselongdouble=undef   Compiler​:   cc='gcc'\, ccflags ='-DPERL_USE_SAFE_PUTENV -DHAS_SBRK_PROTO -fno-strict-aliasing'\,   optimize='-O2'\,   cppflags='-DPERL_USE_SAFE_PUTENV -DHAS_SBRK_PROTO -fno-strict-aliasing'   ccversion=''\, gccversion='2.95.2-5 19991024 (cygwin experimental)'\, gccosandvers=''   intsize=4\, longsize=4\, ptrsize=4\, doublesize=8\, byteorder=1234   d_longlong=define\, longlongsize=8\, d_longdbl=define\, longdblsize=12   ivtype='long'\, ivsize=4\, nvtype='double'\, nvsize=8\, Off_t='off_t'\, lseeksize=4   alignbytes=8\, usemymalloc=y\, prototype=define   Linker and Libraries​:   ld='ld2'\, ldflags =' -s -L/usr/local/lib'   libpth=/usr/local/lib /usr/lib /lib   libs=-lgdbm -lcrypt   perllibs=-lcrypt   libc=/usr/lib/libc.a\, so=dll\, useshrplib=true\, libperl=libperl5_6_1.a   Dynamic Linking​:   dlsrc=dl_dlopen.xs\, dlext=dll\, d_dlsymun=undef\, ccdlflags=' -s'   cccdlflags=' '\, lddlflags=' -s -L/usr/local/lib'

Locally applied patches​:   v5.6.1-TRIAL1


@​INC for perl v5.6.1​:   /usr/lib/perl5/5.6.1/cygwin   /usr/lib/perl5/5.6.1   /usr/lib/perl5/site_perl/5.6.1/cygwin   /usr/lib/perl5/site_perl/5.6.1   /usr/lib/perl5/site_perl   .


Environment for perl v5.6.1​:   HOME=/home/kevin   LANG (unset)   LANGUAGE (unset)   LD_LIBRARY_PATH (unset)   LOGDIR (unset)   PATH=/usr/local/bin​:/usr/bin​:/bin​:/cygdrive/c/WINNT/system32​:/cygdrive/c/WIN NT​:/cygdrive/c/WINNT/System32/Wbem   PERL_BADLANG (unset)   SHELL=/bin/sh

p5pRT commented 22 years ago

From @rspier

This appears to be a system-specific problem with whatever is used to implement Atof (i.e. Perl_my_atof). It's strtold on some systems\, (Or atof or a bunch of other things). (And it's the _implementation_ thats the problem\, not the choice of function\, afaict.)

Some of the systems _do_ parse 0x1 as 1\, others parse it as 0.

The bad conversion (in this example) happens at Line 1822 in sv.c (5.6.1) (line 2331 in perl-current) in Perl_sv_2nv.

I seem to remember some discussion for the inclusion of a standard strtold (or similar) function into the core having to do with 64bitness stuff. That would also serve this bug.

-R

p.s. Jarrko- What do you think about including the mod_perl .gdbinit file in the core? It's less than 5k and it makes debugging much easier.

Linux
5.00503 5.6.0 5.6.1 perl-current   all print 1.

FreeBSD   1

5.6.0 Windows (AS 623)   prints 0

Solaris​: 4.036 5.00503 5.6.0 all print 0

Irix​: 4.036 5.001 all print 0.

Kevin Twidle writes​:

This is a bug report for perl from kpt3@​twicom.com\, generated with the help of perlbug 1.33 running under perl v5.6.1.

----------------------------------------------------------------- The following line is inconsistent across different platforms​:

print "0x1" + 0 . "\n";

Perl on Linux (v5.6.0) gives 1 Perl in cygwin(v5.6.1) gives 0 ActiveState (v5.6.0) gives 0

The Linux Perl came with RedHat 7.1

This seems to me to be a rather fundemental problem.

Kevin Twidle ----------------------------------------------------------------- --- Flags​: category=core severity=high --- Site configuration information for perl v5.6.1​:

Configured by Fifer at Thu Dec 21 21​:37​:10 2000.

Summary of my perl5 (revision 5.0 version 6 subversion 1) configuration​: Platform​: osname=cygwin\, osvers=1.1.6(0.3032)\, archname=cygwin uname='cygwin_nt-5.0 fifer 1.1.6(0.3032) 2000-11-21 21​:00 i686 unknown ' config_args='-de' hint=recommended\, useposix=true\, d_sigaction=define usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef useperlio=undef d_sfio=undef uselargefiles=define usesocks=undef use64bitint=undef use64bitall=undef uselongdouble=undef Compiler​: cc='gcc'\, ccflags ='-DPERL_USE_SAFE_PUTENV -DHAS_SBRK_PROTO -fno-strict-aliasing'\, optimize='-O2'\, cppflags='-DPERL_USE_SAFE_PUTENV -DHAS_SBRK_PROTO -fno-strict-aliasing' ccversion=''\, gccversion='2.95.2-5 19991024 (cygwin experimental)'\, gccosandvers='' intsize=4\, longsize=4\, ptrsize=4\, doublesize=8\, byteorder=1234 d_longlong=define\, longlongsize=8\, d_longdbl=define\, longdblsize=12 ivtype='long'\, ivsize=4\, nvtype='double'\, nvsize=8\, Off_t='off_t'\, lseeksize=4 alignbytes=8\, usemymalloc=y\, prototype=define Linker and Libraries​: ld='ld2'\, ldflags =' -s -L/usr/local/lib' libpth=/usr/local/lib /usr/lib /lib libs=-lgdbm -lcrypt perllibs=-lcrypt libc=/usr/lib/libc.a\, so=dll\, useshrplib=true\, libperl=libperl5_6_1.a Dynamic Linking​: dlsrc=dl_dlopen.xs\, dlext=dll\, d_dlsymun=undef\, ccdlflags=' -s' cccdlflags=' '\, lddlflags=' -s -L/usr/local/lib'

Locally applied patches​: v5.6.1-TRIAL1

--- @​INC for perl v5.6.1​: /usr/lib/perl5/5.6.1/cygwin /usr/lib/perl5/5.6.1 /usr/lib/perl5/site_perl/5.6.1/cygwin /usr/lib/perl5/site_perl/5.6.1 /usr/lib/perl5/site_perl .

--- Environment for perl v5.6.1​: HOME=/home/kevin LANG (unset) LANGUAGE (unset) LD_LIBRARY_PATH (unset) LOGDIR (unset) PATH=/usr/local/bin​:/usr/bin​:/bin​:/cygdrive/c/WINNT/system32​:/cygdrive/c/WIN NT​:/cygdrive/c/WINNT/System32/Wbem PERL_BADLANG (unset) SHELL=/bin/sh

p5pRT commented 22 years ago

From @jhi

Hmmm... I really\, really vaguely remember seeing this portability bug manifesting itself earlier.

The bad conversion (in this example) happens at Line 1822 in sv.c (5.6.1) (line 2331 in perl-current) in Perl_sv_2nv.

I seem to remember some discussion for the inclusion of a standard strtold (or similar) function into the core having to do with 64bitness stuff. That would also serve this bug.

Yup.

-R

p.s. Jarrko- What do you think about including the mod_perl .gdbinit

s/rrk/rkk/

file in the core? It's less than 5k and it makes debugging much easier.

Sure. How uptodate it is?

p5pRT commented 22 years ago

From @jhi

Ho-hum. It's even more fun than that. Linux/x86 perl-current​:

papaija​:/tmp/jhi/perl ; ./perl -wle 'print "0x1" + "0x0"' Argument "0x0" isn't numeric in addition (+) at -e line 1. Argument "0x1" isn't numeric in addition (+) at -e line 1. 1 papaija​:/tmp/jhi/perl ; ./perl -wle 'print "0x1" + "0x1"' Argument "0x1" isn't numeric in addition (+) at -e line 1. Argument "0x1" isn't numeric in addition (+) at -e line 1. 2

So far\, so good... now it gets interesting.

papaija​:/tmp/jhi/perl ; ./perl -wle 'print "0x1" + "0x2"' Argument "0x2" isn't numeric in addition (+) at -e line 1. Argument "0x1" isn't numeric in addition (+) at -e line 1. 2 papaija​:/tmp/jhi/perl ; ./perl -wle 'print "0x1" + "0x3"' Argument "0x3" isn't numeric in addition (+) at -e line 1. Argument "0x1" isn't numeric in addition (+) at -e line 1. 2.5 papaija​:/tmp/jhi/perl ; ./perl -wle 'print "0x1" + "0x4"' Argument "0x4" isn't numeric in addition (+) at -e line 1. Argument "0x1" isn't numeric in addition (+) at -e line 1. 2 papaija​:/tmp/jhi/perl ; ./perl -wle 'print "0x1" + "0x5"' Argument "0x5" isn't numeric in addition (+) at -e line 1. Argument "0x1" isn't numeric in addition (+) at -e line 1. 2.25 papaija​:/tmp/jhi/perl ; ./perl -wle 'print "0x1" + "0x6"' Argument "0x6" isn't numeric in addition (+) at -e line 1. Argument "0x1" isn't numeric in addition (+) at -e line 1. 2.5 papaija​:/tmp/jhi/perl ; ./perl -wle 'print "0x1" + "0x7"' Argument "0x7" isn't numeric in addition (+) at -e line 1. Argument "0x1" isn't numeric in addition (+) at -e line 1. 2.75 papaija​:/tmp/jhi/perl ; ./perl -wle 'print "0x1" + "0x8"' Argument "0x8" isn't numeric in addition (+) at -e line 1. Argument "0x1" isn't numeric in addition (+) at -e line 1. 2 papaija​:/tmp/jhi/perl ;

p5pRT commented 22 years ago

From [Unknown Contact. See original ticket]

GNU Libc's stdlib.h defines strtold (and strtod) thus​:

  extern __inline long double   strtold (__const char * __nptr\, char ** __endptr)   {   return __strtold_internal (__nptr\, __endptr\, 0);   }

For some reason\, strtold does not take a "base" parameter\, unlike\, say\, strtoul. It behaves as if a base of 0 was specified\, and converts "0x1" to 1 (see strtol(3) for details).

(The 0 argument to __strtold_internal above indicates whether or not to parse locale-dependent number groupings. It has nothing to do with the base.)

So far\, so good... now it gets interesting. [...]

papaija​:/tmp/jhi/perl ; ./perl -wle 'print "0x1" + "0x5"' Argument "0x5" isn't numeric in addition (+) at -e line 1. Argument "0x1" isn't numeric in addition (+) at -e line 1. 2.25

Similar interesting behaviour is displayed by any hexadecimal number in the string\, apparently due to a bug in __strtold_internal. A cursory glance at the glibc 2.1.2 source suggests that the bug is probably not trivial to fix.

Is anyone working on an strtold replacement for the core?

- ams

p5pRT commented 22 years ago

From @rspier

Looks decently up to date. It could bear some expansion\, but it's a great start.

I removed some mod_perl specific stuff\, but that wasn't much.

Here's a file-creation patch.

Inline Patch ```diff --- /dev/null Fri Mar 23 23:37:44 2001 +++ Porting/gdbmacros Wed May 30 11:26:27 2001 @@ -0,0 +1,190 @@ +# +# GDB debugging macros file +# (thanks to the mod_perl team for developing this.) +# +# either copy this file to .gdbinit in the directory you are running +# the gdb from, or at the gdb prompt: +# (gdb) source path-to-perl-source/Porting/gdbmacros +# +#some handy debugging macros, hopefully you'll never need them +#some don't quite work, like dump_hv and hv_fetch, +#where's the bloody manpage for .gdbinit syntax? + +define STpvx + print ((XPV*) (PL_stack_base [ax + ($arg0)] )->sv_any )->xpv_pv +end + +define TOPs + print ((XPV*) (**sp)->sv_any )->xpv_pv +end + +define curstash + print ((XPVHV*) (PL_curstash)->sv_any)->xhv_name +end + +define defstash + print ((XPVHV*) (PL_defstash)->sv_any)->xhv_name +end + +define curinfo + printf "%d:%s\n", PL_curcop->cop_line, \ + ((XPV*)(*(XPVGV*)PL_curcop->cop_filegv->sv_any)\ + ->xgv_gp->gp_sv->sv_any)->xpv_pv +end + +define SvPVX +print ((XPV*) ($arg0)->sv_any )->xpv_pv +end + +define SvCUR + print ((XPV*) ($arg0)->sv_any )->xpv_cur +end + +define SvLEN + print ((XPV*) ($arg0)->sv_any )->xpv_len +end + +define SvEND + print (((XPV*) ($arg0)->sv_any )->xpv_pv + ((XPV*)($arg0)->sv_any )->xpv_cur) - 1 +end + +define SvSTASH + print ((XPVHV*)((XPVMG*)($arg0)->sv_any )->xmg_stash)->sv_any->xhv_name +end + +define SvTAINTED + print ((($arg0)->sv_flags & (0x00002000 |0x00004000 |0x00008000 )) && Perl_sv_tainted ($arg0)) +end + +define SvTRUE + print ( !$arg0 ? 0 : (($arg0)->sv_flags & 0x00040000 ) ? ((PL_Xpv = (XPV*)($arg0)->sv_any ) && (*PL_Xpv ->xpv_pv > '0' || PL_Xpv ->xpv_cur > 1 || (PL_Xpv ->xpv_cur && *PL_Xpv ->xpv_pv != '0')) ? 1 : 0) : (($arg0)->sv_flags & 0x00010000 ) ? ((XPVIV*) ($arg0)->sv_any )->xiv_iv != 0 : (($arg0)->sv_flags & 0x00020000 ) ? ((XPVNV*)($arg0)->sv_any )->xnv_nv != 0.0 : Perl_sv_2bool ($arg0) ) +end + +define GvHV + set $hv = (((((XPVGV*)($arg0)->sv_any ) ->xgv_gp) )->gp_hv) +end + +define GvSV + print ((XPV*) ((((XPVGV*)($arg0)->sv_any ) ->xgv_gp) ->gp_sv )->sv_any )->xpv_pv +end + +define GvNAME + print (((XPVGV*)($arg0)->sv_any ) ->xgv_name) +end + +define GvFILEGV + print ((XPV*) ((((XPVGV*)$arg0->filegv)->xgv_gp)->gp_sv)->sv_any)->xpv_pv +end + +define CvNAME + print ((XPVGV*)(((XPVCV*)($arg0)->sv_any)->xcv_gv)->sv_any)->xgv_name +end + +define CvSTASH + print ((XPVHV*)(((XPVGV*)(((XPVCV*)($arg0)->sv_any)->xcv_gv)->sv_any)->xgv_stash)->sv_any)->xhv_name +end + +define CvDEPTH + print ((XPVCV*)($arg0)->sv_any )->xcv_depth +end + +define CvFILEGV + print ((XPV*) ((((XPVGV*)((XPVCV*)($arg0)->sv_any )->xcv_filegv)->xgv_gp)->gp_sv)->sv_any)->xpv_pv +end + +define SVOPpvx + print ((XPV*) ( ((SVOP*)$arg0)->op_sv)->sv_any )->xpv_pv +end + +define HvNAME + print ((XPVHV*)$arg0->sv_any)->xhv_name +end + +define HvKEYS + print ((XPVHV*) ($arg0)->sv_any)->xhv_keys +end + +define AvFILL + print ((XPVAV*) ($arg0)->sv_any)->xav_fill +end + +define dumpav + set $n = ((XPVAV*) ($arg0)->sv_any)->xav_fill + set $i = 0 + while $i <= $n + set $sv = *Perl_av_fetch($arg0, $i, 0) + printf "[%u] -> `%s'\n", $i, ((XPV*) ($sv)->sv_any )->xpv_pv + set $i = $i + 1 + end +end + +define dumphv + set $n = ((XPVHV*) ($arg0)->sv_any)->xhv_keys + set $i = 0 + set $key = 0 + set $klen = 0 + Perl_hv_iterinit($arg0) + while $i <= $n + set $sv = Perl_hv_iternextsv($arg0, &$key, &$klen) + printf "%s = `%s'\n", $key, ((XPV*) ($sv)->sv_any )->xpv_pv + set $i = $i + 1 + end +end + +define hvfetch + set $klen = strlen($arg1) + set $sv = *Perl_hv_fetch($arg0, $arg1, $klen, 0) + printf "%s = `%s'\n", $arg1, ((XPV*) ($sv)->sv_any )->xpv_pv +end + +define hvINCval + set $hv = (((((XPVGV*)(PL_incgv)->sv_any)->xgv_gp))->gp_hv) + set $klen = strlen($arg0) + set $sv = *Perl_hv_fetch($hv, $arg0, $klen, 0) + printf "%s = `%s'\n", $arg0, ((XPV*) ($sv)->sv_any )->xpv_pv +end + +define dumpany + set $sv = Perl_newSVpv("use Data::Dumper; Dumper \\",0) + set $void = Perl_sv_catpv($sv, $arg0) + set $dump = perl_eval_pv(((XPV*) ($sv)->sv_any )->xpv_pv, 1) + printf "%s = `%s'\n", $arg0, ((XPV*) ($dump)->sv_any )->xpv_pv +end + +define dumpanyrv + set $rv = Perl_newRV((SV*)$arg0) + set $rvpv = perl_get_sv("main::DumpAnyRv", 1) + set $void = Perl_sv_setsv($rvpv, $rv) + set $sv = perl_eval_pv("use Data::Dumper; Dumper $::DumpAnyRv",1) + printf "`%s'\n", ((XPV*) ($sv)->sv_any )->xpv_pv +end + +define svpeek + set $pv = Perl_sv_peek((SV*)$arg0) + printf "%s\n", $pv +end + +define caller + set $sv = perl_eval_pv("scalar caller", 1) + printf "caller = %s\n", ((XPV*) ($sv)->sv_any )->xpv_pv +end + +define cluck + set $sv = perl_eval_pv("Carp::cluck(); `tail '$Apache::ErrLog'`", 1) + printf "%s\n", ((XPV*) ($sv)->sv_any )->xpv_pv +end + +define longmess + set $sv = perl_eval_pv("Carp::longmess()", 1) + printf "%s\n", ((XPV*) ($sv)->sv_any )->xpv_pv +end + +define shortmess + set $sv = perl_eval_pv("Carp::shortmess()", 1) + printf "%s\n", ((XPV*) ($sv)->sv_any )->xpv_pv +end + +define perl_get_sv + set $sv = perl_get_sv($arg0, 0) + printf "%s\n", $sv ? ((XPV*) ((SV*)$sv)->sv_any)->xpv_pv : "undef" +end ```
p5pRT commented 22 years ago

From [Unknown Contact. See original ticket]

file in the core? It's less than 5k and it makes debugging much easier.

Sure. How uptodate it is?

Looks decently up to date. It could bear some expansion\, but it's a great start.

as i mentioned in my reply to simon's original message\, many of the macros do not work with -Dusethreads. i would really like to see something crafted with Devel​::DebugInit to generate as many macros as possible. and for the handcrafted ones\, replaces PL_foo with my_perl->Ifoo.

From dougm@​covalent.net Wed Apr 18 09​:32​:22 2001 -0700 Status​: X-Status​: X-Keywords​: Date​: Wed\, 18 Apr 2001 09​:32​:21 -0700 (PDT) From​: Doug MacEachern \dougm@&#8203;covalent\.net To​: Simon Cozens \simon@&#8203;netthink\.co\.uk cc​: perl5-porters@​perl.org Subject​: Re​: The mod_perl .gdbinit In-Reply-To​: \20010418093850\.A8131@&#8203;netthink\.co\.uk Message-ID​: \Pine\.LNX\.4\.21\.0104180927220\.17271\-100000@&#8203;mako\.covalent\.net MIME-Version​: 1.0 Content-Type​: TEXT/PLAIN; charset=US-ASCII

On Wed\, 18 Apr 2001\, Simon Cozens wrote​:

I've just had a look at these\, and they're *SOOOO* useful. Is there any chance we can steal some of them for the core?

most of them were originally generated by Devel​::DebugInit\, some crafted by hand (e.g. curinfo\, my favorite). i think it would be worth revisiting Devel​::DebugInit to get all of them generated\, including the handcrafted ones to deal with PL_* so they are expanded properly for ithread Perls and other flavors.

p5pRT commented 22 years ago

From [Unknown Contact. See original ticket]

String Binary Binary fraction Decimal fraction 0x1 1 1.0 1.0 0x2 10 1.0 1.0 0x3 11 1.1 1.5 0x4 100 1.00 1.0 0x5 101 1.01 1.25 0x6 110 1.10 1.5 0x7 111 1.11 1.75 0x8 100 1.00 1.0 0x9 1001 1.001 1.125 0xA 1010 1.010 1.25 0xB 1011 1.011 1.375

Perl is taking the second number as a binary fraction with a fixed decimal point. I assume that it is reading the first number in a similar way.

I cannot test it because trying perl -wle 'print "0x4" + "0x5"'

Linux (5.6.0) gives Argument "0x4" isn't numeric in addition (+) at -e line 1. Argument "0x5" isn't numeric in addition (+) at -e line 1. 9

Cygwin(5.6.1) gives Argument "0x4" isn't numeric in addition (+) at -e line 1. Argument "0x5" isn't numeric in addition (+) at -e line 1. 0

Kevin

----- Original Message ----- From​: "Jarkko Hietaniemi" \jhi@&#8203;iki\.fi To​: "Robert Spier" \rspier@&#8203;pobox\.com Cc​: "Kevin Twidle" \kevin@&#8203;twicom\.com; \perl5\-porters@&#8203;perl\.org; \perlbug@&#8203;rfi\.net Sent​: Tuesday\, May 29\, 2001 8​:45 PM Subject​: Re​: [ID 20010529.006] String plus zero inconsistent across platforms

Linux 5.00503 5.6.0 5.6.1 perl-current all print 1.

Ho-hum. It's even more fun than that. Linux/x86 perl-current​:

papaija​:/tmp/jhi/perl ; ./perl -wle 'print "0x1" + "0x0"' Argument "0x0" isn't numeric in addition (+) at -e line 1. Argument "0x1" isn't numeric in addition (+) at -e line 1. 1 papaija​:/tmp/jhi/perl ; ./perl -wle 'print "0x1" + "0x1"' Argument "0x1" isn't numeric in addition (+) at -e line 1. Argument "0x1" isn't numeric in addition (+) at -e line 1. 2

So far\, so good... now it gets interesting.

papaija​:/tmp/jhi/perl ; ./perl -wle 'print "0x1" + "0x2"' Argument "0x2" isn't numeric in addition (+) at -e line 1. Argument "0x1" isn't numeric in addition (+) at -e line 1. 2 papaija​:/tmp/jhi/perl ; ./perl -wle 'print "0x1" + "0x3"' Argument "0x3" isn't numeric in addition (+) at -e line 1. Argument "0x1" isn't numeric in addition (+) at -e line 1. 2.5 papaija​:/tmp/jhi/perl ; ./perl -wle 'print "0x1" + "0x4"' Argument "0x4" isn't numeric in addition (+) at -e line 1. Argument "0x1" isn't numeric in addition (+) at -e line 1. 2 papaija​:/tmp/jhi/perl ; ./perl -wle 'print "0x1" + "0x5"' Argument "0x5" isn't numeric in addition (+) at -e line 1. Argument "0x1" isn't numeric in addition (+) at -e line 1. 2.25 papaija​:/tmp/jhi/perl ; ./perl -wle 'print "0x1" + "0x6"' Argument "0x6" isn't numeric in addition (+) at -e line 1. Argument "0x1" isn't numeric in addition (+) at -e line 1. 2.5 papaija​:/tmp/jhi/perl ; ./perl -wle 'print "0x1" + "0x7"' Argument "0x7" isn't numeric in addition (+) at -e line 1. Argument "0x1" isn't numeric in addition (+) at -e line 1. 2.75 papaija​:/tmp/jhi/perl ; ./perl -wle 'print "0x1" + "0x8"' Argument "0x8" isn't numeric in addition (+) at -e line 1. Argument "0x1" isn't numeric in addition (+) at -e line 1. 2 papaija​:/tmp/jhi/perl ;

-- $jhi++; # http​://www.iki.fi/jhi/ # There is this special biologist word we use for 'stable'. # It is 'dead'. -- Jack Cohen

p5pRT commented 22 years ago

From @jhi

==== //depot/perl/sv.c#402 - /u/vieraat/vieraat/jhi/pp4/perl/sv.c ==== Index​: perl/sv.c

Inline Patch ```diff --- perl/sv.c.~1~ Thu May 31 05:04:49 2001 +++ perl/sv.c Thu May 31 05:04:49 2001 @@ -1871,6 +1871,8 @@ #endif { NV d; + char *s; + bool firstzero; #ifdef HAS_STRTOL /* Hopefully trace flow will optimise this away where possible */ @@ -1878,7 +1880,15 @@ #endif /* It wasn't an integer, or it overflowed, or we don't have strtol. Do things the slow way - check if it's a UV etc. */ - d = Atof(SvPVX(sv)); + s = SvPVX(sv); + firstzero = s[0] == '0'; + d = + (firstzero && toLOWER(s[1]) == 'x') ? + (NV)Strtoul(s + 2, 0, 16) : + (firstzero && toLOWER(s[1]) == 'b' ? + (NV)Strtoul(s + 2, 0, 2) : + (firstzero && s[1] >= '0' && s[1] <= '7' ? + (NV)Strtoul(s + 1, 0, 8) : Atof(s))); if (SvTYPE(sv) < SVt_PVNV) sv_upgrade(sv, SVt_PVNV); @@ -2326,7 +2336,9 @@ #endif } else if (SvPOKp(sv) && SvLEN(sv)) { - if (ckWARN(WARN_NUMERIC) && !SvIOKp(sv) && !looks_like_number(sv)) + int numtype = looks_like_number(sv); + + if (ckWARN(WARN_NUMERIC) && !SvIOKp(sv) && !numtype) not_a_number(sv); SvNVX(sv) = Atof(SvPVX(sv)); #ifdef NV_PRESERVES_UV @@ -2348,7 +2360,6 @@ } else { /* Is it something we can run through strtol etc (ie no trailing exponent part)? */ - int numtype = looks_like_number(sv); /* XXX probably should cache this if called above */ if (!(numtype & End of Patch. ```
p5pRT commented 22 years ago

From @vanstyn

I don't think we do​: for a long time we have taught people that "37xa" numifies to 37\, and I think it is unreasonable and unhelpful now to change "0xa" to numify to something other than 0. I think your example should numify to 0 + 10 + 0 = 10.

Hugo

p5pRT commented 22 years ago

From @jhi

Well\, in that case we must go in the other direction and stop some Atof()s from numifying "0xa" to 10.

Hugo

p5pRT commented 22 years ago

From @vanstyn

Attached is a start\, which passes all but pragma/locale.t here. I think it needs the following changes​: - overflow/underflow checking - locale support - ? unicode support - inf/nan support - whitespace tolerance (not sure what is permitted) - (optional) rewrite 'apply the exponent' to be faster - add tests​: it is surprising that only locale of the above missing features   caused any test failures

I'd suggest also that Perl_atof2 be redefined - it is confusing having a macro that converts a variable reference to its address without it being named so as obviously to be a macro. Since the only references to it are in the Perl_my_atof function\, I think this should be safe.

This patch requires a 'make regen_headers'.

Hugo

Inline Patch ```diff --- embed.pl.old Sun May 27 00:32:32 2001 +++ embed.pl Thu May 31 15:28:58 2001 @@ -2592,3 +2592,4 @@ Apd |STRLEN |sv_utf8_upgrade_flags|SV *sv|I32 flags Apd |char* |sv_pvn_force_flags|SV* sv|STRLEN* lp|I32 flags Ap |char* |sv_2pv_flags |SV* sv|STRLEN* lp|I32 flags +Ap |char* |my_atof2 |const char *s|NV* value --- perl.h.old Thu May 31 02:17:32 2001 +++ perl.h Thu May 31 15:34:23 2001 @@ -1311,24 +1311,8 @@ # endif #endif -#if !defined(Perl_atof) && defined(USE_LONG_DOUBLE) && defined(HAS_LONG_DOUBLE) -# if !defined(Perl_atof) && defined(HAS_STRTOLD) -# define Perl_atof(s) (NV)strtold(s, (char**)NULL) -# endif -# if !defined(Perl_atof) && defined(HAS_ATOLF) -# define Perl_atof (NV)atolf -# endif -# if !defined(Perl_atof) && defined(PERL_SCNfldbl) -# define Perl_atof PERL_SCNfldbl -# define Perl_atof2(s,f) sscanf((s), "%"PERL_SCNfldbl, &(f)) -# endif -#endif -#if !defined(Perl_atof) -# define Perl_atof atof /* we assume atof being available anywhere */ -#endif -#if !defined(Perl_atof2) -# define Perl_atof2(s,f) ((f) = (NV)Perl_atof(s)) -#endif +#define Perl_atof(s) Perl_my_atof(s) +#define Perl_atof2(s, np) Perl_my_atof2(s, &(np)) /* Previously these definitions used hardcoded figures. * It is hoped these formula are more portable, although --- util.c.old Thu May 31 13:43:03 2001 +++ util.c Thu May 31 15:50:57 2001 @@ -4033,6 +4033,63 @@ return x; } +char* +Perl_my_atof2(pTHX_ const char* orig, NV* value) +{ + NV result = 0.0; + bool negative = 0; + char* s = (char*)orig; + + switch (*s) { + case '-': + negative = 1; + /* fall through */ + case '+': + ++s; + } + while (isDIGIT(*s)) + result = result * 10 + (*s++ - '0'); + if (*s == '.') { + NV decimal = 1.0; + ++s; + while (isDIGIT(*s)) { + decimal /= 10; + result += (*s++ - '0') * decimal; + } + } + if (*s == 'e' || *s == 'E') { + I32 exponent = 0; + I32 expnegative = 0; + I32 bit; + NV power; + + ++s; + switch (*s) { + case '-': + expnegative = 1; + /* fall through */ + case '+': + ++s; + } + while (isDIGIT(*s)) + exponent = exponent * 10 + (*s++ - '0'); + + /* now apply the exponent */ + power = (expnegative) ? 0.1 : 10.0; + for (bit = 1; exponent; bit <<= 1) { + if (exponent & bit) { + exponent ^= bit; + result *= power; + } + power *= power; + } + } + if (negative) + result = -result; + *value = result; + return s; +} + void Perl_report_evil_fh(pTHX_ GV *gv, IO *io, I32 op) { ```
p5pRT commented 22 years ago

From @simoncozens

Do we actually want this? It'd slow numeric conversion down horribly\, and numeric conversion's pretty hot.

p5pRT commented 22 years ago

From @vanstyn

I assume we need a mechanism capable of parsing a number out of a unicode string\, but I don't think we need to expand the normal concept of what a digit is\, so it shouldn't be too bad.

Hugo

p5pRT commented 22 years ago

From @rspier

I agree with Hugo\, but the portability problem remains.

Might something that does something similar to​:

  sub Atof {   my $_ = shift;   s/^\s*(\d+(?​:\.\d+)).*$/$1/;   # $_ is now _only_ digits.   return strtod($_);   }

be a happy solution?

-R

p5pRT commented 22 years ago

From @simoncozens

I'm reasonably sure the code in your patch was Unicode-happy. I'll have to check\, though.

p5pRT commented 22 years ago

From @nwc10

I like that idea better.

Attached is a start\, which passes all but pragma/locale.t here. I think it needs the following changes​: - overflow/underflow checking - locale support - ? unicode support - inf/nan support - whitespace tolerance (not sure what is permitted) - (optional) rewrite 'apply the exponent' to be faster - add tests​: it is surprising that only locale of the above missing features caused any test failures

There wasn't a test for 1 + 1 == 2 until recently. The first test of converting a string containing a negative floating point value to a number isn't until lib/storable-something.

+char* +Perl_my_atof2(pTHX_ const char* orig\, NV* value) +{ + NV result = 0.0; + bool negative = 0; + char* s = (char*)orig; + + switch (*s) { + case '-'​: + negative = 1; + /* fall through */ + case '+'​: + ++s; + } + while (isDIGIT(*s)) + result = result * 10 + (*s++ - '0'); + if (*s == '.') { + NV decimal = 1.0; + ++s; + while (isDIGIT(*s)) { + decimal /= 10; + result += (*s++ - '0') * decimal; + } + } + if (*s == 'e' || *s == 'E') { + I32 exponent = 0; + I32 expnegative = 0; + I32 bit; + NV power; + + ++s; + switch (*s) { + case '-'​: + expnegative = 1; + /* fall through */ + case '+'​: + ++s; + } + while (isDIGIT(*s)) + exponent = exponent * 10 + (*s++ - '0'); + + /* now apply the exponent */ + power = (expnegative) ? 0.1 : 10.0; + for (bit = 1; exponent; bit \<\<= 1) { + if (exponent & bit) { + exponent ^= bit; + result *= power; + } + power *= power; + } + } + if (negative) + result = -result; + *value = result; + return s; +}

Do we really want to do this? IIRC Nick I-S recently posted the descriptions of 2 papers here\, about how to do floating point conversion (to and from) accurately. I suspect that there are several subtle pitfalls. (Yes. This is arm waving. I've not read them. Please shut me up if I'm wrong)

Nicholas Clark

p5pRT commented 22 years ago

From @vanstyn

Attached is a continuation\, which passes all current tests here.

:I think it needs the following changes​: :- overflow/underflow checking

Not done; I don't know what is needed here.

:- locale support

Done\, I think - we have support for locale 'decimal_point'\, but not for 'thousands_sep'. I suspect we should also be allowing some underscores in there\, but we don't at the moment.

:- ? unicode support

I think this is only a problem if a) the locale-specific decimal point is not utf8 invariant and b) the string isn't using the same encoding as the decimal point string. I only know of "." and "\," dp strings; what others are there?

:- inf/nan support :- whitespace tolerance (not sure what is permitted)

Not done\, I don't know what is needed here.

:- (optional) rewrite 'apply the exponent' to be faster

The current algorithm is acceptable but not fast​: we do one fp multiply per bit set in the absolute value of the exponent\, plus one per bit (set or clear) up to the most significant set bit; for 6.02E23 we'll do 5 fp and two integer multiplies while parsing the digits\, and 9 more fp multiplies while applying the exponent.[0] We could perhaps gain some speed by having a table of 10**(2**n) for all relevant n; I also don't know how much accuracy we may be losing through the repeated mutiplies in both the decimal places and the exponent.

:- add tests​: it is surprising that only locale of the above missing features : caused any test failures

I think we need tests for bad numbers like "+23E" and "+E23" for example.

:I'd suggest also that Perl_atof2 be redefined - it is confusing having :a macro that converts a variable reference to its address without it :being named so as obviously to be a macro. Since the only references :to it are in the Perl_my_atof function\, I think this should be safe.

Done this.

:This patch requires a 'make regen_headers'.

This patch replaces the last one\, and still requires a 'make regen_headers'.

Hugo [0] My Maths dictionary doesn't have Avogadro's number; my English dictionary has an entry for it that _doesn't say what its value is_. :(

Inline Patch ```diff --- embed.pl.old Sun May 27 00:32:32 2001 +++ embed.pl Thu May 31 15:28:58 2001 @@ -2592,3 +2592,4 @@ Apd |STRLEN |sv_utf8_upgrade_flags|SV *sv|I32 flags Apd |char* |sv_pvn_force_flags|SV* sv|STRLEN* lp|I32 flags Ap |char* |sv_2pv_flags |SV* sv|STRLEN* lp|I32 flags +Ap |char* |my_atof2 |const char *s|NV* value --- perl.h.old Thu May 31 02:17:32 2001 +++ perl.h Thu May 31 20:23:23 2001 @@ -1311,24 +1311,8 @@ # endif #endif -#if !defined(Perl_atof) && defined(USE_LONG_DOUBLE) && defined(HAS_LONG_DOUBLE) -# if !defined(Perl_atof) && defined(HAS_STRTOLD) -# define Perl_atof(s) (NV)strtold(s, (char**)NULL) -# endif -# if !defined(Perl_atof) && defined(HAS_ATOLF) -# define Perl_atof (NV)atolf -# endif -# if !defined(Perl_atof) && defined(PERL_SCNfldbl) -# define Perl_atof PERL_SCNfldbl -# define Perl_atof2(s,f) sscanf((s), "%"PERL_SCNfldbl, &(f)) -# endif -#endif -#if !defined(Perl_atof) -# define Perl_atof atof /* we assume atof being available anywhere */ -#endif -#if !defined(Perl_atof2) -# define Perl_atof2(s,f) ((f) = (NV)Perl_atof(s)) -#endif +#define Perl_atof(s) Perl_my_atof(s) +#define Perl_atof2(s, np) Perl_my_atof2(s, np) /* Previously these definitions used hardcoded figures. * It is hoped these formula are more portable, although --- util.c.old Thu May 31 13:43:03 2001 +++ util.c Thu May 31 20:43:29 2001 @@ -4018,19 +4018,86 @@ if ((PL_hints & HINT_LOCALE) && PL_numeric_local) { NV y; - Perl_atof2(s, x); + Perl_atof2(s, &x); SET_NUMERIC_STANDARD(); - Perl_atof2(s, y); + Perl_atof2(s, &y); SET_NUMERIC_LOCAL(); if ((y < 0.0 && y < x) || (y > 0.0 && y > x)) return y; } else - Perl_atof2(s, x); + Perl_atof2(s, &x); #else - Perl_atof2(s, x); + Perl_atof2(s, &x); #endif return x; +} + +char* +Perl_my_atof2(pTHX_ const char* orig, NV* value) +{ + NV result = 0.0; + bool negative = 0; + char* s = (char*)orig; + char* point = "."; /* locale-dependent decimal point equivalent */ + STRLEN pointlen = 1; + bool seendigit = 0; + + if (PL_numeric_radix_sv) + point = SvPV(PL_numeric_radix_sv, pointlen); + + switch (*s) { + case '-': + negative = 1; + /* fall through */ + case '+': + ++s; + } + while (isDIGIT(*s)) { + result = result * 10 + (*s++ - '0'); + seendigit = 1; + } + if (memEQ(s, point, pointlen)) { + NV decimal = 0.1; + + s += pointlen; + while (isDIGIT(*s)) { + result += (*s++ - '0') * decimal; + decimal *= 0.1; + seendigit = 1; + } + } + if (seendigit && *s == 'e' || *s == 'E') { + I32 exponent = 0; + I32 expnegative = 0; + I32 bit; + NV power; + + ++s; + switch (*s) { + case '-': + expnegative = 1; + /* fall through */ + case '+': + ++s; + } + while (isDIGIT(*s)) + exponent = exponent * 10 + (*s++ - '0'); + + /* now apply the exponent */ + power = (expnegative) ? 0.1 : 10.0; + for (bit = 1; exponent; bit <<= 1) { + if (exponent & bit) { + exponent ^= bit; + result *= power; + } + power *= power; + } + } + if (negative) + result = -result; + *value = result; + return s; } void ```
p5pRT commented 22 years ago

From @vanstyn

I think so\, but it's always possible that we don't want to do it in exactly this way. An alternative approach is to apply only the 'valid number' logic\, copying into a new string along the way\, and then calling atof() on the new string. However I think we have a chance of making our own version right\, and I know we have no chance of that with every operating system's own versions.

:IIRC Nick I-S recently posted the descriptions of 2 papers here\, about how :to do floating point conversion (to and from) accurately. I suspect that :there are several subtle pitfalls. (Yes. This is arm waving. I've not read :them. Please shut me up if I'm wrong)

There are several obvious pitfalls\, which I'm hoping someone else will sort out - checking for over and underflow\, and loss of accuracy due to repeated multiplications and additions.

To be honest\, I think we should ignore underflow - if the user gets zero\, that's accurate enough by my lights; overflow we can check for easily enough\, at the cost of a bit of cruft. Accuracy will be the bugbear\, and I think we can get most of the way there with a couple of precalculated tables. I'm not sure how we ensure​:   print fp_really_equal(1e-300\, "." . ("0" x 299) . "1") ? "ok" : "not ok"; .. unless we have 300 entries in the precalculated table.

I guess there's also some potential for accumulating error because we add a separate component to the total for each decimal place\, but I'd expect the error there to be within tolerance; if not\, we could look to combine the digit string into a single number\, then combine the decimals-offset with the exponent to multiply/divide in one go. Having access to big enough integers for the mantissa (ignoring the decimal point) would help a lot with that. I think fp-to-string is much nastier\, or I'd have removed the remaining C-library sprintf call a long time ago.

I hope that the patch will be applied\, and that we'll then get some people who really need this stuff coming up with test cases for us to aim for.

Hugo

p5pRT commented 22 years ago

From @jhi

Yup. It's one Unicode character but two bytes.

Unicode book). There might be others\, however.

See my other message about other possible digits\, however (who knows whether tb_CN (or whatever) locale might not define Tibetan digits as being isdigit()?).

Cheers\, Philip -- Philip Newton \Philip\.Newton@&#8203;gmx\.net

p5pRT commented 22 years ago

From @jhi

Do you mean

  if (seendigit && (*s == 'e' || *s == 'E')) {

If so\, the -Wall is good for something :-)

p5pRT commented 22 years ago

From @jhi

Assuming that the separator is only one byte is wrong. See util.c​:Perl_set_numeric_radix().

p5pRT commented 22 years ago

From @jhi

The patch applied now\, however.

p5pRT commented 22 years ago

From @vanstyn

Yes.

:If so\, the -Wall is good for something :-)

Yes.

Hugo

p5pRT commented 22 years ago

From @vanstyn

I don't\, do I?

:See util.c​:Perl_set_numeric_radix().

I did​:

  point = SvPV(PL_numeric_radix_sv\, pointlen); [...]   if (memEQ(s\, point\, pointlen)) {

Hugo

p5pRT commented 22 years ago

From @jhi

That's good\, then.

Hugo

p5pRT commented 22 years ago

From @jhi

Hmmm. This patch has a strange effect​: the posix.t #14 (strtod) starts failing in tru64/alpha/cc\, tru64/alpha/gcc\, and solaris/sparc/gcc-- but not in linux/x86/gcc (this effectively means\, at least in the platforms I routinely use\, that the ones that had Atof() that returned 3 for "0x3" started failing\, and the one platform that was wrong (Linux\, is working correctly for the strtod test...) The failure is bizarre in that if I use Devel​::Peek​::Dump() on the $n in the test\, it correctly is NV of 3.14159 (and $x is 6). So is it the == that got broken...?

p5pRT commented 22 years ago

From @vanstyn

I think you need to dump not $n but the literal; try something like​:   $pi = 3.14159;   unless ($n == $pi && $x == 6) {   Dump($n); Dump($pi);   print "not ";   }

Hugo

p5pRT commented 22 years ago

From @jhi

Inline Patch ```diff --- t/lib/posix.t Sat Jun 2 20:11:44 2001 +++ p.t Sat Jun 2 22:29:40 2001 @@ -79,6 +79,9 @@ if ($Config{d_strtod}) { $lc = &POSIX::setlocale(&POSIX::LC_NUMERIC, 'C') if $Config{d_setlocale}; ($n, $x) = &POSIX::strtod('3.14159_OR_SO'); + use Devel::Peek; + $pi = 3.14159; + Dump($pi); Dump($n); Dump($x); # Using long double NVs may introduce greater accuracy than wanted. $n =~ s/^3.14158999\d*$/3.14159/ if $Config{uselongdouble} eq 'define'; ```

ok 13 SV = NV(0x127fb8) at 0x1c0d54   REFCNT = 1   FLAGS = (NOK\,pNOK)   NV = 3.14159 SV = NV(0x127fc8) at 0x1c0d00   REFCNT = 1   FLAGS = (NOK\,pNOK)   NV = 3.14159 SV = IV(0x18c71c) at 0x1c0d18   REFCNT = 1   FLAGS = (IOK\,pIOK)   IV = 6 not ok 14

This on the solaris/sparc/gcc.

akvavitix​:/tmp/jhi/perl ; ./perl -Ilib -MPOSIX=strtod -MDevel​::Peek -wle '$n=strtod("3.14159");Dump($n);print "ok" if $n == 3.14159;print $n - 3.14159' SV = NV(0x124728) at 0x10b018   REFCNT = 1   FLAGS = (NOK\,pNOK)   NV = 3.14159 -4.44089209850063e-16 akvavitix​:/tmp/jhi/perl ;

With 5.6.0 (don't have 5.6.1 compiled)​:

akvavitix​:/tmp/jhi/perl ; /p/bin/perl -MPOSIX=strtod -MDevel​::Peek -wle '$n=strtod("3.14159");Dump($n);print "ok" if $n == 3.14159;print $n - 3.14159' SV = NV(0x10bc58) at 0x150fc8   REFCNT = 1   FLAGS = (NOK\,pNOK)   NV = 3.14159 ok 0 akvavitix​:/tmp/jhi/perl ;

p5pRT commented 22 years ago

From @jhi

Also an interesting question is why POSIX​::strtod()?

p5pRT commented 22 years ago

From @vanstyn

Presumably\, because that is what's doing the same thing more accurately.

Hugo

p5pRT commented 22 years ago

From @vanstyn

Could you show\, say\, a %.32f of $n and of 3.14159 please?

Hugo

p5pRT commented 22 years ago

From @vanstyn

For example\, this is what I get here​: crypt% ./perl -wle 'printf "%.32f\n"\, 3.14159+$_ for 0\, 2e-16\, 3e-16;use POSIX;printf "%.32f\n"\, POSIX​::strtod("3.14159")+$_ for 0\, 2e-16\, 3e-16' 3.14158999999999988261834005243145 3.14158999999999988261834005243145 3.14159000000000032670754990249407 3.14158999999999988261834005243145 3.14158999999999988261834005243145 3.14159000000000032670754990249407 crypt%

Hugo

p5pRT commented 22 years ago

From @vanstyn

Ok\, I've now built with -Duselongdouble\, and I have this​: crypt% ./perl -Ilib -wle 'use POSIX; printf "%.32f\n"\,$_ for 3.14159\, scalar POSIX​::strtod("3.14159")' 3.14159000000000001033735597122387 3.14158999999999988261834005243145 crypt%

.. which suggests to me that the tests are failing because the new routine is way _more_ accurate than the C library strtod. That's not unreasonable\, because strtod is not expected to work to long double accuracy\, but it implies that (some at least of) the tests are wrong. I'll start trying to fix them\, and see what else is left. I think I will need to improve the decimal handling\, because being only 10 times more accurate than strtod probably isn't good enough.

Hugo

p5pRT commented 22 years ago

From @jhi

akvavitix​:/tmp/jhi/perl ; ./perl -Ilib -MPOSIX=strtod -MDevel​::Peek -wle '$n=strtod("3.14159");Dump($n);printf "%.32f\n"\, $n;printf "%.32f\n"\, 3.14159' SV = NV(0x124720) at 0x10b018   REFCNT = 1   FLAGS = (NOK\,pNOK)   NV = 3.14159 3.14158999999999988261834005243145 3.14159000000000032670754990249407 akvavitix​:/tmp/jhi/perl ; ./perl -Ilib -wle 'printf "%.32f\n"\, 3.14159+$_ for 0\, 2e-16\, 3e-16;use POSIX;printf "%.32f\n"\, POSIX​::strtod("3.14159")+$_ for 0\, 2e-16\, 3e-16' 3.14159000000000032670754990249407 3.14159000000000032670754990249407 3.14159000000000077079675975255668 3.14158999999999988261834005243145 3.14158999999999988261834005243145 3.14159000000000032670754990249407 akvavitix​:/tmp/jhi/perl ; ./myconfig Summary of my perl5 (revision 5.0 version 7 subversion 1 patchlevel 10380) configuration​:   Platform​:   osname=solaris\, osvers=2.7\, archname=sun4-solaris   uname='sunos akvavitix.hut.fi 5.7 generic_106541-05 sun4u sparc '   config_args='-des -Dusedevel -Dcc=gcc'   hint=recommended\, useposix=true\, d_sigaction=define   usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef   useperlio=define d_sfio=undef uselargefiles=define usesocks=undef   use64bitint=undef use64bitall=undef uselongdouble=undef   Compiler​:   cc='gcc'\, ccflags ='-fno-strict-aliasing -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64'\,   optimize='-O'\,   cppflags='-Wall -fno-strict-aliasing'   ccversion=''\, gccversion='2.95.2 19991024 (release)'\, gccosandvers='solaris2.7'   intsize=4\, longsize=4\, ptrsize=4\, doublesize=8\, byteorder=4321   d_longlong=define\, longlongsize=8\, d_longdbl=define\, longdblsize=16   ivtype='long'\, ivsize=4\, nvtype='double'\, nvsize=8\, Off_t='off_t'\, lseeksize=8   alignbytes=8\, usemymalloc=n\, prototype=define   Linker and Libraries​:   ld='gcc'\, ldflags =' '   libpth=/usr/lib /usr/ccs/lib   libs=-lsocket -lnsl -ldl -lm -lc   perllibs=-lsocket -lnsl -ldl -lm -lc   libc=/lib/libc.so\, so=so\, useshrplib=false\, libperl=libperl.a   Dynamic Linking​:   dlsrc=dl_dlopen.xs\, dlext=so\, d_dlsymun=undef\, ccdlflags=' '   cccdlflags='-fPIC'\, lddlflags='-G'

akvavitix​:/tmp/jhi/perl ;

Hugo

p5pRT commented 22 years ago

From @vanstyn

Not sure which config that is\, but I'm a little surprised that's so far out. I'll work on trying to improve atof2()\, but I suspect this is going to prove as troublesome a test candidate as any of the _ld platforms.

Hugo

p5pRT commented 22 years ago

From @vanstyn

Third attempt; this passes all tests here with i64 + ld\, and I'm working on testing the other combinations.

I believe gcc gives me a 64-bit mantisssa for long doubles; I'd be particularly interested in test results from anyone that has greater precision available to them.

Hugo

Inline Patch ```diff --- util.c.old Thu May 31 20:43:29 2001 +++ util.c.new Sun Jun 3 14:49:25 2001 @@ -4033,6 +4033,35 @@ return x; } +NV +S_mulexp10(NV value, I32 exponent) +{ + NV result = value; + NV power = 10.0; + I32 bit; + + if (exponent > 0) { + for (bit = 1; exponent; bit <<= 1) { + if (exponent & bit) { + exponent ^= bit; + result *= power; + } + power *= power; + } + } + else if (exponent < 0) { + exponent = -exponent; + for (bit = 1; exponent; bit <<= 1) { + if (exponent & bit) { + exponent ^= bit; + result /= power; + } + power *= power; + } + } + return result; +} + char* Perl_my_atof2(pTHX_ const char* orig, NV* value) { @@ -4042,10 +4071,26 @@ char* point = "."; /* locale-dependent decimal point equivalent */ STRLEN pointlen = 1; bool seendigit = 0; + I32 expextra = 0; + I32 exponent = 0; + I32 i; +/* this is arbitrary */ +#define PARTLIM 6 +/* we want the largest integers we can usefully use */ +#if defined(HAS_QUAD) && defined(USE_64_BIT_INT) +# define PARTSIZE ((int)TYPE_DIGITS(U64)-1) + U64 part[PARTLIM]; +#else +# define PARTSIZE ((int)TYPE_DIGITS(U32)-1) + U32 part[PARTLIM]; +#endif + I32 ipart = 0; /* index into part[] */ + I32 offcount; /* number of digits in least significant part */ if (PL_numeric_radix_sv) point = SvPV(PL_numeric_radix_sv, pointlen); + /* sign */ switch (*s) { case '-': negative = 1; @@ -4053,23 +4098,77 @@ case '+': ++s; } + + part[0] = offcount = 0; + if (isDIGIT(*s)) { + seendigit = 1; /* get this over with */ + + /* skip leading zeros */ + while (*s == '0') + ++s; + } + + /* integer digits */ while (isDIGIT(*s)) { - result = result * 10 + (*s++ - '0'); - seendigit = 1; + if (++offcount > PARTSIZE) { + if (++ipart < PARTLIM) { + part[ipart] = 0; + offcount = 1; /* ++0 */ + } + else { + /* limits of precision reached */ + --ipart; + --offcount; + if (*s >= '5') + ++part[ipart]; + while (isDIGIT(*s)) { + ++expextra; + ++s; + } + /* warn of loss of precision? */ + break; + } + } + part[ipart] = part[ipart] * 10 + (*s++ - '0'); } - if (memEQ(s, point, pointlen)) { - NV decimal = 0.1; + /* decimal point */ + if (memEQ(s, point, pointlen)) { s += pointlen; + if (isDIGIT(*s)) + seendigit = 1; /* get this over with */ + + /* decimal digits */ while (isDIGIT(*s)) { - result += (*s++ - '0') * decimal; - decimal *= 0.1; - seendigit = 1; + if (++offcount > PARTSIZE) { + if (++ipart < PARTLIM) { + part[ipart] = 0; + offcount = 1; /* ++0 */ + } + else { + /* limits of precision reached */ + --ipart; + --offcount; + if (*s >= '5') + ++part[ipart]; + while (isDIGIT(*s)) + ++s; + /* warn of loss of precision? */ + break; + } + } + --expextra; + part[ipart] = part[ipart] * 10 + (*s++ - '0'); } } + + /* combine components of mantissa */ + for (i = 0; i <= ipart; ++i) + result += S_mulexp10((NV)part[ipart - i], + i ? offcount + (i - 1) * PARTSIZE : 0); + if (seendigit && *s == 'e' || *s == 'E') { - I32 exponent = 0; - I32 expnegative = 0; + bool expnegative = 0; I32 bit; NV power; @@ -4083,17 +4182,15 @@ } while (isDIGIT(*s)) exponent = exponent * 10 + (*s++ - '0'); - - /* now apply the exponent */ - power = (expnegative) ? 0.1 : 10.0; - for (bit = 1; exponent; bit <<= 1) { - if (exponent & bit) { - exponent ^= bit; - result *= power; - } - power *= power; - } + if (expnegative) + exponent = -exponent; } + + /* now apply the exponent */ + exponent += expextra; + result = S_mulexp10(result, exponent); + + /* now apply the sign */ if (negative) result = -result; *value = result; ```
p5pRT commented 22 years ago

From @vanstyn

i32 + ld (which I've never tried before) gives me one test failure\, which I don't think is my fault​: crypt% ( cd t ; op/int.t ) 1..14 ok 1 ok 2 ok 3 ok 4 ok 5 ok 6 ok 7 not ok 8 # int(4294967303.15) is 4294967302\, not 4294967303 not ok 9 # int(4294967303.15) is -4294967302\, not -4294967303 ok 10 ok 11 ok 12 ok 13 ok 14 crypt%

Nick\, could this be related to your numeric conversion work?

Hugo

p5pRT commented 22 years ago

From @nwc10

Not directly. It looks like ID 20010118.017\, which Abigail found was failing on released 5.7.0. [Jarkko added a patch of mine to beef up the tests in op/int.t sometime after 5.7.0 which appears to have turned up a bug in some x86 linux systems.]

Quote Abigail from \20010119203420\.828\.qmail@&#8203;foad\.org​:

Same errors with the released 5.7.0.

$ ./perl -I../lib ~abigail/Src/perl/t/op/int.t 1..14 ok 1 ok 2 ok 3 ok 4 ok 5 ok 6 ok 7 not ok 8 # int(4294967303.15) is 4294967302\, not 4294967303 not ok 9 # int(4294967303.15) is -4294967302\, not -4294967303 ok 10 ok 11 ok 12 ok 13 ok 14

So it seems that you are able to re-create bug 20010118.017. I wasn't. Not sure where to proceed from here - I think it's a gcc or libc issue\, but as I couldn't make that test fail on long doubles I couldn't track it further.

Nicholas Clark

p5pRT commented 22 years ago

From @jhi

At the moment I'm not using -ld at all.

out. I'll work on trying to improve atof2()\, but I suspect this is going to prove as troublesome a test candidate as any of the _ld platforms.

Hugo

p5pRT commented 22 years ago

From @vanstyn

Ok\, I'll try to come back to this one later.

Hugo

p5pRT commented 22 years ago

From @vanstyn

Whoops​: : if (seendigit && *s == 'e' || *s == 'E') {

I forgot to add the parens here.

Hugo

p5pRT commented 22 years ago

From @jhi

Even with the above\, the "3rd try"\, and the handy.h patch I still get in solaris/sparc/gcc​:

akvavitix​:/tmp/jhi/perl ; ./perl -Ilib -MPOSIX=strtod -MDevel​::Peek -wle '$n=strtod("3.14159");Dump($n);printf "%.32f\n"\, $n;printf "%.32f\n"\, 3.14159' SV = NV(0x124dd0) at 0x10b2b0   REFCNT = 1   FLAGS = (NOK\,pNOK)   NV = 3.14159 3.14158999999999988261834005243145 3.14159000000000032670754990249407 akvavitix​:/tmp/jhi/perl ; ./perl -Ilib -wle 'printf "%.32f\n"\, 3.14159+$_ for 0\, 2e-16\, 3e-16;use POSIX;printf "%.32f\n"\, POSIX​::strtod("3.14159")+$_ for 0\, 2e-16\, 3e-16' 3.14159000000000032670754990249407 3.14159000000000032670754990249407 3.14159000000000077079675975255668 3.14158999999999988261834005243145 3.14158999999999988261834005243145 3.14159000000000032670754990249407

and the t/lib/posix.t #14 failing. I'm not using long doubles.

Hugo

p5pRT commented 22 years ago

From @jhi

Change 10408 by jhi@​alpha on 2001/06/03 16​:52​:41

  Tweak the test to be happy if the accuracy is 'good enough'.

Affected files ...

... //depot/perl/t/lib/posix.t#20 edit

Differences ...

==== //depot/perl/t/lib/posix.t#20 (xtext) ==== Index​: perl/t/lib/posix.t

Inline Patch ```diff --- perl/t/lib/posix.t.~1~ Sun Jun 3 20:52:46 2001 +++ perl/t/lib/posix.t Sun Jun 3 20:52:46 2001 @@ -82,7 +82,8 @@ # Using long double NVs may introduce greater accuracy than wanted. $n =~ s/^3.1415(8999|9000)\d*$/3.14159/ if $Config{uselongdouble} eq 'define'; - print (($n == 3.14159) && ($x == 6) ? "ok 14\n" : "not ok 14\n"); + print ((abs($n - 3.14159) < 0.00001) && ($x == 6) ? + "ok 14\n" : "not ok 14\n"); &POSIX::setlocale(&POSIX::LC_NUMERIC, $lc) if $Config{d_setlocale}; } else { print "# strtod not present\n", "ok 14\n"; } End of Patch. ```
p5pRT commented 22 years ago

From [Unknown Contact. See original ticket]

  ...

Here's a file-creation patch.   ... +++ Porting/gdbmacros Wed May 30 11​:26​:27 2001

Is there progress being made on this issue?

I've noticed that the patch hasn't been applied yet. (are there concerns with the patch?)

p5pRT commented 22 years ago

From @vanstyn

Ok\, I think this is happening because we calculate 314159/10/10000. Attached patch improves this to calculate instead 314159/(10*10000)\, which should give the more accurate result.

Hugo

Inline Patch ```diff --- util.c.old Mon Jun 4 00:46:14 2001 +++ util.c Tue Jun 5 12:51:11 2001 @@ -4036,30 +4036,25 @@ NV S_mulexp10(NV value, I32 exponent) { - NV result = value; + NV result = 1.0; NV power = 10.0; + bool negative = 0; I32 bit; - if (exponent > 0) { - for (bit = 1; exponent; bit <<= 1) { - if (exponent & bit) { - exponent ^= bit; - result *= power; - } - power *= power; - } - } + if (exponent == 0) + return value; else if (exponent < 0) { + negative = 1; exponent = -exponent; - for (bit = 1; exponent; bit <<= 1) { - if (exponent & bit) { - exponent ^= bit; - result /= power; - } - power *= power; + } + for (bit = 1; exponent; bit <<= 1) { + if (exponent & bit) { + exponent ^= bit; + result *= power; } + power *= power; } - return result; + return negative ? value / result : value * result; } char* ```
p5pRT commented 22 years ago

From @jhi

Doug MacEachern had issues with the proposed macros.

p5pRT commented 22 years ago

From @rspier

Inline Patch ```diff --- pod/perltodo.pod.orig Thu Sep 20 22:29:06 2001 +++ pod/perltodo.pod Thu Sep 20 22:42:50 2001 @@ -824,4 +824,16 @@ Collation? http://www.unicode.org/unicode/reports/tr10/ Normalization? http://www.unicode.org/unicode/reports/tr15/ +=head2 Create debugging macros + +Debugging macros (like printsv, dump) can make debugging perl inside a +C debugger much easier. A good set for gdb comes with mod_perl. +Something similar should be distributed with perl. + +The proper way to do this is to use and extend Devel::DebugInit. +Devel::DebugInit also needs to be extended to support threads. + +See p5p archives for late May/early June 2001 for a recent discussion +on this topic. + =cut ```
p5pRT commented 22 years ago

From @jhi

Thanks\, applied.

--- pod/perltodo.pod.orig Thu Sep 20 22​:29​:06 2001 +++ pod/perltodo.pod Thu Sep 20 22​:42​:50 2001 @​@​ -824\,4 +824\,16 @​@​ Collation? http​://www.unicode.org/unicode/reports/tr10/ Normalization? http​://www.unicode.org/unicode/reports/tr15/

+=head2 Create debugging macros + +Debugging macros (like printsv\, dump) can make debugging perl inside a +C debugger much easier. A good set for gdb comes with mod_perl. +Something similar should be distributed with perl. + +The proper way to do this is to use and extend Devel​::DebugInit. +Devel​::DebugInit also needs to be extended to support threads. + +See p5p archives for late May/early June 2001 for a recent discussion +on this topic. + =cut