Inconsistent math with large numbers

Inconsistent math with large numbers #9553

Open p5pRT opened 16 years ago

p5pRT commented 16 years ago

Migrated from rt.perl.org#60318 (status was 'open')

Searchable as RT60318$

p5pRT commented 16 years ago

From @andk

Created by @andk

CPAN testers have found 12 versions of perl that believe that X89.2 is larger than X90, with X being 1234567891234567. This is not limited to use64bitint architectures and 64 bit boxes are not behaving better. Appending a .0 to the integer solves the bug on all architectures but this is considered unperlish at best.

Here are a few data points about the twelve perls. By prepending http://www.nntp.perl.org/group/perl.cpan.testers/ to the first column you get the full report to see further details.

    2537148 meta:perl[5.10.0] conf:archname[x86_64-linux] conf:ivtype[long]
    2537091 meta:perl[5.10.0@34437] conf:archname[i686-linux-64int] conf:ivtype[long long]
    2537089 meta:perl[5.8.8] conf:archname[i686-linux-64int] conf:ivtype[long long]
    2537074 meta:perl[5.8.7] conf:archname[i686-linux-thread-multi-64int] conf:ivtype[long long]
    2537062 meta:perl[5.8.8@34468] conf:archname[i686-linux-64int] conf:ivtype[long long]
    2537056 meta:perl[5.10.0] conf:archname[i686-linux-64int] conf:ivtype[long long]
    2530368 meta:perl[5.8.8] conf:archname[cygwin-thread-multi-64int] conf:ivtype[long long]
    2467838 meta:perl[5.8.8] conf:archname[IP35-irix-64int] conf:ivtype[long long]
    2467689 meta:perl[5.8.8] conf:archname[i386-freebsd-64int] conf:ivtype[long long]
    2466937 meta:perl[5.10.0] conf:archname[alpha-netbsd] conf:ivtype[long]
    2461059 meta:perl[5.8.4] conf:archname[sun4-solaris-64int] conf:ivtype[long long]
    2461018 meta:perl[5.10.0] conf:archname[i86pc-solaris-64int] conf:ivtype[long long]

Perl Info

```
Flags:
    category=core
    severity=medium

Site configuration information for perl 5.10.0:

Configured by sand at Tue Dec 18 18:56:35 CET 2007.

Summary of my perl5 (revision 5 version 10 subversion 0) configuration:
  Platform:
    osname=linux, osvers=2.6.22-1-k7, archname=i686-linux-64int
    uname='linux k75 2.6.22-1-k7 #1 smp sun jul 29 15:15:55 utc 2007 i686 gnulinux '
    config_args='-Dprefix=/home/src/perl/repoperls/installed-perls/perl/pVNtS9N/perl-5.8.0@32642 -Dinstallusrbinperl=n -Uversiononly -Dusedevel -des -Duse64bitint -Ui_db'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=undef, usemultiplicity=undef
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=define, use64bitall=undef, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2',
    cppflags='-fno-strict-aliasing -pipe -I/usr/local/include'
    ccversion='', gccversion='4.1.2 20061115 (prerelease) (Debian 4.1.1-21)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib /usr/lib64
    libs=-lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc
    perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc
    libc=/lib/libc-2.6.1.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.6.1'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -O2 -L/usr/local/lib'

Locally applied patches:
    patchaperlup: --branch='perl' --upto='32642' --start='17639'

@INC for perl 5.10.0:
    /home/src/perl/repoperls/installed-perls/perl/pVNtS9N/perl-5.8.0@32642/lib/5.10.0/i686-linux-64int
    /home/src/perl/repoperls/installed-perls/perl/pVNtS9N/perl-5.8.0@32642/lib/5.10.0
    /home/src/perl/repoperls/installed-perls/perl/pVNtS9N/perl-5.8.0@32642/lib/site_perl/5.10.0/i686-linux-64int
    /home/src/perl/repoperls/installed-perls/perl/pVNtS9N/perl-5.8.0@32642/lib/site_perl/5.10.0
    .

Environment for perl 5.10.0:
    HOME=/home/k
    LANG=de_DE.UTF-8
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/usr/lib/ccache:/home/k/bin:/usr/local/bin:/usr/lib/ccache:/usr/bin:/bin:/usr/bin/X11:/usr/games:/usr/local/perl/bin:/usr/X11/bin:/sbin:/usr/sbin
    PERL_BADLANG (unset)
    SHELL=/usr/bin/zsh
```

p5pRT commented 11 years ago

From @jkeenan

On Mon Nov 03 00:04:35 2008, andreas.koenig.7os6VVqR@franz.ak.mind.de wrote:

> This is a bug report for perl from andreas.koenig.7os6VVqR@franz.ak.mind.de, generated with the help of perlbug 1.36 running under perl 5.10.0.
>
> [Please enter your report here]
>
> CPAN testers have found 12 versions of perl that believe that X89.2 is larger than X90, with X being 1234567891234567. This is not limited to use64bitint architectures and 64 bit boxes are not behaving better. Appending a .0 to the integer solves the bug on all architectures but this is considered unperlish at best.
>
> Here are a few data points about the twelve perls. By prepending http://www.nntp.perl.org/group/perl.cpan.testers/ to the first column you get the full report to see further details.
>
> 2537148 meta:perl[5.10.0] conf:archname[x86_64-linux] conf:ivtype[long]

Composing a link pursuant to these instructions took me to:

http://www.cpantesters.org/static/recent.html

How should I proceed from there?

Would it be possible to provide some examples directly in RT?

Thank you very much. Jim Keenan

p5pRT commented 11 years ago

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented 11 years ago

From @tonycoz

On Sat, Feb 16, 2013 at 05:40:50PM -0800, James E Keenan via RT wrote:

> On Mon Nov 03 00:04:35 2008, andreas.koenig.7os6VVqR@franz.ak.mind.de wrote:
>
> > This is a bug report for perl from andreas.koenig.7os6VVqR@franz.ak.mind.de, generated with the help of perlbug 1.36 running under perl 5.10.0.
> >
> > [Please enter your report here]
> >
> > CPAN testers have found 12 versions of perl that believe that X89.2 is larger than X90, with X being 1234567891234567. This is not limited to use64bitint architectures and 64 bit boxes are not behaving better. Appending a .0 to the integer solves the bug on all architectures but this is considered unperlish at best.
> >
> > Here are a few data points about the twelve perls. By prepending http://www.nntp.perl.org/group/perl.cpan.testers/ to the first column you get the full report to see further details.
> >
> > 2537148 meta:perl[5.10.0] conf:archname[x86_64-linux] conf:ivtype[long]
>
> Composing a link pursuant to these instructions took me to:
>
> http://www.cpantesters.org/static/recent.html
>
> How should I proceed from there?

They can be found with a

http://www.cpantesters.org/cpan/report/

prefix, so the first is:

http://www.cpantesters.org/cpan/report/2537148

Newer reports can be found at:

http://www.cpantesters.org/distro/A/Acme-Study-Perl.html

(The inconsistencies don't cause test failures)

Tony

p5pRT commented 11 years ago

From @bulk88

On Mon Nov 03 00:04:35 2008, andreas.koenig.7os6VVqR@franz.ak.mind.de wrote:

> This is a bug report for perl from andreas.koenig.7os6VVqR@franz.ak.mind.de, generated with the help of perlbug 1.36 running under perl 5.10.0.
>
> [Please enter your report here]
>
> CPAN testers have found 12 versions of perl that believe that X89.2 is larger than X90, with X being 1234567891234567. This is not limited to use64bitint architectures and 64 bit boxes are not behaving better. Appending a .0 to the integer solves the bug on all architectures but this is considered unperlish at best.

cygperl 5.14.2 32 bit with 64 IVs
___________________________________________
Administrator@dl585 ~
$ perl -e "print '123456789123456789.2' > '123456789123456790'"
1
___________________________________________
osname=MSWin32, osvers=5.1, archname=MSWin32-x86-multi-thread
uname='Win32 strawberryperl 5.12.3.0 #1 Sun May 15 09:44:53 2011 i386'
___________________________________________
C:\>perl -e "print '123456789123456789.2' > '123456789123456790'"

C:\>
___________________________________________
Visual C x64 build
Summary of my perl5 (revision 5 version 17 subversion 9 patch blead 2013-02-16.18:14:53 2f8114fb08248fa8661a45c7e473b59c7e633458 v5.17.8-169-g2f8114f) configuration:
Snapshot of: 2f8114fb08248fa8661a45c7e473b59c7e633458
Platform:
osname=MSWin32, osvers=5.2, archname=MSWin32-x64-multi-thread
___________________________________________
C:\p517\o1\bin>perl -e "print '123456789123456789.2' > '123456789123456790'"
1
C:\p517\o1\bin>perl -V
___________________________________________
Summary of my perl5 (revision 5 version 10 subversion 0) configuration:
Platform:
osname=MSWin32, osvers=5.00, archname=MSWin32-x86-multi-thread
ActivePerl Build 1003 [285500]
___________________________________________
C:\>perl -e "print '123456789123456789.2' > '123456789123456790'"

C:\>
___________________________________________
VC 32 build
Summary of my perl5 (revision 5 version 17 subversion 7 patch blead 2012-12-06.16:42:20 93a641ae382638ffd1980378be4810244d04f4b0 v5.17.6-186-g93a641a) configuration:
Snapshot of: 93a641ae382638ffd1980378be4810244d04f4b0
Platform:
osname=MSWin32, osvers=5.1, archname=MSWin32-x86-multi-thread

___________________________________________
C:\>perl -e "print '123456789123456789.2' > '123456789123456790'"

C:\>
___________________________________________

my conclusion, only happens if 64 bit IVs, regardless of ptr size

-- bulk88 ~ bulk88 at hotmail.com

p5pRT commented 11 years ago

From @bulk88

On Sat Feb 16 19:00:16 2013, bulk88 wrote:

> my conclusion, only happens if 64 bit IVs, regardless of ptr size

On an x64 perl, the CVTSI2SD instruction ("Convert one signed quadword integer from r/m64 to one double precision floating-point value in xmm.") at http://perl5.git.perl.org/perl.git/blob/480c67241f1595db244990ae2327660f1eec3602:/sv.c#l2523 was used to convert

+ xiv_u {xivu_iv=123456789123456790 xivu_uv=123456789123456790 xivu_namehek=0x01b69b4bacd05f16 } _xivu

from 64b int to NV. The result of CVTSI2SD was

XMM0DL = +1.23456789123457E+017
XMM0 = 0000000000000000437B69B4BACD05F1

lnv took the macro shortcut since it already was an NV, and didn't call 2nv at http://perl5.git.perl.org/perl.git/blob/480c67241f1595db244990ae2327660f1eec3602:/pp.c#l2035

lnv as I64 is 0x437b69b4bacd05f2

SV * left's nv member is

+ xnv_u {xnv_nv=1.2345678912345680e+017 xgv_stash=0x437b69b4bacd05f2 xpad_cop_seq={...} ...} _xnvu

The sample script was:

    $m = 123456789123456789.2;
    $m2 = 123456789123456790;
    print $m > $m2;

It's FP round-off; this is confirmed by "log(123456789123456790) / log(2) = 56.7767838". 56 > 53 (the limit for storing an int in a double without precision loss).
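The 53-bit limit can be checked directly. The following is only a minimal sketch, assuming IEEE 754 doubles and a perl with 64-bit IVs; `as_double` is a hypothetical helper (not part of perl) that round-trips a value through `pack 'd'` so that a native C double is used regardless of the configured nvtype.

```perl
use strict;
use warnings;

# Hypothetical helper: force a value through a native C double.
sub as_double {
    my ($n) = @_;
    return unpack 'd', pack 'd', $n;
}

# Number of bits needed for the integer from the report.
my $bits = log(123456789123456790) / log(2);
printf "bits needed: %.4f (a double carries 53 significand bits)\n", $bits;

# 2**53 + 1 is the first integer a double cannot hold exactly.
printf "%.0f\n", as_double(9007199254740993);      # 9007199254740992

# The integer operand of the comparison after conversion to a double.
printf "%.0f\n", as_double(123456789123456790);    # 123456789123456784
```

The last line matches the CVTSI2SD result shown above (0x437B69B4BACD05F1).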

I am not sure what can be done here. If one side of the script cmp IV is over 2^53 and the other is an NV, do the comparison as an IV cmp not a NV cmp?

-- bulk88 ~ bulk88 at hotmail.com

p5pRT commented 11 years ago

From @arc

bulk88 via RT <perlbug-followup@perl.org> wrote:

> I am not sure what can be done here. If one side of the script cmp IV is over 2^53 and the other is an NV, do the comparison as an IV cmp not a NV cmp?

I don't think that would help in every case of this sort.

    my $iv = 123456789123456784;
    my $nv = 123456789123456784.1;
    ok($nv > $iv);

Mathematically, that ought to be true, but neither converting $iv to an NV nor $nv to a (64-bit) IV will make the comparison yield the intended result.

I think the answer is that you have to use something like bignum or bigrat if you want precise non-integer arithmetic on large numbers (or precise integer arithmetic on very large numbers). I favour resolving this ticket.

-- Aaron Crane ** http://aaroncrane.co.uk/

p5pRT commented 11 years ago

From @andk

"Aaron Crane via RT" \perlbug\-followup@perl\.org writes:

> bulk88 via RT <perlbug-followup@perl.org> wrote:
>
> > I am not sure what can be done here. If one side of the script cmp IV is over 2^53 and the other is an NV, do the comparison as an IV cmp not a NV cmp?
>
> I don't think that would help in every case of this sort.
>
>     my $iv = 123456789123456784;
>     my $nv = 123456789123456784.1;
>     ok($nv > $iv);
>
> Mathematically, that ought to be true, but neither converting $iv to an NV nor $nv to a (64-bit) IV will make the comparison yield the intended result.
>
> I think the answer is that you have to use something like bignum or bigrat if you want precise non-integer arithmetic on large numbers (or precise integer arithmetic on very large numbers). I favour resolving this ticket.

I'm sorry, I can't follow you. The example in the original ticket is quite fundamentally different from yours. Can you be more specific what you believe is going wrong or not going wrong before you jump to conclusions?

-- andreas

p5pRT commented 11 years ago

From @arc

Andreas Koenig <andreas.koenig.7os6VVqR@franz.ak.mind.de> wrote:

> I'm sorry, I can't follow you. The example in the original ticket is quite fundamentally different from yours. Can you be more specific what you believe is going wrong or not going wrong before you jump to conclusions?

Andreas, I apologise; I picked up the ">" example from bulk88's message. I should have read Acme::Study::Perl and the relevant CPAN Testers reports more carefully.

AIUI, the first example in t/studyperl.t for A::S::P 0.0.2 ultimately does this:

    eval q/ "123456789123456789.2" <=> "123456789123456790" /;

As you point out, actually running that (on at least some Perls) produces a positive return value for this three-way comparison, saying that it considers X+89.2 > X+90 where X == 123456789123456700.

While this is mathematically unexpected, I don't think it represents a bug in Perl. The eighteen-digit numbers in question are too big to have a precise representation in a typical 64-bit floating-point value (since those have a precision limit of 53 bits, or about sixteen decimal digits). So machine floating-point comparisons on them can't be expected to produce precise answers.

On the other hand, converting such numbers to integers will work (if 64-bit integers available), at the cost of (a) necessarily losing any non-integer part, and (b) also losing the rightmost 11 bits of integer precision. So converting large floating-point values to machine integers also can't be expected to produce precise answers.

This program demonstrates what I mean:

    for my $i (123456789123456776 .. 123456789123456792) {
        my $f = $i + 0.25;
        printf "%d %20.1f %s\n", $i, $f, unpack "H*", pack 'F', $f;
    }

When run on a Perl with 64-bit IV, the ".." loop will use precise integer arithmetic. The loop body then adds 0.25 to the integer value; that must necessarily use floating-point arithmetic, so the sum is floating-point. It then prints out the original integer, the sum, and a hex string containing the internal representation of the floating-point sum. On my Perl (little-endian, 64-bit IV, 64-bit NV), we get the following output:

    123456789123456776 123456789123456768.0 f005cdbab4697b43
    123456789123456777 123456789123456784.0 f105cdbab4697b43
    123456789123456778 123456789123456784.0 f105cdbab4697b43
    123456789123456779 123456789123456784.0 f105cdbab4697b43
    123456789123456780 123456789123456784.0 f105cdbab4697b43
    123456789123456781 123456789123456784.0 f105cdbab4697b43
    123456789123456782 123456789123456784.0 f105cdbab4697b43
    123456789123456783 123456789123456784.0 f105cdbab4697b43
    123456789123456784 123456789123456784.0 f105cdbab4697b43
    123456789123456785 123456789123456784.0 f105cdbab4697b43
    123456789123456786 123456789123456784.0 f105cdbab4697b43
    123456789123456787 123456789123456784.0 f105cdbab4697b43
    123456789123456788 123456789123456784.0 f105cdbab4697b43
    123456789123456789 123456789123456784.0 f105cdbab4697b43
    123456789123456790 123456789123456784.0 f105cdbab4697b43
    123456789123456791 123456789123456784.0 f105cdbab4697b43
    123456789123456792 123456789123456800.0 f205cdbab4697b43

Note that everything in X+77 .. X+91 inclusive has the same representation as a floating point; the leading "f1" is only one bit away from the leading "f0" seen in X+76, so that single-bit difference is the limit of floating-point precision for numbers at this scale. The X+84 value emitted for all the floating-point sums is (roughly) in the middle of the range that the relevant representation mathematically represents.
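The width of that range follows from the exponent. As a rough sketch (assuming IEEE 754 doubles with 52 explicit significand bits and a perl with 64-bit IVs; the variable names are illustrative only), the gap between neighbouring doubles at this magnitude can be computed like this:

```perl
use strict;
use warnings;
use POSIX qw(floor);

my $x = 123456789123456784;    # the value that X+77 .. X+91 collapse to

# Neighbouring doubles near $x are 2**(e - 52) apart,
# where e is the binary exponent of $x.
my $e       = floor(log($x) / log(2));
my $spacing = 2 ** ($e - 52);

printf "exponent %d, spacing %d\n", $e, $spacing;
printf "neighbours: %d and %d\n", $x - $spacing, $x + $spacing;
```

On such a perl this should report an exponent of 56 and a spacing of 16, with neighbours 123456789123456768 and 123456789123456800, which are the first and last values in the output above.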

That's why I say this can't be meaningfully fixed without using some kind of rational or bignum representation for numbers. In principle, I dare say Perl 5 could perhaps be reworked to use such a representation internally where necessary. But the current trade-off is to use machine arithmetic internally, and allow users to get precise (but slow) arithmetic with the bigint, bigrat, or bignum pragmas.
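For illustration, here is a small sketch of the precise (but slow) route, using the core Math::BigFloat module directly rather than a pragma. Both operands are built from strings so that no digits are lost before the module sees them.

```perl
use strict;
use warnings;
use Math::BigFloat;

# Construct from strings; a bare numeric literal would already
# have been rounded to a machine number by the time new() sees it.
my $x = Math::BigFloat->new('123456789123456789.2');
my $y = Math::BigFloat->new('123456789123456790');

# bcmp returns -1, 0 or 1, like <=>.
my $cmp = $x->bcmp($y);
print "exact comparison: $cmp\n";    # -1, i.e. X+89.2 < X+90
```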

-- Aaron Crane ** http://aaroncrane.co.uk/

p5pRT commented 11 years ago

From @andk

"Aaron Crane via RT" \perlbug\-followup@perl\.org writes:

> Andreas, I apologise;

Accepted;)

> While this is mathematically unexpected, I don't think it represents a bug in Perl. The eighteen-digit numbers in question are too big to have a precise representation in a typical 64-bit floating-point value (since those have a precision limit of 53 bits, or about sixteen decimal digits). So machine floating-point comparisons on them can't be expected to produce precise answers.

There are imprecise answers and there are outright miserable answers. Saying two very large numbers are equal is imprecise, but saying the lower is the larger is miserable.

> On the other hand, converting such numbers to integers will work (if 64-bit integers available), at the cost of (a) necessarily losing any non-integer part, and (b) also losing the rightmost 11 bits of integer precision. So converting large floating-point values to machine integers also can't be expected to produce precise answers.

Converting to integers would also be wrong. If the user expects a precise answer we can tell him where to go. That's probably not the problem we have to solve.

> This program demonstrates what I mean:
>
>     for my $i (123456789123456776 .. 123456789123456792) {
>         my $f = $i + 0.25;
>         printf "%d %20.1f %s\n", $i, $f, unpack "H*", pack 'F', $f;
>     }
>
> When run on a Perl with 64-bit IV, the ".." loop will use precise integer arithmetic. The loop body then adds 0.25 to the integer value; that must necessarily use floating-point arithmetic, so the sum is floating-point. It then prints out the original integer, the sum, and a hex string containing the internal representation of the floating-point sum. On my Perl (little-endian, 64-bit IV, 64-bit NV), we get the following output:
>
>     123456789123456776 123456789123456768.0 f005cdbab4697b43
>     123456789123456777 123456789123456784.0 f105cdbab4697b43
>     123456789123456778 123456789123456784.0 f105cdbab4697b43
>     123456789123456779 123456789123456784.0 f105cdbab4697b43
>     123456789123456780 123456789123456784.0 f105cdbab4697b43
>     123456789123456781 123456789123456784.0 f105cdbab4697b43
>     123456789123456782 123456789123456784.0 f105cdbab4697b43
>     123456789123456783 123456789123456784.0 f105cdbab4697b43
>     123456789123456784 123456789123456784.0 f105cdbab4697b43
>     123456789123456785 123456789123456784.0 f105cdbab4697b43
>     123456789123456786 123456789123456784.0 f105cdbab4697b43
>     123456789123456787 123456789123456784.0 f105cdbab4697b43
>     123456789123456788 123456789123456784.0 f105cdbab4697b43
>     123456789123456789 123456789123456784.0 f105cdbab4697b43
>     123456789123456790 123456789123456784.0 f105cdbab4697b43
>     123456789123456791 123456789123456784.0 f105cdbab4697b43
>     123456789123456792 123456789123456800.0 f205cdbab4697b43

This is a very nice demonstration of what's going on, thank you.

> Note that everything in X+77 .. X+91 inclusive has the same representation as a floating point; the leading "f1" is only one bit away from the leading "f0" seen in X+76, so that single-bit difference is the limit of floating-point precision for numbers at this scale. The X+84 value emitted for all the floating-point sums is (roughly) in the middle of the range that the relevant representation mathematically represents.
>
> That's why I say this can't be meaningfully fixed without using some kind of rational or bignum representation for numbers. In principle, I dare say Perl 5 could perhaps be reworked to use such a representation internally where necessary. But the current trade-off is to use machine arithmetic internally, and allow users to get precise (but slow) arithmetic with the bigint, bigrat, or bignum pragmas.

We have three answers:

    X+89.2 > X+90      # (A) some perls believe this is true
    X+89.2 == X+90.0   # (B) all perls agree on this one (iirc)
    X+89.2 < X+90      # (C) bigfloats can deal with this

I know where I can get (C): with big modules. This is not the task to solve.

But I find it unacceptable that (A) can happen while (B) also can happen. It is unperlish that the user has to append the ".0". And if "we" find it acceptable, then it should be documented. Is it?
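To see which of (A) and (B) a given perl produces, a probe along these lines could be used. It is only a sketch: the operands are strings, as in the Acme::Study::Perl test, so the conversion happens at run time, and the result depends on the perl build, so no expected output is given.

```perl
use strict;
use warnings;

my $lhs = '123456789123456789.2';

# Each value is the result of <=>: 1 means "greater", 0 "equal", -1 "less".
my %case = (
    'X+89.2 <=> X+90'   => $lhs <=> '123456789123456790',
    'X+89.2 <=> X+90.0' => $lhs <=> '123456789123456790.0',
);

for my $label (sort keys %case) {
    printf "%-18s %2d\n", $label, $case{$label};
}
```

A result of 1 on the first line corresponds to answer (A), and 0 on the second line to answer (B).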

-- andreas