Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/

Math::BigFloat too low precision #5117

Closed p5pRT closed 21 years ago

p5pRT commented 22 years ago

Migrated from rt.perl.org#8641 (status was 'resolved')

Searchable as RT8641$

p5pRT commented 22 years ago

From ilya@math.ohio-state.edu

This is a bug report for perl from vera@ia-ia.nonet, generated with the help of perlbug 1.33 running under perl v5.7.2.

use Math::BigFloat;

Math::BigFloat::precision(80);

$x = .2345678901234567;
$x++;
print "expect = 1.2345678901234567 or so\n";
printf "binary = %.20g\nperl   = %s\n", $x, $x;

$x = Math::BigFloat->new($x);
print "bigfl  = $x\n";

This prints:

expect = 1.2345678901234567 or so
binary = 1.2345678901234566904
perl   = 1.23456789012346
bigfl  = 1.23456789012346

As one can see, the precision of the NV is as expected: close to 18 significant digits (counting the leading 1). However, the precision of the generated BigFloat is some 330 times worse. Surely the situation should have been the opposite...


Flags:   category=core   severity=medium


Site configuration information for perl v5.7.2:

Configured by vera at Wed Feb 13 15:23:25 PST 2002.

Summary of my perl5 (revision 5.0 version 7 subversion 2 patch 14574) configuration:
  Platform:
    osname=os2, osvers=2.30, archname=os2
    uname='os2 ia-ia 2 2.30 i386 '
    config_args='-des -Dusedevel -Dprefix=i:/perllib'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=y, bincompat5005=define
  Compiler:
    cc='gcc', ccflags ='-Zomf -Zmt -DDOSISH -DOS2=2 -DEMBED -I. -D_EMX_CRT_REV_=63 -I/usr/local/include',
    optimize='-O2 -fomit-frame-pointer -malign-loops=2 -malign-jumps=2 -malign-functions=2 -s',
    cppflags='-Zomf -Zmt -DDOSISH -DOS2=2 -DEMBED -I. -D_EMX_CRT_REV_=63 -I/usr/local/include'
    ccversion='', gccversion='2.8.1', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=4
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags ='-Zexe -Zomf -Zmt -Zcrtdll -Zstack 32000 -Zlinker /e:2'
    libpth=i:/emx.add/lib i:/emx/lib D:/DEVTOOLS/OPENGL/LIB I:/JAVA11/LIB i:/emx/lib/mt
    libs=-lsocket -lm -lbsd -lcrypt
    perllibs=-lsocket -lm -lbsd -lcrypt
    libc=i:/emx/lib/mt/c_import.lib, so=dll, useshrplib=true, libperl=libperl.lib
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=dll, d_dlsymun=undef, ccdlflags=' '
    cccdlflags='-Zdll', lddlflags='-Zdll -Zomf -Zmt -Zcrtdll -Zlinker /e:2'

Locally applied patches:
  DEVEL14574

@INC for perl v5.7.2:
  lib/os2
  lib
  i:/perllib/lib/5.7.2/os2
  i:/perllib/lib/5.7.2
  i:/perllib/lib/site_perl/5.7.2/os2
  i:/perllib/lib/site_perl/5.7.2
  i:/perllib/lib/site_perl/5.00553/os2
  i:/perllib/lib/site_perl/5.00553
  i:/perllib/lib/site_perl
  .

Environment for perl v5.7.2:
  HOME=j:/home
  LANG=EN_US
  LANGUAGE (unset)
  LD_LIBRARY_PATH (unset)
  LOGDIR (unset)
  PATH=...
  PERLLIB_PREFIX=f:/perllib;i:/perllib
  PERL_BADLANG (unset)
  PERL_SH_DIR=i:/bin
  SHELL (unset)

p5pRT commented 22 years ago

From [Unknown Contact. See original ticket]


Moin,

use Math::BigFloat;

Math::BigFloat::precision(80);

That must be written as Math::BigFloat->precision(80); the reason is that Math::BigFloat simply inherits precision() from BigInt, and written your way it thus stuffs the precision into BigInt, not BigFloat.

$x = .2345678901234567;
$x++;
print "expect = 1.2345678901234567 or so\n";
printf "binary = %.20g\nperl   = %s\n", $x, $x;

$x = Math::BigFloat->new($x);
print "bigfl  = $x\n";

But $x is already truncated, so BigInt only gets what you give it. There is no way it could extend this to 80 digits.

I think it should match the binary, but I am not sure how the number gets transformed when BigInt 'looks' at it. I think it uses Perl stringification to check the validity of the number. So the result is actually what I would expect.
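[Editor's note: the truncation being discussed is not Perl-specific. A small Python sketch (an analogy added here, not part of the original report) shows how default-width stringification drops digits that the underlying IEEE double still carries, which is exactly what BigFloat then receives:]

```python
# Hypothetical illustration: an IEEE double carries ~17 significant
# digits, but default-style formatting (like Perl's plain "%s"/%g
# stringification) keeps only 15 of them.
x = .2345678901234567
x += 1

print(f"{x:.20g}")   # every digit the stored double can justify
print(f"{x:.15g}")   # 1.23456789012346 -- the truncated form

# 20 significant digits are enough to round-trip the double exactly:
assert float(f"{x:.20g}") == x
```

The 15-digit form is lossy: parsing it back does not recover the original double, which is the gap the bug report is measuring.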

Anyway, I think you stumbled over a real bug:

  #!/usr/bin/perl -w

  use Math::BigFloat;
  Math::BigFloat->precision(80);
  print "precision ", Math::BigFloat->precision(), "\n";

  $x = .2345678901234567;
  $x++;
  print "expect = 1.2345678901234567 or so\n";
  printf "binary = %.20g\nperl   = %s\n", $x, $x;

  $x = Math::BigFloat->new($x);
  print "bigfl  = $x\n";

This prints:

  precision 80
  expect = 1.2345678901234567 or so
  binary = 1.2345678901234566904
  perl   = 1.23456789012346
  bigfl  = 0

The last line is certainly not right... Investigating while I do some other work...

Cheers,

Tels

-- perl -MDev::Bollocks -e'print Dev::Bollocks->rand(),"\n"' proactively coordinate turn-key networks

http://bloodgate.com/perl My current Perl projects PGP key available on http://bloodgate.com/tels.asc or via email


p5pRT commented 22 years ago

From [Unknown Contact. See original ticket]


Moin,

On 23-Feb-02 Tels tried to scribble about:

[full quote of the previous message snipped; it ended:]

    bigfl  = 0

The last line is certainly not right... Investigating while I do some other work...

Duh! I forgot to include the =head2 precision() section for easy lookup. I am confusing my own code *sigh*

The reason that it prints 0 is simple: 80 digits of precision *after* the dot is:

  Math::BigFloat->precision(-80);

(Don't blame me, blame the original authors, who introduced this weird notation ;) I would like to change it. Unfortunately, bfround() has always operated that way, so I kept it consistent.) precision(80) rounds to the 80th digit left of the dot.

What you really want is (keeping 80 digits regardless of where the dot is):

  Math::BigFloat->accuracy(80);

So:

  te@null:~ > cat prec.pl
  #!/usr/bin/perl -w

  use Math::BigFloat;

  print Math::BigFloat->accuracy(80), "\n";

  $x = .2345678901234567;
  $x++;
  print "expect = 1.2345678901234567 or so\n";
  printf "binary = %.20g\nperl   = %s\n", $x, $x;

  $x = Math::BigFloat->new($x);
  print "bigfl  = $x\n";

  te@null:~ > perl prec.pl
  80
  expect = 1.2345678901234567 or so
  binary = 1.2345678901234566904
  perl   = 1.23456789012346
  bigfl  = 1.23456789012346000000000000000000000000000000000000000000000000000000000000000000

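[Editor's note: the precision-vs-accuracy split described above has a rough analogue in Python's decimal module (an added illustration, not Perl code): the context precision plays the role of accuracy() (significant digits wherever the point is), while quantize() plays the role of precision() (rounding at a fixed place relative to the decimal point):]

```python
from decimal import Decimal, getcontext

# accuracy()-style: keep N significant digits, wherever the point sits.
getcontext().prec = 20
x = Decimal(1) / Decimal(3)
print(x)   # 0.33333333333333333333  (20 significant digits)

# precision()-style: round to a fixed digit relative to the decimal
# point (here 4 places after it, like a negative BigFloat precision).
y = Decimal("1.23456789").quantize(Decimal("1e-4"))
print(y)   # 1.2346
```

The analogy is loose (assumed for illustration only), but it shows why the two settings answer different questions and cannot substitute for each other.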

Cheers,

Tels


p5pRT commented 22 years ago

From [Unknown Contact. See original ticket]

On Sat, Feb 23, 2002 at 12:43:53PM +0100, Tels wrote:

$x = .2345678901234567;
$x++;
print "expect = 1.2345678901234567 or so\n";
printf "binary = %.20g\nperl   = %s\n", $x, $x;

$x = Math::BigFloat->new($x);
print "bigfl  = $x\n";

But $x is already truncated,

...to 18 digits.

so BigInt only gets what you give it.

No, it gets things 330 times worse than what I gave it.

There is no way it could extend this to 80 digits.

Of course any decimal can be extended to any length: just add 0s. But BigFloat is extending a *wrong* number, not the one given to it.

I think it should match the binary, but I am not sure how the number gets transformed when BigInt 'looks' at it. I think it uses Perl stringification to check the validity of the number. So the result is actually what I would expect.

Well, then your expectations should be fixed too...

Ilya

p5pRT commented 22 years ago

From [Unknown Contact. See original ticket]


Moin,

On 23-Feb-02 Ilya Zakharevich tried to scribble about:

On Sat, Feb 23, 2002 at 12:43:53PM +0100, Tels wrote:

$x = .2345678901234567;
$x++;
print "expect = 1.2345678901234567 or so\n";
printf "binary = %.20g\nperl   = %s\n", $x, $x;

$x = Math::BigFloat->new($x);
print "bigfl  = $x\n";

But $x is already truncated,

...to 18 digits

BigInt gets the same number as the one Perl prints. If I can somehow access the extra digits, then please tell me how (I am no Perl expert).

so BigInt only gets what you give it.

No, it gets things 330 times worse than what I gave it.

How do you arrive at exactly 330, anyway?

There is no way it could extend this to 80 digits.

Of course any decimal can be extended to any length: just add 0s.

It does this; I meant "extend it by the real digits", aka:

  Math::BigInt->new(1.23456789012345678901234567890);

By the time the number reaches BigInt, certain digits have already been truncated. There is no way to get them back.

But BigFloat is extending a *wrong* number, not the one given to it.

So, please tell me how to access the number given to BigInt.

I think it should match the binary, but I am not sure how the number gets transformed when BigInt 'looks' at it. I think it uses Perl stringification to check the validity of the number. So the result is actually what I would expect.

Well, then your expectations should be fixed too...

So, please tell me how.

Cheers,

Tels


p5pRT commented 22 years ago

From [Unknown Contact. See original ticket]

On Sat, Feb 23, 2002 at 10:07:07PM +0100, Tels wrote:

$x = .2345678901234567;
$x++;
print "expect = 1.2345678901234567 or so\n";
printf "binary = %.20g\nperl   = %s\n", $x, $x;

$x = Math::BigFloat->new($x);
print "bigfl  = $x\n";

But $x is already truncated,

...to 18 digits

BigInt gets the same number as the one Perl prints.

Who cares what Perl prints? The only purpose of the printed value is to pacify the "unwashed masses", who could otherwise flood the support channels with "Why do I get 1.10000000000000009 when I asked for 1.1?".

If I can somehow access the extra digits, then please tell me how

One could use printf '%.Ng' with an appropriate N (given by precision?). But this works only if you know that the double value is "better" than the string one.
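[Editor's note: the '%.Ng' trick generalizes: for IEEE doubles, 17 significant digits always round-trip the stored value exactly. A quick Python sketch of the same idea (an added analogy, not the original printf):]

```python
# Hypothetical illustration of recovering the "extra digits" of a double.
x = 1.1

print(f"{x:.17g}")   # 1.1000000000000001 -- the stored value surfaces
print(f"{x:.15g}")   # 1.1 -- default-width formatting hides the excess

# 17 significant digits always suffice to recover the exact double:
assert float(f"{x:.17g}") == x
```

So a BigFloat-style constructor that formatted its float argument with 17 significant digits would see the value the double actually holds, not the shortened display form.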

(I am no Perl expert)

Yes, I could see it: yesterday I looked through Math::BigInt. Sorry for the news, but IMO this is something absolutely horrible... It explains why the code is so slow (I would expect a speedup of circa 100x w.r.t. the old version, and IIRC you report a speedup of 3x).

I can see obvious progress, but I hoped for something better... Well, maybe some time somebody (you? ;-) will improve it yet more - though the choice of the API hinders improvements a lot...

so BigInt only gets what you give it.

No, it gets things 330 times worse than what I gave it.

How do you arrive at exactly 330, anyway?

expect = 1.2345678901234567 or so
binary = 1.2345678901234566904
bigfl  = 1.23456789012346

Perl rounds with error 1.2345678901234567 - 1.2345678901234566904, which is 9.6e-18. Math::BigFloat's rounding error is 3.3e-15, which is 343.75 times worse than necessary.
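[Editor's note: the 343.75 figure checks out mechanically. A small verification (added here, using Python's decimal module precisely to avoid the binary rounding under discussion):]

```python
from decimal import Decimal

intended = Decimal("1.2345678901234567")      # what the user typed
stored   = Decimal("1.2345678901234566904")   # what the double holds (%.20g)
printed  = Decimal("1.23456789012346")        # Perl's default stringification

err_stored  = abs(intended - stored)    # 9.6e-18
err_printed = abs(intended - printed)   # 3.3e-15

print(err_printed / err_stored)         # 343.75
```

That is the ratio between the error of the stringified value (which BigFloat receives) and the error of the stored double itself.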

It does this; I meant "extend it by the real digits", aka:

  Math::BigInt->new(1.23456789012345678901234567890);

By the time the number reaches BigInt, certain digits have already been truncated. There is no way to get them back.

As I showed, you can get back 2.5 more digits (at least in my example).

Ilya

p5pRT commented 22 years ago

From [Unknown Contact. See original ticket]


Moin,

On 24-Feb-02 Ilya Zakharevich tried to scribble about:

On Sat, Feb 23, 2002 at 10:07:07PM +0100, Tels wrote:

$x = .2345678901234567;
$x++;
print "expect = 1.2345678901234567 or so\n";
printf "binary = %.20g\nperl   = %s\n", $x, $x;

$x = Math::BigFloat->new($x);
print "bigfl  = $x\n";

But $x is already truncated,

...to 18 digits

BigInt gets the same number as the one Perl prints.

Who cares what Perl prints? The only purpose of the printed value is to pacify the "unwashed masses", who could otherwise flood the support channels with "Why do I get 1.10000000000000009 when I asked for 1.1?".

I care. Some points:

First, if I implement it so that BigInt takes what your binary prints, then you *will* get questions like:

  Perl prints 1.1, why does BigFloat make out of this 1.1000000009?

The real solution is to *NOT* use Perl at all: input the number via :constant or as a string into BigFloat, and never leave it to Perl. Anything else just calls for trouble. You never know (well, not very predictably) when Perl rounds or loses precision/accuracy, and when not. So, don't do that.
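[Editor's note: Python's decimal module documents the same recommendation, and shows why (an added analogy, not Perl): constructing from a string preserves the intended decimal exactly, while constructing from a native float captures the binary approximation:]

```python
from decimal import Decimal

print(Decimal("1.1"))   # 1.1 -- the string carries exactly what was meant
print(Decimal(1.1))     # the binary artifact:
# 1.100000000000000088817841970012523233890533447265625

# The two are genuinely different numbers:
assert Decimal("1.1") != Decimal(1.1)
```

Feeding the arbitrary-precision type a string sidesteps the host language's float machinery entirely, which is precisely the ":constant or string" advice above.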

If I can somehow access the extra digits, then please tell me how

One could use printf '%.Ng' with an appropriate N (given by precision?). But this works only if you know that the double value is "better" than the string one.

Ah, and *how* do you know? And how do you know that the double value not only has more digits, but also the right ones? In a case like

  1.1 vs. 1.10000000000009

which is the right value? Did the user say:

  Math::BigFloat->new(1.23);

or did he say:

  Math::BigFloat->new(1.229999999999999999999);

BigFloat cannot decide this; it is something Perl must do. When Perl touches the numbers and mangles them (or if the float library does so), then Perl had better give me the original input back.

(I am no Perl expert)

Yes, I could see it: yesterday I looked through Math::BigInt. Sorry for the news, but IMO this is something absolutely horrible...

I *knew* you would say that. I get the feeling you say this about every piece of code not written by you, but that could be just me *puts on flame-proof undies*

It explains why the code is so slow (I would expect a speedup of circa 100x w.r.t. the old version, and IIRC you report a speedup of 3x).

Well, please go read:

  the GOALS file in the CPAN distribution
  http://bloodgate.com/perl/bigint/benchmarks.html

Three points:

* Patches welcome.
* 100x vs. 3x vs. the old version: this differs so much for the different operations (hell, some are 1000 times faster) that a general roundup like "version A is x times faster than version B" is totally irrelevant. You have to look at the different operations, how often they are used, and what replacements there are (f.i. usually you do "$x" & 1 to test for odd, which is horribly slower than $x->is_odd()). Benchmarking/comparing things is more than measuring how long 1+1 takes.
* The library is optimized for large/huge numbers, not for small ones.
* There is no point four, because I could go on for hours...

I can see obvious progress, but I hoped for something better...

Where were your patches/comments/feedback/criticism?

Well, maybe some time somebody (you? ;-) will improve it yet more - though the choice of the API hinders improvements a lot...

If I could, I would rip out some of the old API, but I can't due to backward compatibility. Not all of it is my fault.

Apart from that: what would you propose? How would you improve the API? And how would that improve the module?

so BigInt only gets what you give it.

No, it gets things 330 times worse than what I gave it.

How do you arrive at exactly 330, anyway?

expect = 1.2345678901234567 or so
binary = 1.2345678901234566904
bigfl  = 1.23456789012346

Perl rounds with error 1.2345678901234567 - 1.2345678901234566904, which is 9.6e-18. Math::BigFloat's rounding error is 3.3e-15, which is 343.75 times worse than necessary.

343.75 != 330, not even when you round it. Anyway, that is irrelevant, even if it were 1000000000 times.

If even Perl doesn't know exactly what the real value is, how can BigFloat? I would rather not start "guessing" what the input meant.

It does this; I meant "extend it by the real digits", aka:

  Math::BigInt->new(1.23456789012345678901234567890);

By the time the number reaches BigInt, certain digits have already been truncated. There is no way to get them back.

As I showed, you can get back 2.5 more digits (at least in my example).

Yeah, and in some other examples you get back even more (1.23 vs 1.2999999), but are those the right digits?

Cheers,

Tels


p5pRT commented 22 years ago

From [Unknown Contact. See original ticket]


Moin,

Lame, replying to myself, I know:

On 24-Feb-02 Tels tried to scribble about:

[full quote of the previous exchange snipped; it ended:]

The real solution is to *NOT* use Perl at all: input the number via :constant or as a string into BigFloat, and never leave it to Perl. Anything else just calls for trouble. You never know (well, not very predictably) when Perl rounds or loses precision/accuracy, and when not. So, don't do that.

Some might argue that Perl is pretty predictable. But when you go cross-platform, you lose that pretty fast. Some platforms don't have FPU hardware, so they do it in software; some have weird formats (e.g. not IEEE); some have fewer/more bits of accuracy; some have bugs; etc. etc.

So, the real solution is not to leave it to Perl. Anything else won't work; regardless of what I do in BigFloat, it would end up in some sort of brushing-under-the-rug dance like Perl does now.

You'll note that in the tests, f.i. mbimbf.t, the values are quoted, for exactly that reason:

  ok ($x, 1.23);

can go wrong and produce something like:

  Expected: '1.2299999999999', got '1.23'
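[Editor's note: the same trap is easy to reproduce with Python's decimal module (an added illustration): an unquoted literal passes through binary floating point before the arbitrary-precision type ever sees it, so a test that compares against the unquoted form is really comparing against the mangled double:]

```python
from decimal import Decimal

# Quoted: the comparison is against exactly 1.23.
assert Decimal("1.23") == Decimal("1.23")

# Unquoted: 1.23 is first rounded to the nearest double, so the
# comparison is against 1.2299999999999999822... instead.
assert Decimal("1.23") != Decimal(1.23)

print(Decimal(1.23))   # shows the mangled value
```

Quoting the expected values in the test suite keeps the host language's float conversion out of the comparison entirely.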

Cheers,

Tels

PS: No, I don't have an idea how to improve Perl. I just know not to rely on its floating-point math too much ;)


p5pRT commented 22 years ago

From [Unknown Contact. See original ticket]

On Sun, Feb 24, 2002 at 12:00:32PM +0100, Tels wrote:

Who cares what Perl prints? The only purpose of the printed value is to pacify the "unwashed masses", who could otherwise flood the support channels with "Why do I get 1.10000000000000009 when I asked for 1.1?".

I care. Some points:

First, if I implement it so that BigInt takes what your binary prints,

[It makes me very nervous that you use BigInt/BigFloat interchangeably]

then you *will* get questions like:

  Perl prints 1.1, why does BigFloat make out of this 1.1000000009?

I *never* get this question about Math::Pari (which does the correct thing):

  perl -MMath::Pari -wle "print PARI 1.1"
  1.100000000000000088

The real solution is to *NOT* use Perl at all: input the number via :constant or as a string into BigFloat, and never leave it to Perl.

This is not a "real solution", but an ugly hack in the absence of a real solution.

Anything else just calls for trouble. You never know (well, not very predictably) when Perl rounds or loses precision/accuracy, and when not. So, don't do that.

If *you* do not, perldoc perlnumber.

One could use printf '%.Ng' with an appropriate N (given by precision?). But this works only if you know that the double value is "better" than the string one.

Ah, and *how* do you know?

*I* look at the NOK/IOK/POK flags.

which is the right value? Did the user say:

  Math::BigFloat->new(1.23);

or did he say:

  Math::BigFloat->new(1.229999999999999999999);

You should have noted that, as of a couple of days ago, the latter is more or less equivalent to

  Math::BigFloat->new('1.229999999999999999999').

Yes, I could see it: yesterday I looked through Math::BigInt. Sorry for the news, but IMO this is something absolutely horrible...

I *knew* you would say that. I get the feeling you say this about every piece of code not written by you, but that could be just me *puts on flame-proof undies*

Obviously there is some correlation, but I hope not as strong as you suggest.

Three points:

* Patches welcome.

Why? Just use Math::Pari.

Well, maybe some time somebody (you? ;-) will improve it yet more - though the choice of the API hinders improvements a lot...

If I could, I would rip out some of the old API, but I can't due to backward compatibility. Not all of it is my fault.

There is no problem with the old API; just redirect it to a better one. The problem with Math::BigInt, as I see it, is that it does it the *other* way around.

Apart from that: what would you propose? How would you improve the API?

Remove all the APIs; make everything work through overloading (obviously, you would need to rename the guy). *This* is how people use the module. This would allow enormous speedups.

Never use hashes when arrays will do. Never use decimal when binary will do.

If even Perl doesn't know exactly what the real value is, how can BigFloat?

Perl knows the value. But BigFloat ignores it.

Ilya

p5pRT commented 22 years ago

From [Unknown Contact. See original ticket]


Moin,

[Calling for some experts]

On 24-Feb-02 Ilya Zakharevich tried to scribble about:

On Sun, Feb 24, 2002 at 12:00:32PM +0100, Tels wrote:

Who cares what Perl prints? The only purpose of the printed value is to pacify the "unwashed masses", who could otherwise flood the support channels with "Why do I get 1.10000000000000009 when I asked for 1.1?".

I care. Some points:

First, if I implement it so that BigInt takes what your binary prints,

[It makes me very nervous that you use BigInt/BigFloat interchangeably]

Because I have always used the term "BigInt" to refer to the project, since I started with this module. Maybe because it has been eating my life for over a year now. Bad habit, sorry.

then you *will* get questions like:

  Perl prints 1.1, why does BigFloat make out of this 1.1000000009?

I *never* get this question about Math::Pari (which does the correct thing):

  perl -MMath::Pari -wle "print PARI 1.1"
  1.100000000000000088

  te@null:~ > perl -MMath::Pari -wle "print PARI 1.1"
  1.100000000000000088
  te@null:~ > perl -MMath::Pari -wle "print PARI '1.1'"
  1.099999999999999999999999999

The more I think about it, the worse I think this is. For reference:

  te@null:~ > perl -Mbignum -wle 'print 1.1'
  1.1
  te@null:~ > perl -MMath::BigFloat -wle 'print Math::BigFloat->new(1.1)'
  1.1
  te@null:~ > perl -MMath::BigFloat -wle 'print Math::BigFloat->new("1.1")'
  1.1

The real solution is to *NOT* use Perl at all: input the number via :constant or as a string into BigFloat, and never leave it to Perl.

This is not a "real solution", but an ugly hack in the absence of a real solution.

I disagree.

Btw, it is easy to write a subclass of Math::BigFloat that does what you want, although I don't see much point in it.

Anything else just calls for trouble. You never know (well, not very predictably) when Perl rounds or loses precision/accuracy, and when not. So, don't do that.

If *you* do not, perldoc perlnumber.

One could use printf '%.Ng' with an appropriate N (given by precision?). But this works only if you know that the double value is "better" than the string one.

Ah, and *how* do you know?

*I* look at the NOK/IOK/POK flags.

which is the right value? Did the user say:

  Math::BigFloat->new(1.23);

or did he say:

  Math::BigFloat->new(1.229999999999999999999);

You should have noted that, as of a couple of days ago, the latter is more or less equivalent to

  Math::BigFloat->new('1.229999999999999999999').

My point was: floating-point math via float/real/double *always* loses something, since it tries to represent in binary what you input in decimal. So even if BigFloat pulls things out of the variable, you still don't know whether that is what entered it in the first place.

And I don't think it is a good idea for Math::BigFloat to get different results from

  Math::BigFloat->new(1.1);

depending on system, library, hardware, Perl version, OS, and day of the week. Math::BigFloat is meant to be a way of getting the same result on every system, every time. While it might not necessarily be the same result as Perl gets, or the one you expect, it is in itself consistent.

So, could we please have somebody else's opinion, too?

Yes, I could see it: yesterday I looked through Math::BigInt. Sorry for the news, but IMO this is something absolutely horrible...

I *knew* you would say that. I get the feeling you say this about every piece of code not written by you, but that could be just me *puts on flame-proof undies*

Obviously there is some correlation, but I hope not as strong as you suggest.

Heh, maybe you say this about every piece of code ;)

Three points:

* Patches welcome.

Why? Just use Math::Pari.

I think in this case we no longer need to talk. If you think the entire BigInt/BigFloat project is so failed/doomed/horrible/wrong that you suggest everybody use Math::Pari, I have nothing more to say to you.

(Hint: Windows. Compiler. Install.)

Well, maybe some time somebody (you? ;-) will improve it yet more - though the choice of the API hinders improvements a lot...

If I could, I would rip out some of the old API, but I can't due to backward compatibility. Not all of it is my fault.

There is no problem with the old API; just redirect it to a better one. The problem with Math::BigInt, as I see it, is that it does it the *other* way around.

Apart from that: what would you propose? How would you improve the API?

Remove all the APIs; make everything work through overloading (obviously, you would need to rename the guy). *This* is how people use the module. This would allow enormous speedups.

You are aware that *overloaded* math requires copying things around, and that this is slower than a direct method call, which can (sometimes) overwrite the variable in place?

($x = -$x; vs. $x->bneg();)

Apart from that, the overloading is already there. So what more do you suggest it needs? *puzzled*

Never use hashes when arrays will do. Never use decimal when binary will do.

Next you'll be saying "Never use Perl when C will do". I think you are getting the purpose of Math::BigInt and friends *totally* wrong.

I consider it an advantage that it uses hashes, not arrays. And in some cases it is a *huge* advantage that it works in decimal.

I just wanted an *easy*, subclassable Math::BigInt in order to build Math::String. That was a total success (there are quite a couple of subclasses now...). That it got faster, and can be driven by Math::Pari (see Math::BigInt::Pari), is a bonus.

And no, I prefer hashes to arrays, just like I prefer Perl to C, or commented code with long variable names over undocumented code featuring $a and $b.

In the future, please let's keep discussion of Math::BigInt (and BigFloat ;) in general separate from discussion of specific features, okay?

If even Perl doesn't know exactly what the real value is, how can BigFloat?

Perl knows the value. But BigFloat ignores it.

Perl knows *a* value. Not necessarily *the* value. ;)

Cheers,

Tels

-- perl -MDev::Bollocks -e'print Dev::Bollocks->rand(),"\n"' apprehensively e-enable strategic action-items

http://bloodgate.com/perl My current Perl projects PGP key available on http://bloodgate.com/tels.asc or via email


p5pRT commented 22 years ago

From [Unknown Contact. See original ticket]


Moin,

I just noticed (Duh! I wrote it!) that the page

  http://www.bloodgate.com/perl/bigint/benchmarks.html

has only the averages, and you need the full version to see all the numbers:

  http://www.bloodgate.com/perl/bigint/v1.48-v1.49.tgz

The averages are pretty meaningless, since they lump together too different things like 0+0 and 1e10000 + 1e10000. So, please look at the full version, which has all the gory details.

I should have put some warning at the top or something. Sorry.

Cheers,

Tels

PS: Yes, BigFloat's performance is pretty low. I hadn't had much time to optimize it; there were too many bugs/features-to-add/modules-to-do. Stay tuned.


p5pRT commented 22 years ago

From [Unknown Contact. See original ticket]

On Sun, Feb 24, 2002 at 08:49:22PM +0100, Tels wrote:

My point was: floating point math via float/real/double is *always* losing things, since it is trying to represent in binary what you input in decimal.

There is no reason to lose as much as Math::BigFloat does.

Remove all the APIs, make everything work through overloading (obviously, you need to rename the guy). *This* is how people use the module. This would allow enormous speedups.

You are aware that *overloaded* math requires copying things around,

Not more than non-overloaded stuff.

Apart from that, the overload is there. So what do you suggest it needs more?

Speed.

Ilya

p5pRT commented 22 years ago

From [Unknown Contact. See original ticket]

On Sun, Feb 24, 2002 at 09:02:09PM +0100, Tels wrote:

    http://www.bloodgate.com/perl/bigint/benchmarks.html

has only the averages, and you need the full version to see all the numbers:

    http://www.bloodgate.com/perl/bigint/v1.48-v1.49.tgz

The averages are pretty meaningless,

The pages are *absolutely* meaningless now, since there is no explanation of what the numbers mean.

PS: Yes, BigFloat's performance is pretty low.

I did not look into BigFloat; BigInt only.

Ilya

p5pRT commented 22 years ago

From [Unknown Contact. See original ticket]


Moin,

On 25-Feb-02 Ilya Zakharevich tried to scribble about:

On Sun, Feb 24, 2002 at 09:02:09PM +0100, Tels wrote:

    http://www.bloodgate.com/perl/bigint/benchmarks.html

has only the averages, and you need the full version to see all the numbers:

    http://www.bloodgate.com/perl/bigint/v1.48-v1.49.tgz

The averages are pretty meaningless,

The pages are *absolutely* meaningless now, since there is no explanation of what the numbers mean.

I thought that "operations per second" sounds pretty meaningful ;) Please download the full version (the link is embedded in each benchmark); it has all the gory details.

Cheers,

Tels


p5pRT commented 22 years ago

From [Unknown Contact. See original ticket]


Moin,

On 25-Feb-02 Ilya Zakharevich tried to scribble about:

On Sun, Feb 24, 2002 at 08:49:22PM +0100, Tels wrote:

You are aware that *overloaded* math requires copying things around,

Not more than non-overloaded stuff.

  $a = $a + $b;

The overload code does not know that it does not need to preserve $a, so it makes a copy of $a, adds $b, and stuffs the result back into $a.

  $a->badd($b);

saves the copy. It doesn't matter when you can copy fast (like Pari), but it does matter when the copy is slow, as in $copy = [ @$a ]; e.g. it matters for Math::BigInt, unfortunately.

Apart from that, the overload is there. So what do you suggest it needs more?

Speed.

I am probably repeating myself:

Math::BigInt has a fairly high constant overhead (this is probably what you mean by factor 100 vs factor 3).

Any speed improvement you get from optimizing the constant will only help when the operation itself takes little time, like adding 1 and 1 together. When the numbers grow, this becomes meaningless. Example:

  $x = 77777 ** 777777;

takes about 56 minutes on my system. It won't matter whether the overload section or bpow() itself spent 0.00001 seconds more, or not.

Having said this, patches/suggestions to improve the speed for small arguments are always welcome, but reverting the code back to an un-maintainable mess (like using arrays instead of hashes), or any other hacks (like copy&pasting large sections), are not good. The goal of being able to add 1 and 1 together fast is just not worth it. If you need that, use Perl, Math::Pari or whatever. Math::BigInt is for different things.

Or in other words: patches to improve the current algorithms (like a faster mul(), or a faster self-multiplication to improve bpow()) are more important than patches that decrease the constant overhead per operation. Especially if the latter consists of what you suggest, e.g. dropping good software practice to achieve a questionable goal (Math::BigInt is pure Perl and can never be as fast as XS or C code).

I would rather have a correct but slower working module than a theoretically faster but non-existing one, or a faster but un-maintainable version (Math::Fraction comes to mind as an absolute (and broken) mess).

(It is amazing that you get flamed for making a module merely 3 times faster, because someone pulls a number out of thin air like: "I thought it should be 100 times faster")

Cheers.

Te"Excuse my mood, but it is 2:57 am local time..."ls


p5pRT commented 22 years ago

From [Unknown Contact. See original ticket]

On Mon, Feb 25, 2002 at 02:59:28AM +0100, Tels wrote:

You are aware that *overloaded* math requires copying things around,

Not more than non-overloaded stuff.

    $a = $a + $b;

The overload code does not know that it does not need to preserve $a, so it makes a copy of $a, adds $b, and stuffs the result back into $a.

    $a->badd($b);

saves the copy.

Just use

  $a += $b;

Apart from that, the overload is there. So what do you suggest it needs more?

Speed.

I am probably repeating myself​:

Math::BigInt has a fairly high constant overhead

What I see is that Math::BigInt does a lot of method calls. Method calls are *slow*. It uses a lot of hash lookups (with 2 keys!). Hash lookups are *slow*.

Ilya

p5pRT commented 22 years ago

From @nwc10

On Mon, Feb 25, 2002 at 12:54:16AM -0500, Ilya Zakharevich wrote:

What I see is that Math::BigInt does a lot of method calls. Method calls are *slow*. It uses a lot of hash lookups (with 2 keys!). Hash lookups are *slow*.

Definitely. Recent example, in ext/Encode/compile:

  # Do the hash lookup once, rather than once per function call. 4% speedup.
  my $type_func = $encode_types{$type};

Yes, a 4% speedup for the whole script just by pulling that one constant lookup out of a loop and caching the result in a lexical variable.

In Perl there may well be more than one way to do it, from which it seems no great leap to infer that one of the other ways of doing it is faster than the one you first used. What is surprising is *how* much faster it can turn out to be.

Nicholas Clark -- EMCFT http://www.ccl4.org/~nick/CV.html

p5pRT commented 22 years ago

From @schwern

On Mon\, Feb 25\, 2002 at 11​:10​:37AM +0000\, Nicholas Clark wrote​:

On Mon, Feb 25, 2002 at 12:54:16AM -0500, Ilya Zakharevich wrote:

What I see is that Math::BigInt does a lot of method calls. Method calls are *slow*. It uses a lot of hash lookups (with 2 keys!). Hash lookups are *slow*.

*sigh* Hash and method lookups aren't **-->>>SLOW<<<--** they're just a bit slower. You're not going to get massive performance boosts by replacing hash lookups with array lookups and method calls with function calls. At most 15% (down to nothing on certain architectures, like PowerPC), which isn't particularly great, especially given the disruption that sort of structural downgrading can cause.

Definitely. Recent example, in ext/Encode/compile:

  # Do the hash lookup once, rather than once per function call. 4% speedup.
  my $type_func = $encode_types{$type};

Yes, a 4% speedup for the whole script just by pulling that one constant lookup out of a loop and caching the result in a lexical variable.

This is the right sort of optimization: pulling as much work out of a loop as possible. That's independent of whatever the thing being pulled out was. Hash, array, function call, method...

Ilya, if you'd do some performance profiling of Math::Big* and show Tels some concrete hot spots, and maybe a patch or two to speed things up, that would be grand.

Sweeping generalizations don't make code go faster.

--

Michael G. Schwern schwern@pobox.com http://www.pobox.com/~schwern/ Perl Quality Assurance perl-qa@perl.org Kwalitee Is Job One Nature is pissed. http://www.unamerican.com/

p5pRT commented 22 years ago

From [Unknown Contact. See original ticket]

On Mon, Feb 25, 2002 at 11:42:03AM -0500, Michael G Schwern wrote:

*sigh* Hash and method lookups aren't **-->>>SLOW<<<--** they're just a bit slower. You're not going to get massive performance boosts by replacing hash lookups with array lookups and method calls with function calls. At most 15% (down to nothing on certain architectures, like PowerPC), which isn't particularly great, especially given the disruption that sort of structural downgrading can cause.

time perl -wle "@a=(0..39); $n=shift; $x=0; $x += $a[10] while --$n" 5e6
time perl -wle "%a=(0..39); $n=shift; $x=0; $x += $a{10} while --$n" 5e6

This is 2.52s vs 5.66s here. Hardly 15%.

Hashes are, of course, much more *convenient*. However, with named constants as indices, arrays may be close. Especially if only 2 keys are used. ;-)

Ilya, if you'd do some performance profiling of Math::Big* and show Tels some concrete hot spots, and maybe a patch or two to speed things up, that would be grand.

I wanted to do exactly this; this is why I looked into it. But, IMO, the code is hopeless optimization-wise. There are so many mis-designs that finding the hot spots is *very* hard work. And I do not have time for this.

Especially troublesome is the API, which encourages endless method calls.

Ilya

p5pRT commented 22 years ago

From @JohnPeacock

Ilya Zakharevich wrote:

time perl -wle "@a=(0..39); $n=shift; $x=0; $x += $a[10] while --$n" 5e6
time perl -wle "%a=(0..39); $n=shift; $x=0; $x += $a{10} while --$n" 5e6

This is 2.52s vs 5.66s here. Hardly 15%.

Under Cygwin, I get 4.456s vs 5.137s, which is (if I am not mistaken) just about 15%. On a somewhat loaded Linux box, I see 20.46s vs 27.83s, which is more like 36%. So YMMV in a *big* way.

Hashes are, of course, much more *convenient*. However, with named constants as indices, arrays may be close. Especially if only 2 keys are used. ;-)

And arrays are much *less* convenient from the point of view of subclasses, since you have to assign the slots ahead of time. If I were to subclass Math::BigFloat (for example, Math::Currency) and I wanted to store a new object attribute (for example, a format), without the object-as-hash paradigm, where do I put it? Do I create a hash, wrap the Math::BigFloat array object, and then write trivial wrappers for all overloaded ops? This is not a theoretical example, since that module does rely on the hash nature of Math::BigFloat.

John

-- John Peacock Director of Information Research and Technology Rowman & Littlefield Publishing Group 4720 Boston Way Lanham, MD 20706 301-459-3366 x.5010 fax 301-429-5747

p5pRT commented 22 years ago

From @mjdominus

Ilya says:

On Mon, Feb 25, 2002 at 11:42:03AM -0500, Michael G Schwern wrote:

*sigh* Hash and method lookups aren't **-->>>SLOW<<<--** they're just a bit slower. You're not going to get massive performance boosts by replacing hash lookups with array lookups and method calls with function calls. At most 15% (down to nothing on certain architectures, like PowerPC), which isn't particularly great, especially given the disruption that sort of structural downgrading can cause.

time perl -wle "@a=(0..39); $n=shift; $x=0; $x += $a[10] while --$n" 5e6
time perl -wle "%a=(0..39); $n=shift; $x=0; $x += $a{10} while --$n" 5e6

This is 2.52s vs 5.66s here. Hardly 15%.

This exaggerates the problem. Arithmetic sequences of numbers cause an unusual number of hash collisions, and the index 10 that you have chosen is in the fourth of five positions in its linked list.

However, I agree with you in general, and this is one reason why I did not want to inherit from Math::BigInt when I was working on Math::BigRat.

Choosing a hash implementation over an array implementation so that subclasses can say $n->{sign} instead of $n->[SIGN] was a poor design decision, in my opinion, and the choice of interface commits Math::BigInt to using hashes instead of arrays forever.

p5pRT commented 22 years ago

From @schwern

On Mon\, Feb 25\, 2002 at 03​:04​:55PM -0500\, Ilya Zakharevich wrote​:

On Mon\, Feb 25\, 2002 at 11​:42​:03AM -0500\, Michael G Schwern wrote​:

*sigh* Hash and method lookups aren't **-->>>SLOW\<\<\<--** they're just a bit slower. You're not going to get massive performance boosts by replacing hash lookups with array lookups and method calls with function calls. At most 15% (down to nothing on certain architectures\, like PowerPC) which isn't particularly great especially given the disruption that sort of structural downgrading can cause.

time perl -wle "@​a=(0..39); $n=shift; $x=0; $x += $a[10] while --$n" 5e6 time perl -wle "%a=(0..39); $n=shift; $x=0; $x += $a{10} while --$n" 5e6

This is 2.52s vs 5.66s here. Hardly 15%.

I presume your "here" is OS/2? It's totally different on my end:

$ time perl5.6.1 -wle '%a=(0..39); $n=shift; $x=0; $x += $a{10} while --$n' 5e5

real 0m3.534s
user 0m1.660s
sys  0m0.030s

$ time perl5.6.1 -wle '@a=(0..39); $n=shift; $x=0; $x += $a[10] while --$n' 5e5

real 0m3.040s
user 0m1.440s
sys  0m0.050s

where "here" is a G3/266 PowerBook running Debian/PowerPC. Things change again in bleadperl, but I don't happen to have an optimized copy handy.

$ time perl -wle '@a=(0..39); $n=shift; $x=0; $x += $a[10] while --$n' 5e6

real 0m6.586s
user 0m6.110s
sys  0m0.050s

$ time perl -wle '%a=(0..39); $n=shift; $x=0; $x += $a{10} while --$n' 5e6

real 0m7.478s
user 0m7.160s
sys  0m0.030s

That's 5.6.0 on a Red Hat/x86 machine.

I have a small battery of hash benchmarks here (feel free to extend them): http://www.pobox.com/~schwern/src/hashes

I'd like to see what they come out like on an OS/2 machine.

Ilya\, if you'd do some performance profiling of Math​::Big* and show Tels some concrete hot spots and maybe a patch or two to speed things up that would be grand.

I wanted to do exactly this; this is why I looked into it. But, IMO, the code is hopeless optimization-wise. There are so many mis-designs that finding the hot spots is *very* hard work. And I do not have time for this.

Especially troublesome is the API, which encourages endless method calls.

This is particularly amusing given that Tels just sped things up several times with a little profiling and a two-line change.


p5pRT commented 22 years ago

From @JohnPeacock

Mark-Jason Dominus wrote:

Choosing a hash implementation over an array implementation so that subclasses can say $n->{sign} instead of $n->[SIGN] was a poor design decision, in my opinion, and the choice of interface commits Math::BigInt to using hashes instead of arrays forever.

And how does a subclass do $n->[SOMETHING_NEW] such that it does not interfere with other subclasses? Subclasses frequently include new attributes not present in the base class. Should a specific element in the array be defined as a ref to a hash for subclass attributes?

And I think the hash implementation certainly beats the original non-object code, which stored only a normalized string and parsed it *every*single*time* it did anything...

John


p5pRT commented 22 years ago

From [Unknown Contact. See original ticket]

On Mon, Feb 25, 2002 at 03:31:17PM -0500, John Peacock wrote:

Ilya Zakharevich wrote:

time perl -wle "@a=(0..39); $n=shift; $x=0; $x += $a[10] while --$n" 5e6
time perl -wle "%a=(0..39); $n=shift; $x=0; $x += $a{10} while --$n" 5e6

This is 2.52s vs 5.66s here. Hardly 15%.

Under Cygwin, I get 4.456s vs 5.137s, which is (if I am not mistaken) just about 15%. On a somewhat loaded Linux box, I see 20.46s vs 27.83s, which is more like 36%. So YMMV in a *big* way.

I tried it on Sparc9, and see 9.25s vs 27.16s. Something is very fishy on your systems... (5.7.2@14577)

And arrays are much *less* convenient from the point of view of subclasses, since you have to assign the slots ahead of time. If I were to subclass Math::BigFloat (for example, Math::Currency) and I wanted to store a new object attribute (for example, a format), without the object-as-hash paradigm, where do I put it?

Have a SLOT negotiation API. Do it in a BEGIN block, then assign constants appropriately.

Ilya

p5pRT commented 22 years ago

From @mjdominus

Mark-Jason Dominus wrote:

Choosing a hash implementation over an array implementation so that subclasses can say $n->{sign} instead of $n->[SIGN] was a poor design decision, in my opinion, and the choice of interface commits Math::BigInt to using hashes instead of arrays forever.

And how does a subclass do $n->[SOMETHING_NEW] such that it does not interfere with other subclasses?

Is that a serious or a rhetorical question? I can think of at least two different ways to do it, and there are probably others that I have not considered.

And I think the hash implementation certainly beats the original non-object code, which stored only a normalized string and parsed it *every*single*time* it did anything...

I don't think anyone has claimed that Tels picked the *worst* possible implementation, so I'm not sure what relevance I'm supposed to attribute to your observation.

p5pRT commented 22 years ago

From [Unknown Contact. See original ticket]

On Mon, Feb 25, 2002 at 03:31:17PM -0500, John Peacock wrote:

Ilya Zakharevich wrote:

time perl -wle "@a=(0..39); $n=shift; $x=0; $x += $a[10] while --$n" 5e6
time perl -wle "%a=(0..39); $n=shift; $x=0; $x += $a{10} while --$n" 5e6

This is 2.52s vs 5.66s here. Hardly 15%.

Under Cygwin, I get 4.456s vs 5.137s, which is (if I am not mistaken) just about 15%. On a somewhat loaded Linux box, I see 20.46s vs 27.83s, which is more like 36%. So YMMV in a *big* way.

More precise details:

  2.52s vs 5.66s   5.7.2 @14577   gcc 2.8.2   Athlon 850    OS/2
  9s    vs 26s     5.5.30         Sun CC      Sparc9 450?   Sol8

So I get compatible results in quite orthogonal situations...

Ilya

p5pRT commented 22 years ago

From @JohnPeacock

Ilya Zakharevich wrote:

Under Cygwin, I get 4.456s vs 5.137s, which is (if I am not mistaken) just about 15%. On a somewhat loaded Linux box, I see 20.46s vs 27.83s, which is more like 36%. So YMMV in a *big* way.

I tried it on Sparc9, and see 9.25s vs 27.16s. Something is very fishy on your systems... (5.7.2@14577)

I don't warrant a Sparc9; Cygwin is, by definition, somewhat fishy, and the other box is a dual Pentium Pro 200, so I'll admit that one is funky too. I was also running 5.6.1, so any recent performance increases are not there. But I ran it on a more recent machine and still see nothing like your increases.

John


p5pRT commented 22 years ago

From [Unknown Contact. See original ticket]

Ilya Zakharevich writes:

time perl -wle "@a=(0..39); $n=shift; $x=0; $x += $a[10] while --$n" 5e6
time perl -wle "%a=(0..39); $n=shift; $x=0; $x += $a{10} while --$n" 5e6

I get:

  $ time perl -wle '@a=(0..39); $n=shift; $x=0; $x += $a[10] while --$n' 5e6

  real 0m34.03s
  user 0m17.22s
  sys  0m0.01s

  $ time perl -wle '%a=(0..39); $n=shift; $x=0; $x += $a{10} while --$n' 5e6

  real 0m36.85s
  user 0m18.61s
  sys  0m0.02s

on a Sun Ultra 30 with a perl v5.6.0 compiled with Sun WorkShop Compilers 4.2, which is around an 8% difference in user+sys time.

I also get:

  13:50>time p:/bin/perl -wle '%a=(0..39); $n=shift; $x=0; $x += $a{10} while --$n' 5e6

  real 0m 5.38s
  user 0m 5.06s
  sys  0m 0.03s

  13:50>time p:/bin/perl -wle '@a=(0..39); $n=shift; $x=0; $x += $a[10] while --$n' 5e6

  real 0m 4.66s
  user 0m 4.48s
  sys  0m 0.02s

with an ActiveState build 626 perl v5.6.1 on a Gateway 750MHz GP7-750, which is a difference of around 13%.

Mark Leighton Fisher fisherm@tce.com Thomson multimedia, Inc. Indianapolis IN "Display some adaptability." -- Doug Shaftoe, _Cryptonomicon_

p5pRT commented 22 years ago

From [Unknown Contact. See original ticket]

On Wed, Feb 27, 2002 at 01:53:54PM -0500, Fisher Mark wrote:

Ilya Zakharevich writes:

time perl -wle "@a=(0..39); $n=shift; $x=0; $x += $a[10] while --$n" 5e6
time perl -wle "%a=(0..39); $n=shift; $x=0; $x += $a{10} while --$n" 5e6

I get:

$ time perl -wle '@a=(0..39); $n=shift; $x=0; $x += $a[10] while --$n' 5e6

real 0m34.03s
user 0m17.22s
sys  0m0.01s

$ time perl -wle '%a=(0..39); $n=shift; $x=0; $x += $a{10} while --$n' 5e6

real 0m36.85s
user 0m18.61s
sys  0m0.02s

on a Sun Ultra 30 with a perl v5.6.0 compiled with Sun WorkShop Compilers 4.2, which is around an 8% difference in user+sys time.

I get similar results with 5.6.0 (14.68s vs 16.58s). Here 5.6.0 is 50% slower with arrays (w.r.t. 5.005_03), and almost twice as quick with hashes.

Is it precompiling the HASH function at compile time? Who knows why arrays got so much slower with 5.6.0 - or is it a glitch?

Do not have 5.7.2 to check on Solaris.

I also get:

13:50>time p:/bin/perl -wle '%a=(0..39); $n=shift; $x=0; $x += $a{10} while --$n' 5e6

real 0m 5.38s
user 0m 5.06s
sys  0m 0.03s

13:50>time p:/bin/perl -wle '@a=(0..39); $n=shift; $x=0; $x += $a[10] while --$n' 5e6

real 0m 4.66s
user 0m 4.48s
sys  0m 0.02s

with an ActiveState build 626 perl v5.6.1 on a Gateway 750MHz GP7-750, which is a difference of around 13%.

Well, one needs to check the bleeding Perl. On an Athlon 850MHz I get 3.63 vs 2.53, which is significantly quicker than what you get. But maybe Athlons are just so much better...

Hmm, on an Athlon 850MHz:

             %a     @a
  @14577     3.63   2.53
  5.6.1      4.20   3.78
  5.005_53   2.43   8.66
  5.005_03   2.44   9.01

Thus hash access is consistently improving, while array access (or op dispatch?) is significantly slowed down.

Ilya

p5pRT commented 22 years ago

From @schwern

On Wed, Feb 27, 2002 at 02:29:36PM -0500, Ilya Zakharevich wrote:

Hmm, on an Athlon 850MHz:

             %a     @a
  @14577     3.63   2.53
  5.6.1      4.20   3.78
  5.005_53   2.43   8.66
  5.005_03   2.44   9.01

Thus hash access is consistently improving, while array access (or op dispatch?) is significantly slowed down.

Vice-versa?


p5pRT commented 22 years ago

From [Unknown Contact. See original ticket]

On Wed, Feb 27, 2002 at 03:34:20PM -0500, Michael G Schwern wrote:

On Wed, Feb 27, 2002 at 02:29:36PM -0500, Ilya Zakharevich wrote:

Hmm, on an Athlon 850MHz:

             %a     @a
  @14577     3.63   2.53
  5.6.1      4.20   3.78
  5.005_53   2.43   8.66
  5.005_03   2.44   9.01

Thus hash access is consistently improving, while array access (or op dispatch?) is significantly slowed down.

Vice-versa?

Let me redo it: in loop, access to

            $a[10]  $a{10}   $a   |  op    arr    hash
  @14577    2.54    3.62    2.30  |  0.3   0.54   1.62
  5.6.1     3.75    4.21    2.21  |  0.29  1.83   2.29
  5.005_53  2.43    8.66    2.11  |  0.27  0.52   6.82
  5.005_03  2.36    8.89    2.09  |  0.27  0.54   7.07

"op" is the opcode overhead (1/7 of the running time - the loop has 7 opcodes today, and I assume this is true for all versions). arr/hash are pure access times (with 6 opcodes subtracted).

So what people observe when they claim that hashes are as quick as arrays is a bug in 5.6.1 which slowed constant-index array access down 3.5 times. Apparently, this bug is fixed now.

Hope this helps, Ilya

P.S. All the calculations are approximate.

p5pRT commented 22 years ago

From @schwern

On Wed, Feb 27, 2002 at 08:50:07PM -0500, Ilya Zakharevich wrote:

Let me redo it: in loop, access to

            $a[10]  $a{10}   $a   |  op    arr    hash
  @14577    2.54    3.62    2.30  |  0.3   0.54   1.62
  5.6.1     3.75    4.21    2.21  |  0.29  1.83   2.29
  5.005_53  2.43    8.66    2.11  |  0.27  0.52   6.82
  5.005_03  2.36    8.89    2.09  |  0.27  0.54   7.07

"op" is the opcode overhead (1/7 of the running time - the loop has 7 opcodes today, and I assume this is true for all versions). arr/hash are pure access times (with 6 opcodes subtracted).

So what people observe when they claim that hashes are as quick as arrays is a bug in 5.6.1 which slowed constant-index array access down 3.5 times. Apparently, this bug is fixed now.

I have no idea where that jump in array access times is coming from. Maybe it's an OS/2 thing. Did you use the same compiler for all of them? I can't repeat it here on Debian/PowerPC and I've never noticed it on x86. I also have no idea why you're seeing hashes being so slow on 5.005_03.

Like I said, this stuff varies wildly from version to version, compiler to compiler, and architecture to architecture. But I can certainly say that the claim about hash/array speeds wasn't because of the array quirk you're getting on OS/2 (unless there's a silent majority of OS/2 users out there? :)

I think the conclusion here is "OS/2 is weird". :)

Using references vs globals vs lexicals can also affect timings. I also trust you didn't actually use the somewhat magical $a?

Running this battery of hash benchmarks: http://www.pobox.com/~schwern/src/hashes

here's a comparative table between a few versions of Perl.

All scores are relative to the bleadperl array access times. Lower is better. The extra numbers for bleadperl array access are the CPU times. 5.005_03 benchmark times have been adjusted for the control.

There's some jitter in these benchmarks because I didn't feel like rebooting into single user.

(@a and %a are actually @array and %hash, and both are lexical)

                      bleadperl@14897      5.6.1   5.005_03
  $foo = $a[1]        1.00 (2.92 CPU)      0.97    1.02
  $foo = $a[$idx]     1.00 (3.02 CPU)      0.99    1.00
  $foo = $a->[1]      1.00 (3.49 CPU)      0.98    0.98
  $foo = $a->[$idx]   1.00 (3.57 CPU)      0.98    0.99
  $a[1] = 'foo'       1.00 (2.79 CPU)      1.01    1.07
  $a[$idx] = 'foo'    1.00 (2.80 CPU)      1.02    0.94
  $a->[1] = 'foo'     1.00 (3.40 CPU)      0.98    1.06
  $a->[$idx] = 'foo'  1.00 (3.45 CPU)      1.01    1.02

  $foo = $a{1}        1.07                 1.26    1.26
  $foo = $a{$key}     1.21                 1.26    1.24
  $foo = $a->{1}      1.05                 1.25    1.20
  $foo = $a->{$key}   1.22                 1.24    1.21
  $a{1} = 'foo'       1.09                 1.31    1.30
  $a{$key} = 'foo'    1.30                 1.33    1.30
  $a->{1} = 'foo'     1.11                 1.22    1.27
  $a->{$key} = 'foo'  1.26                 1.24    1.26

bleadperl is compiled with -O3, perl's malloc, and gcc 2.95.4. 5.6.1 and 5.005_03 are both compiled -O6 (-O3), normal malloc, and a somewhat earlier version of 2.95.4.

The raw benchmark outputs are attached.


p5pRT commented 22 years ago

From @schwern

Benchmark: timing 2000000 iterations of array_get, array_get_var, array_set, array_set_var, arref_get, arref_get_var, arref_set, arref_set_var, control, hash_get, hash_get_var, hash_set, hash_set_var, href_get, href_get_var, href_set, href_set_var...

array_get:     2 wallclock secs ( 2.92 usr + -0.02 sys =  2.90 CPU) @  689655.17/s (n=2000000)
array_get_var: 4 wallclock secs ( 3.02 usr + -0.03 sys =  2.99 CPU) @  668896.32/s (n=2000000)
array_set:     3 wallclock secs ( 2.79 usr + -0.01 sys =  2.78 CPU) @  719424.46/s (n=2000000)
array_set_var: 2 wallclock secs ( 2.80 usr +  0.02 sys =  2.82 CPU) @  709219.86/s (n=2000000)
arref_get:     3 wallclock secs ( 3.49 usr +  0.00 sys =  3.49 CPU) @  573065.90/s (n=2000000)
arref_get_var: 4 wallclock secs ( 3.57 usr + -0.01 sys =  3.56 CPU) @  561797.75/s (n=2000000)
arref_set:     3 wallclock secs ( 3.40 usr +  0.00 sys =  3.40 CPU) @  588235.29/s (n=2000000)
arref_set_var: 3 wallclock secs ( 3.45 usr +  0.00 sys =  3.45 CPU) @  579710.14/s (n=2000000)
control:      -1 wallclock secs (-0.33 usr +  0.00 sys = -0.33 CPU) @ -6060606.06/s (n=2000000)
               (warning: too few iterations for a reliable count)
hash_get:      3 wallclock secs ( 3.11 usr + -0.02 sys =  3.09 CPU) @  647249.19/s (n=2000000)
hash_get_var:  4 wallclock secs ( 3.64 usr +  0.00 sys =  3.64 CPU) @  549450.55/s (n=2000000)
hash_set:      5 wallclock secs ( 3.04 usr +  0.01 sys =  3.05 CPU) @  655737.70/s (n=2000000)
hash_set_var:  3 wallclock secs ( 3.63 usr +  0.00 sys =  3.63 CPU) @  550964.19/s (n=2000000)
href_get:      3 wallclock secs ( 3.68 usr +  0.00 sys =  3.68 CPU) @  543478.26/s (n=2000000)
href_get_var:  5 wallclock secs ( 4.35 usr +  0.02 sys =  4.37 CPU) @  457665.90/s (n=2000000)
href_set:      4 wallclock secs ( 3.77 usr + -0.03 sys =  3.74 CPU) @  534759.36/s (n=2000000)
href_set_var:  4 wallclock secs ( 4.33 usr +  0.00 sys =  4.33 CPU) @  461893.76/s (n=2000000)

p5pRT commented 22 years ago

From @schwern

Benchmark: timing 2000000 iterations of array_get, array_get_var, array_set, array_set_var, arref_get, arref_get_var, arref_set, arref_set_var, control, hash_get, hash_get_var, hash_set, hash_set_var, href_get, href_get_var, href_set, href_set_var...

        array_get:  4 wallclock secs ( 2.83 usr +  0.00 sys =  2.83 CPU) @ 706713.78/s (n=2000000)
    array_get_var:  4 wallclock secs ( 2.98 usr + -0.01 sys =  2.97 CPU) @ 673400.67/s (n=2000000)
        array_set:  2 wallclock secs ( 2.82 usr + -0.01 sys =  2.81 CPU) @ 711743.77/s (n=2000000)
    array_set_var:  2 wallclock secs ( 2.85 usr +  0.02 sys =  2.87 CPU) @ 696864.11/s (n=2000000)
        arref_get:  3 wallclock secs ( 3.41 usr +  0.00 sys =  3.41 CPU) @ 586510.26/s (n=2000000)
    arref_get_var:  5 wallclock secs ( 3.49 usr + -0.00 sys =  3.49 CPU) @ 573065.90/s (n=2000000)
        arref_set:  4 wallclock secs ( 3.33 usr +  0.00 sys =  3.33 CPU) @ 600600.60/s (n=2000000)
    arref_set_var:  3 wallclock secs ( 3.47 usr +  0.03 sys =  3.50 CPU) @ 571428.57/s (n=2000000)
          control:  0 wallclock secs ( 0.09 usr + -0.00 sys =  0.09 CPU) @ 22222222.22/s (n=2000000)
                    (warning: too few iterations for a reliable count)
         hash_get:  5 wallclock secs ( 3.68 usr + -0.03 sys =  3.65 CPU) @ 547945.21/s (n=2000000)
     hash_get_var:  5 wallclock secs ( 3.80 usr +  0.02 sys =  3.82 CPU) @ 523560.21/s (n=2000000)
         hash_set:  5 wallclock secs ( 3.66 usr +  0.04 sys =  3.70 CPU) @ 540540.54/s (n=2000000)
     hash_set_var:  3 wallclock secs ( 3.71 usr +  0.00 sys =  3.71 CPU) @ 539083.56/s (n=2000000)
         href_get:  4 wallclock secs ( 4.35 usr + -0.02 sys =  4.33 CPU) @ 461893.76/s (n=2000000)
     href_get_var:  5 wallclock secs ( 4.42 usr + -0.02 sys =  4.40 CPU) @ 454545.45/s (n=2000000)
         href_set:  4 wallclock secs ( 4.14 usr +  0.02 sys =  4.16 CPU) @ 480769.23/s (n=2000000)
     href_set_var:  5 wallclock secs ( 4.29 usr +  0.00 sys =  4.29 CPU) @ 466200.47/s (n=2000000)

p5pRT commented 22 years ago

From @schwern

Benchmark: timing 2000000 iterations of array_get, array_get_var, array_set, array_set_var, arref_get, arref_get_var, arref_set, arref_set_var, control, hash_get, hash_get_var, hash_set, hash_set_var, href_get, href_get_var, href_set, href_set_var...

        array_get:  8 wallclock secs ( 8.20 usr +  0.01 sys =  8.21 CPU)
    array_get_var:  9 wallclock secs ( 8.24 usr +  0.01 sys =  8.25 CPU)
        array_set:  9 wallclock secs ( 8.21 usr +  0.02 sys =  8.23 CPU)
    array_set_var:  8 wallclock secs ( 7.85 usr +  0.02 sys =  7.87 CPU)
        arref_get:  8 wallclock secs ( 8.67 usr +  0.00 sys =  8.67 CPU)
    arref_get_var:  9 wallclock secs ( 8.76 usr +  0.01 sys =  8.77 CPU)
        arref_set:  8 wallclock secs ( 8.85 usr +  0.00 sys =  8.85 CPU)
    arref_set_var:  9 wallclock secs ( 8.77 usr + -0.01 sys =  8.76 CPU)
          control:  4 wallclock secs ( 5.22 usr +  0.02 sys =  5.24 CPU)
         hash_get:  8 wallclock secs ( 9.20 usr + -0.01 sys =  9.19 CPU)
     hash_get_var:  8 wallclock secs ( 9.23 usr +  0.00 sys =  9.23 CPU)
         hash_set:  8 wallclock secs ( 9.22 usr +  0.00 sys =  9.22 CPU)
     hash_set_var:  8 wallclock secs ( 9.26 usr +  0.00 sys =  9.26 CPU)
         href_get:  9 wallclock secs ( 9.58 usr +  0.01 sys =  9.59 CPU)
     href_get_var: 10 wallclock secs ( 9.74 usr +  0.01 sys =  9.75 CPU)
         href_set: 11 wallclock secs ( 9.86 usr +  0.03 sys =  9.89 CPU)
     href_set_var: 11 wallclock secs ( 9.90 usr + -0.02 sys =  9.88 CPU)

p5pRT commented 22 years ago

From [Unknown Contact. See original ticket]

On Wed, Feb 27, 2002 at 10:52:08PM -0500, Michael G Schwern wrote:

                $a[10]  $a{10}    $a    |    op    arr   hash
    @14577       2.54    3.62   2.30   |   0.3   0.54   1.62
    5.6.1        3.75    4.21   2.21   |  0.29   1.83   2.29
    5.005_53     2.43    8.66   2.11   |  0.27   0.52   6.82
    5.005_03     2.36    8.89   2.09   |  0.27   0.54   7.07

"op" is the opcode overhead (1/7 of running time - the loop has 7 opcodes today\, and I assume this is true with all versions). arr/hash are pure access times (with 6 opcodes subtracted).

So what people observe when they claim that hashes are as quick as arrays is a bug in 5.6.1 which slowed constant-index array access 3.5 times. Apparently, this bug is fixed now.

I have no idea where that jump in array access times is coming from. Maybe it's an OS/2 thing. Did you use the same compiler for all of them?

Yes.

I can't repeat it here on Debian/PowerPC and I've never noticed it on x86. I also have no idea why you're seeing hashes being so slow on 5.005_03.

Hashes *are* slow, period (at least consistently between gcc2.82/intel and SunCC/sparc). The 5.005_03 slowness is consistent too. IMO, the good speed in the current versions can be explained only by some caching at compile time for constant keys.

Running this battery of hash benchmarks: http://www.pobox.com/~schwern/src/hashes

I do not trust any Perl benchmarks - all I saw were measuring benchmarking overhead. I posted the code I ran:

    time perl -wle '%a=(0..39); $n=shift; $x=0; $x += $a{10} while --$n' 5e6
    time perl -wle '@a=(0..39); $n=shift; $x=0; $x += $a[10] while --$n' 5e6
    time perl -wle '$a=10; $n=shift; $x=0; $x += $a while --$n' 5e6

Ilya

p5pRT commented 22 years ago

From @schwern

On Thu, Feb 28, 2002 at 03:49:38PM -0500, Ilya Zakharevich wrote:

I can't repeat it here on Debian/PowerPC and I've never noticed it on x86. I also have no idea why you're seeing hashes being so slow on 5.005_03.

Hashes *are* slow, period (at least consistently between gcc2.82/intel and SunCC/sparc). The 5.005_03 slowness is consistent too. IMO, the good speed in the current versions can be explained only by some caching at compile time for constant keys.

Running this battery of hash benchmarks: http://www.pobox.com/~schwern/src/hashes

I do not trust any Perl benchmarks - all I saw were measuring benchmarking overhead.

No, they were adjusted using the control (ie. the 5.005_03 results all had 5.24 CPU seconds subtracted from them). The results were so different from yours because...

I posted the code I ran:

    time perl -wle '%a=(0..39); $n=shift; $x=0; $x += $a{10} while --$n' 5e6
    time perl -wle '@a=(0..39); $n=shift; $x=0; $x += $a[10] while --$n' 5e6
    time perl -wle '$a=10; $n=shift; $x=0; $x += $a while --$n' 5e6

...interesting things happen with this when you switch from global to lexical (ie. stick a 'my' in front of the above benchmarks).

             bleadperl   5.6.1   5.005_03
    $a           8.2      7.2      7.0
    my $a        8.2      7.1      7.0
    @a           9.7     13.8      8.4
    my @a       12.4     12.3     10.7
    %a          13.9     15.2     39.0
    my %a       12.0     13.1     34.5

For some reason, global array access got much, much slower in 5.6.1 and was then corrected in bleadperl. Lexical array access got a bit slower and has yet to be corrected.

Comparing lexical arrays vs lexical hashes, you can see there's not much difference from 5.6.1 on (the above is PowerPC, where there's almost no difference; on other platforms arrays pull ahead a bit).

So for some reason lexical hash access is a bit faster than global hash access, and vice-versa for arrays. There may yet be a flaw in lexical array access left over from 5.6.1. I don't know, poke around and see what you can find. I do know that once the pseudo-hash code is removed, hashes are going to get even faster.

If you re-run my hashes benchmark suite using globals instead of lexicals it'll likely jive with the results your one-liners are getting.
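The control adjustment described above (subtracting an empty-loop control run from each measurement) can be done directly with core Benchmark routines; a minimal sketch, with variable names of my own rather than anything from the hashes suite:

```perl
#!/usr/bin/perl
# Subtract a control run from a measurement, as in the adjusted
# 5.005_03 results above; uses only core Benchmark.pm.
use strict;
use warnings;
use Benchmark qw(timeit timediff timestr);

my $n = 200_000;
my %h = (0 .. 39);
my $x = 0;

my $control = timeit($n, sub { });               # loop + sub-call overhead only
my $hash    = timeit($n, sub { $x += $h{10} });  # overhead plus one hash fetch

my $net = timediff($hash, $control);             # net cost of the hash fetches
print 'net: ', timestr($net), "\n";
```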

--

Michael G. Schwern <schwern@pobox.com>  http://www.pobox.com/~schwern/
Perl Quality Assurance <perl-qa@perl.org>  Kwalitee Is Job One
....let me think it over while Cheese beats you with a baseball bat.

p5pRT commented 22 years ago

From [Unknown Contact. See original ticket]

Ilya Zakharevich <ilya@math.ohio-state.edu> writes:

Hashes *are* slow\, period (at least consistently between gcc2.82/intel and SunCC/sparc). The 5.005_03 slowness is consistent too. IMO\, the good speed in the current versions can be explained only by some caching at compile time for constant keys.

Glad something worked ;-) I expect a major part of the win is the shared key enhancement to return PVs pointing at the key in the HE. Then when subsequent lookups happen we notice pointers are equal and skip the strcmp().

-- Nick Ing-Simmons http://www.ni-s.u-net.com/

p5pRT commented 22 years ago

From @nwc10

On Fri, Mar 01, 2002 at 01:31:08AM +0000, Nick Ing-Simmons wrote:

Ilya Zakharevich <ilya@math.ohio-state.edu> writes:

Hashes *are* slow, period (at least consistently between gcc2.82/intel and SunCC/sparc). The 5.005_03 slowness is consistent too. IMO, the good speed in the current versions can be explained only by some caching at compile time for constant keys.

Glad something worked ;-) I expect a major part of the win is the shared key enhancement to return PVs pointing at the key in the HE. Then when subsequent lookups happen we notice pointers are equal and skip the strcmp().

We don't copy the shared hash keys as far as we could. Independent of the other copy-on-write games I experimented with tweaking sv_dup to copy shared key SVs as shared key SVs (rather than making a string copy). There was also something else (hash slices I think) that didn't optimise as much as it could to take advantage of shared hash keys. [it was whatever gets invoked as part of

  %a = (%b, foo => bar)

and should have meant that

  &foo(%hash)

  sub foo {
      my %args = @_;
      ...
  }

would keep the shared hash keys and rebuild %args using them rather than having to re-calculate the hash values from vanilla strings (the output from sv_dup laundering shared hash key SVs)

I hoped that this would give programs using named parameters a slight kick\, but it didn't seem to. ]
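The named-parameter calling convention Nicholas is describing looks like this (a generic illustration with made-up sub and key names, not his patch): the caller's hash is flattened into @_, and the callee rebuilds %args, which is where preserved shared hash keys could in principle avoid re-hashing every key.

```perl
#!/usr/bin/perl
# Passing a hash through @_ flattens it to key/value pairs; the
# callee's "my %args = @_" rebuilds a hash from that list.
use strict;
use warnings;

sub connect_string {
    my %args = @_;                  # rebuilt from the flattened list
    return "$args{host}:$args{port}";
}

my %defaults = (host => 'localhost', port => 5432);

# Later pairs in the list override earlier ones:
print connect_string(%defaults, port => 8080), "\n";   # localhost:8080
```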

However, I was unable to benchmark any speed improvements (or real slowdowns) caused by any of my changes. ["Obviously" they didn't cause any regression tests to fail.] I even tried writing a new test or two for perlbench that should have gone faster with the extra shared hash keys, but it didn't. However, as -O2 vs -O3 makes one of the regexp tests vary from 100 down to 80, I'm really not sure about perlbench.

Nicholas Clark

PS I did send a patch for ... for perl5 to p5p about a year ago, but it
   appears that no-one wants it. :-(
--
Even better than the real thing: http://nms-cgi.sourceforge.net/

p5pRT commented 22 years ago

From [Unknown Contact. See original ticket]

On Fri, Mar 01, 2002 at 01:31:08AM +0000, Nick Ing-Simmons wrote:

Ilya Zakharevich <ilya@math.ohio-state.edu> writes:

Hashes *are* slow, period (at least consistently between gcc2.82/intel and SunCC/sparc). The 5.005_03 slowness is consistent too. IMO, the good speed in the current versions can be explained only by some caching at compile time for constant keys.

Glad something worked ;-) I expect a major part of the win is the shared key enhancement to return PVs pointing at the key in the HE. Then when subsequent lookups happen we notice pointers are equal and skip the strcmp().

I see:

  time perl -wle "%a=(0..39); $a{30}=shift; $x=0; $a{20} += $a{10} while --$a{30}" 5e6
  time perl -wle "@a=(0..39); $a[30]=shift; $x=0; $a[20] += $a[10] while --$a[30]" 5e6

gives a much more reasonable answer: 6.32 sec vs 2.82 sec. So it is nowhere close to 15%; it is above 2x.

Ilya

p5pRT commented 22 years ago

From @schwern

On Sat, Mar 02, 2002 at 06:08:51AM -0500, Ilya Zakharevich wrote:

On Fri, Mar 01, 2002 at 01:31:08AM +0000, Nick Ing-Simmons wrote:

Ilya Zakharevich <ilya@math.ohio-state.edu> writes:

Hashes *are* slow, period (at least consistently between gcc2.82/intel and SunCC/sparc). The 5.005_03 slowness is consistent too. IMO, the good speed in the current versions can be explained only by some caching at compile time for constant keys.

Glad something worked ;-) I expect a major part of the win is the shared key enhancement to return PVs pointing at the key in the HE. Then when subsequent lookups happen we notice pointers are equal and skip the strcmp().

I see:

  time perl -wle "%a=(0..39); $a{30}=shift; $x=0; $a{20} += $a{10} while --$a{30}" 5e6
  time perl -wle "@a=(0..39); $a[30]=shift; $x=0; $a[20] += $a[10] while --$a[30]" 5e6

gives a much more reasonable answer: 6.32 sec vs 2.82 sec. So it is nowhere close to 15%; it is above 2x.

This must have gotten lost in the last email. **---->>> TRY IT WITH LEXICALS <<<----**

Also try it with variable instead of fixed indexes.

The following shows that yes, global arrays with fixed indexes are much faster than hashes (not a surprise) but in all other situations they become slower by almost half while hashes remain pretty much the same.

It plays out like this on an optimized bleadperl@14897 on PowerPC.

                      hash    array
    global,  fixed    22.1      9.9
    lexical, fixed    19.6     17.6
    global,  var      25.7     20.7
    lexical, var      23.4     18.2

This might mean there's something really wrong with lexical arrays. Or that there's something really right about global arrays with fixed indexes. I don't know. But it does show you can't just benchmark things one way. TMTOWTB. :)

Here are the raw results, using variations on your original (I presume the $x=0 is a typo).

$ time bleadperl -wle '%a=(0..39); $a{30}=shift; $a{20} += $a{10} while --$a{30}' 5e6

    real    0m22.723s
    user    0m22.080s
    sys     0m0.160s

$ time bleadperl -wle '@a=(0..39); $a[30]=shift; $a[20] += $a[10] while --$a[30]' 5e6

    real    0m9.989s
    user    0m9.930s
    sys     0m0.020s

$ time bleadperl -wle 'my %a=(0..39); $a{30}=shift; $a{20} += $a{10} while --$a{30}' 5e6

    real    0m19.686s
    user    0m19.610s
    sys     0m0.030s

$ time bleadperl -wle 'my @a=(0..39); $a[30]=shift; $a[20] += $a[10] while --$a[30]' 5e6

    real    0m17.700s
    user    0m17.610s
    sys     0m0.040s

$ time bleadperl -wle 'my($i,$j,$k) = (10,20,30); @a=(0..39); $a[$k]=shift; $a[$j] += $a[$i] while --$a[$k]' 5e6

    real    0m20.713s
    user    0m20.650s
    sys     0m0.050s

$ time bleadperl -wle 'my($i,$j,$k) = (10,20,30); %a=(0..39); $a{$k}=shift; $a{$j} += $a{$i} while --$a{$k}' 5e6

    real    0m26.443s
    user    0m25.720s
    sys     0m0.100s

$ time bleadperl -wle 'my($i,$j,$k) = (10,20,30); my @a=(0..39); $a[$k]=shift; $a[$j] += $a[$i] while --$a[$k]' 5e6

    real    0m19.041s
    user    0m18.240s
    sys     0m0.120s

$ time bleadperl -wle 'my($i,$j,$k) = (10,20,30); my %a=(0..39); $a{$k}=shift; $a{$j} += $a{$i} while --$a{$k}' 5e6

    real    0m23.535s
    user    0m23.400s
    sys     0m0.050s
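The four-way comparison run above as separate one-liners can also be driven from a single script; a sketch using core Benchmark::countit (the case names are mine, and absolute rates will of course differ by machine and perl version):

```perl
#!/usr/bin/perl
# Measure the global/lexical x fixed/variable index matrix in one run.
use strict;
use warnings;
use Benchmark qw(countit);

our %gh = (0 .. 39);   our @ga = (0 .. 39);   # globals
my  %lh = (0 .. 39);   my  @la = (0 .. 39);   # lexicals
my  $i  = 10;                                 # variable index
my  $x  = 0;

my @cases = (
    [ 'global,  fixed' => sub { $x += $gh{10} }, sub { $x += $ga[10] } ],
    [ 'lexical, fixed' => sub { $x += $lh{10} }, sub { $x += $la[10] } ],
    [ 'global,  var'   => sub { $x += $gh{$i} }, sub { $x += $ga[$i] } ],
    [ 'lexical, var'   => sub { $x += $lh{$i} }, sub { $x += $la[$i] } ],
);

printf "%-16s %12s %12s\n", '', 'hash/s', 'array/s';
for my $case (@cases) {
    my ($name, $hash, $array) = @$case;
    my @rate = map {
        my $t = countit(0.2, $_);             # ~0.2 CPU s per cell
        $t->iters / ($t->cpu_a || 1e-9);
    } $hash, $array;
    printf "%-16s %12.0f %12.0f\n", $name, @rate;
}
```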

--

Michael G. Schwern <schwern@pobox.com>  http://www.pobox.com/~schwern/
Perl Quality Assurance <perl-qa@perl.org>  Kwalitee Is Job One
gleam comes to my eyes as I combine pure water and triticale.
  -- mjd

p5pRT commented 22 years ago

From [Unknown Contact. See original ticket]

On Sat, Mar 02, 2002 at 04:55:29PM -0500, Michael G Schwern wrote:

  time perl -wle "%a=(0..39); $a{30}=shift; $x=0; $a{20} += $a{10} while --$a{30}" 5e6
  time perl -wle "@a=(0..39); $a[30]=shift; $x=0; $a[20] += $a[10] while --$a[30]" 5e6

gives a much more reasonable answer: 6.32 sec vs 2.82 sec. So it is nowhere close to 15%; it is above 2x.

This must have gotten lost in the last email. **---->>> TRY IT WITH LEXICALS <<<----**

Why? I have no interest in slowing things down...

Also try it with variable instead of fixed indexes.

What for? We are discussing object fields access.

                      hash    array
    global,  fixed    22.1      9.9
    lexical, fixed    19.6     17.6
                               ^^^^

This is a bug which needs to be fixed ASAP. Apparently, aryelt_fast (or whatever is the name of this opcode) does not have a lexical counterpart.

Ilya

p5pRT commented 22 years ago

From @schwern

On Sun, Mar 03, 2002 at 08:16:57PM -0500, Ilya Zakharevich wrote:

On Sat, Mar 02, 2002 at 04:55:29PM -0500, Michael G Schwern wrote:

  time perl -wle "%a=(0..39); $a{30}=shift; $x=0; $a{20} += $a{10} while --$a{30}" 5e6
  time perl -wle "@a=(0..39); $a[30]=shift; $x=0; $a[20] += $a[10] while --$a[30]" 5e6

gives a much more reasonable answer: 6.32 sec vs 2.82 sec. So it is nowhere close to 15%; it is above 2x.

This must have gotten lost in the last email. **---->>> TRY IT WITH LEXICALS <<<----**

Why? I have no interest in slowing things down...

If you don't look at all the different cases [continued...]

Also try it with variable instead of fixed indexes.

What for? We are discussing object fields access.

Which are usually lexical. :)

Anyhow, I've long since gone past worrying about just that one narrow situation.

    global,  fixed    22.1      9.9
    lexical, fixed    19.6     17.6
                               ^^^^

This is a bug which needs to be fixed ASAP. Apparently, aryelt_fast (or whatever is the name of this opcode) does not have a lexical counterpart.

[...here] you don't find out interesting things like this. :)

--

Michael G. Schwern <schwern@pobox.com>  http://www.pobox.com/~schwern/
Perl Quality Assurance <perl-qa@perl.org>  Kwalitee Is Job One
Good tidings, my native American Indian friend! America will soon again be yours! Please accept 5th Avenue as an initial return!

p5pRT commented 22 years ago

From [Unknown Contact. See original ticket]

On Sun, Mar 03, 2002 at 11:11:03PM -0500, Michael G Schwern wrote:

Also try it with variable instead of fixed indexes.

What for? We are discussing object fields access.

Which are usually lexical. :)

Do you want me to start a c.l.p.* thread "Do not use lexical arrays"? 1/2 ;-)

Ilya

p5pRT commented 21 years ago

From @jhi

A *lot* of interesting speed discussion in the thread... however, the original issue has been fixed by Math::BigInt having the ':constant' export tag, which makes it use the constant-sniffing code, resulting in the expected:

    expect = 1.2345678901234567 or so
    binary = 1.2345678901234567
    perl   = 1.2345678901234567
    bigfl  = 1.2345678901234567
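For reference, the fix in action on the original report's case; a minimal sketch of the ':constant' mechanism as I understand it (the output comment reflects a current Math::BigFloat):

```perl
#!/usr/bin/perl
# With the ':constant' import, every numeric literal in this scope is
# compiled straight into a Math::BigFloat, so the decimal digits never
# pass through a native double first.
use strict;
use warnings;
use Math::BigFloat ':constant';

my $x = 0.2345678901234567;   # a Math::BigFloat object, not an NV
$x++;
print "$x\n";                 # 1.2345678901234567 - no NV round-trip
```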

So I'm marking this particular problem ticket as resolved.

p5pRT commented 21 years ago

@jhi - Status changed from 'open' to 'resolved'