flintlib / arb

Arb has been merged into FLINT -- use https://github.com/flintlib/flint/ instead
http://arblib.org/
GNU Lesser General Public License v2.1
456 stars 137 forks source link

add profiling code for arb_const_pi #387

Closed tthsqe12 closed 2 years ago

tthsqe12 commented 2 years ago

This is a fun exercise in pushing the pi code. It is fairly careful crafted, but it still may or may not work or be correct.

fredrik-johansson commented 2 years ago

It is fairly careful crafted, but it still may or may not work or be correct.

What parts are sketchy?

It's missing an error bound, right? (But that is easy to add.)

fredrik-johansson commented 2 years ago

This looks really good. I can take it as a birthday present? :-)

My only concern is that the memory usage is higher. For 100M digits and one thread, the resident peak is 1.2x higher, virtual is 2.2x higher. Two threads, 1.7x and 2.7x respectively. Can we do anything to save memory?

(It's also little bit slower for prec <= 4096, but that's fine since we read pi from a static table there anyway.)

tthsqe12 commented 2 years ago

Can confirm the memory usage is high. BTW, why do you have 71 CI jobs? :)

tthsqe12 commented 2 years ago

memory usage is a bit better now, although I don't know how it fares now at low precision. a typical run of pi_here (first) vs arb_const_pi_chudnovsky_eval (second)

virt/peak/res/peak(MB): 21.12 336.54 6.12 271.90
virt/peak/res/peak(MB): 20.96 262.55 6.79 248.13

the second number in the first row was ~572.94 before for me.

fredrik-johansson commented 2 years ago

Fabrice Bellard claims to have some ways to save memory in https://bellard.org/pi/pi2700e9/pipcrecord.pdf - I presume you already read this?

For speed, there is also the trick of precomputing a table of powers of the argument of the hypergeometric series. I've used this for a few other binary splitting functions and it gives a small speedup (not sure how significant here).

tthsqe12 commented 2 years ago

It has one fewer bug now (ui_factor_mulpow_inplace), and the majority of the code is reusable general-purpose parts now. Not sure what the point of precomputing some powers of the argument is as they are walking around factored anyways. Maybe we can use the fact that the final denominator is known...

fredrik-johansson commented 2 years ago

You no longer want to work on this?