I partially agree. The nsum-like syntax could be removed from fsum, and nsum could be given support for finite sums.
However, there is a use for a standalone fsum(<iterable>), because it has significantly less overhead than nsum. This function could also be optimized to be even faster than sum() by avoiding temporary normalizations.
Another catch is that nsum has slightly different semantics. It discards the tail of a series when the terms are small enough, and could be expected to do the same for monotonically decreasing finite sums. In effect, nsum should be able to try computing nsum(f, [a, inf]) - nsum(f, [b+1, inf]).
Original comment by fredrik....@gmail.com
on 11 Dec 2008 at 4:57
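As an aside, a minimal sketch of that tail-difference idea with present-day mpmath; the function f below is just an example, and the approach assumes the terms decay fast enough for both infinite tails to converge:

from mpmath import nsum, inf, mpf

f = lambda k: mpf(1)/k**2
a, b = 1, 10
# finite sum obtained as the difference of two convergent infinite tails
tail_diff = nsum(f, [a, inf]) - nsum(f, [b + 1, inf])
direct = sum(f(k) for k in range(a, b + 1))
print(tail_diff, direct)   # both approximately 1.5497677311665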
Ok. But at least nsum(f, [a, b]) should just work.
For speed, there could be a finite=True flag for nsum, so we could drop fsum.
BTW, for the sum of a list it's numerically better to sort it before summing (small values first).
Original comment by Vinzent.Steinberg@gmail.com
on 11 Dec 2008 at 5:22
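As an aside, the effect behind the "small values first" advice is easiest to see with ordinary double-precision floats rather than mpf values; the numbers below are purely illustrative:

vals = [1e16] + [1.0] * 1000
print(sum(vals) - 1e16)          # 0.0: each 1.0 is absorbed by the large running total
print(sum(sorted(vals)) - 1e16)  # 1000.0: the small terms accumulate before the big one is added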
> we could drop fsum
I don't think so. One reason being compatibility with the fsum function in the standard math library.
> BTW, for the sum of a list it's numerically better to sort it before summing (small values first).
Sorting is much more expensive than calculating the sum in the first place, though.
Avoiding temporary normalization/rounding simultaneously improves speed and gives full accuracy.
Original comment by fredrik....@gmail.com
on 11 Dec 2008 at 5:30
How would this give full precision if you start with the largest value?
Original comment by Vinzent.Steinberg@gmail.com
on 11 Dec 2008 at 5:40
If no rounding is performed, the result is exact by definition :)
Original comment by fredrik....@gmail.com
on 11 Dec 2008 at 5:43
This way 10**6 + 10**-6 would eat a lot of memory if you adapt the precision, no?
Original comment by Vinzent.Steinberg@gmail.com
on 11 Dec 2008 at 5:47
Yes, in practice one has to set an upper threshold for the permitted precision.
Original comment by fredrik....@gmail.com
on 11 Dec 2008 at 5:51
So it's not exact. :)
And you could get a less accurate result when summing big values first.
Original comment by Vinzent.Steinberg@gmail.com
on 11 Dec 2008 at 5:55
Well, you can still get exactness without losing speed by summing into "buckets" for steps of exponent sizes. Then in effect you only have to sort a (typically short) list of reference exponents. But the implementation complexity increases.
I think the "sum in reverse order" trick is mostly irrelevant for multiprecision arithmetic anyway. As I understand it, it only helps against rounding errors, which can be circumvented by adding log2(len(terms)) bits of temporary precision. It doesn't help with cancellation, as when calculating 10**6 - 10**6 + 10**-6.
Original comment by fredrik....@gmail.com
on 11 Dec 2008 at 6:08
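A minimal sketch of the temporary-extra-precision idea mentioned above; the helper name and the exact guard-bit count are illustrative assumptions, not code from the thread:

from math import ceil, log
from mpmath import mp, mpf

def sum_with_guard_bits(terms):
    # Sum with ceil(log2(n)) extra working bits so that the accumulated
    # rounding error of the n additions stays at the level of the final rounding.
    terms = list(terms)
    extra = int(ceil(log(max(len(terms), 2), 2)))
    mp.prec += extra
    try:
        s = mpf(0)
        for t in terms:
            s += t
    finally:
        mp.prec -= extra
    return +s   # unary plus rounds the result to the restored precision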
Here is a prototype implementation of a fast fsum(). If L = [mpf(1)/k for k in range(1,100)], fsum(L) is 4.5x faster than sum(L). So this might speed up code that adds a lot of terms, e.g. matrix multiplication or quadrature.
At the cost of some spaghetti code explosion, it could be extended to handle more cases, e.g. also doing complex numbers inline.
Original comment by fredrik....@gmail.com
on 11 Dec 2008 at 6:18
Attachments:
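The attached prototype is not reproduced here; the following is only a rough sketch of the underlying idea (accumulate an exact integer mantissa and round once at the end), with an illustrative function name and no handling of special values:

from mpmath import mpf

def fsum_sketch(terms):
    # Sum mpf-convertible terms without intermediate rounding.
    # Sketch only: assumes finite, regular values (no inf/nan).
    total_man = 0     # signed integer mantissa of the running sum
    total_exp = 0     # binary exponent shared by total_man
    for t in terms:
        sign, man, exp, bc = mpf(t)._mpf_   # raw (sign, mantissa, exponent, bitcount)
        if not man:
            continue                        # skip exact zeros
        if sign:
            man = -man
        if not total_man:
            total_man, total_exp = man, exp
        elif exp >= total_exp:
            total_man += man << (exp - total_exp)
        else:
            total_man = (total_man << (total_exp - exp)) + man
            total_exp = exp
    # mpf((m, e)) constructs m * 2**e, rounded once to the working precision
    return mpf((total_man, total_exp))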
It is almost 6x faster with psyco, btw. I should note that I have not tested the code except for that one case.
Original comment by fredrik....@gmail.com
on 11 Dec 2008 at 6:21
Nice, it will speed up matrix norms and pivoting a lot. Matrix multiplication is at the moment an element-wise loop, I think; it should probably be rewritten to use generators instead.
To be honest, I don't trust the "sum in reverse order" trick. I'd have to see it performing better in a real-life example.
Original comment by Vinzent.Steinberg@gmail.com
on 11 Dec 2008 at 6:42
In fsum.py fzero should be replaced by MP_ZERO.
fsum can be made a bit faster in the case of a list of mpf; see e.g. the attached file, which was derived from mpf_dot in matrix.py in issue 46.
mpf_dot would also speed up matrix multiplication, as mentioned in a comment in matrices.py.
Original comment by mario.pe...@gmail.com
on 12 Dec 2008 at 10:20
Attachments:
Hopefully fixed in r827.
I had to make some further changes to avoid bugs when the bitcounts are highly variable.
fsum should support non-mpmathified arguments with no difference in semantics, and also needs to handle complex numbers, so the overhead elimination in add3 can't be used. The present version is a bit slower. But when dealing with pure mpf instances, one can use the mpf_sum function directly, which is as fast as add3.
>>> from mpmath import *
>>> for n in [1,2,3,10,100,1000]:
... v = linspace(0,1,n)
... print n, timing(sum, v) / timing(fsum, v)
...
1 0.759206798867
2 1.05194805195
3 1.50458715596
10 2.67509363296
100 4.82815702349
1000 5.3417514998
Original comment by fredrik....@gmail.com
on 10 Jan 2009 at 1:24
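For reference, this is roughly how the low-level path can be used on raw mpf data; the import path and signature shown are those of current mpmath and may differ from r827:

from mpmath import mp, mpf
from mpmath.libmp import mpf_sum   # low-level sum over raw mpf tuples

v = [mpf(1)/k for k in range(1, 100)]
raw = mpf_sum([x._mpf_ for x in v], mp.prec)   # returns a raw mpf tuple
print(mpf(raw))                                # wrap the result back into an mpf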
Thank you, I just modified matrices.py and linalg.py to use fsum().
>>> nsum(lambda n: 1./n, (1, 10))
mpf('2.9289682539682538')
I think this is fixed now.
Original comment by Vinzent.Steinberg@gmail.com
on 10 Jan 2009 at 5:43
Original issue reported on code.google.com by
Vinzent.Steinberg@gmail.com
on 11 Dec 2008 at 4:44