problem with summing large array of float32 (migrated from Trac #924)

thouis commented 12 years ago

Original ticket http://projects.scipy.org/numpy/ticket/924 Reported 2008-10-03 by trac user emil, assigned to unknown.

In medical imaging large arrays of single floats are often used. In summing over such an array I discovered the following problem: a=ones([840,2200,60],float32) sum(a.flatten()) yields 16777216 (4096^2) ???

However, sum(sum(sum(a,axis=0),axis=0),axis=0) yields 110880000 the correct result. The problem doesn't occur with float64 dot(a.flatten() , a.flatten() ) also gives 16777216

I don't think the problem lies only with numpy. I tried to circumvent the problem by writing my own summation in fortran (gfortran) and using f2py. The following fortran snippet gave me also incorrect results

   Real*4 sum

   sum = 0.

   do i = 1,840
         do j = 1,2200
                do k = 1,60
                        sum = sum +1.0
                end do
          end do
   end do

sum ends up being 16777216 again! the incorrect result! on the other hand the following works: Real*4 sum1, sum

   sum = 0.

   do i = 1,840
         sum1 = 0.
         do j = 1,2200
                do k = 1,60
                        sum1 = sum1 +1.0
                end do
          end do
         sum = sum+sum1
   end do

The same problem seems to come up in matlab.

My computer has amd64's (opteron) and is running suse linux

thouis commented 12 years ago

Comment in Trac by atmention:rkern, 2008-10-03

Yup, that's floating point arithmetic for you. At 32-bit precision, adding 1 to 16777216 gives you 16777216 back again because 16777217 cannot be represented at 32-bit precision.

In [1]: from numpy import *

In [2]: x = float32(16777216)

In [3]: one = float32(1.0)

In [4]: x
Out[4]: 16777216.0

In [5]: x + one
Out[5]: 16777216.0

With the multiple sums across each axis, you are accumulating separate larger numbers then adding them together. Fortunately, the intermediates and the final result are multiples of reasonable powers of 2, so you can afford to drop off some of the low bits. If the final result were something like 110880003, that number cannot even be represented in 32-bit precision.

You can use {{{sum(a, dtype=float64)}}} to use a double-precision accumulator for the sum and thus be able to be more accurate for larger inputs.

thouis commented 12 years ago

Comment in Trac by trac user emil, 2008-10-03

Thanks for clarifying the issue, I should have realized that it was round-off error especially after I went to fortran.

But I still would ask that a change be made so that accumulators for sum, dot, or other similar functions by default be float64 (note that dot doesn't seem to have the option to change the type of the accumulator).

Here's why: For medical imaging, we use large arrays of single-precision to save space. These large arrays are not sparse, and each entry has similar values (for example x-ray attenuation coefficients will not vary greatly over a volume). There are processing operations used in computer-aided diagnosis that involve summing the array or dotting the array with another. The error introduced by having a single-precision accumulator can be large as I found out. As a user of a high-level package such as matlab or numpy, one generally doesn't expect this kind of error with summing over values that do not alternate in sign. I have checked with my colleagues on the floor, and nobody suspected this problem. Although they understand it, when I explain what is happening. Some are wondering now if they have an error in previous work.

The above example with the sum result being 110880003: If the sum result is 110880003, and I get 110880000, I'm happy, because the answer I got is reasonable for float32.

My above example is extreme. Summing over a float32 array (840,2200,60) of ones yields 16,777,216 instead of 110,880,000 . This answer is way off, so I would probably suspect something is wrong.

The problem, however, is that this type of error can lead to computational errors on the order of a few percent, which is inaccurate enough to cause problems, but not inaccurate enough for the problem to be easily detected. In fact, the situation were I caught the problem, the sum result was only off by 4% .

thouis commented 12 years ago

Comment in Trac by atmention:charris, 2008-10-03

This issue was extensively discussed on the numpy-dev list back in the day. If you want to raise the issue again, that is where to do it. Please don't reopen the ticket just to express a difference of opinion.

thouis / numpy-trac-migration

problem with summing large array of float32 (migrated from Trac #924) #1531