Bears-R-Us / arkouda

Arkouda (αρκούδα): Interactive Data Analytics at Supercomputing Scale :bear:
Other
250 stars 89 forks source link

Value mismatch with numpy on `int(arr) // -float(arr)` #2689

Open stress-tess opened 1 year ago

stress-tess commented 1 year ago

While working #2688, I updated the float array to include negatives which resulted in the following value mismatch with numpy. I don't see the mismatch at a glance, but I'll dig into it more after all the tests are converted

Value mismatches:         1 / 591
int64(array) // float64(array): np: [-0.000e+00 -1.000e+00 -2.000e+00 -2.000e+00 -3.000e+00 -3.000e+00
 -4.000e+00 -5.000e+00 -5.000e+00 -6.000e+00 -7.000e+00 -8.000e+00
 -8.000e+00 -9.000e+00 -1.000e+01 -1.100e+01 -1.200e+01 -1.300e+01
 -1.500e+01 -1.600e+01 -1.700e+01 -1.900e+01 -2.000e+01 -2.200e+01
 -2.400e+01 -2.600e+01 -2.800e+01 -3.000e+01 -3.300e+01 -3.600e+01
 -3.900e+01 -4.200e+01 -4.600e+01 -5.000e+01 -5.500e+01 -6.000e+01
 -6.700e+01 -7.400e+01 -8.200e+01 -9.200e+01 -1.050e+02 -1.200e+02
 -1.390e+02 -1.640e+02 -1.990e+02 -2.480e+02 -3.260e+02 -4.660e+02
 -7.930e+02 -2.426e+03  2.474e+03  8.410e+02  5.140e+02  3.740e+02
  2.960e+02  2.470e+02  2.130e+02  1.880e+02  1.680e+02  1.530e+02
  1.410e+02  1.310e+02  1.220e+02  1.150e+02  1.090e+02  1.030e+02
  9.800e+01  9.400e+01  9.000e+01  8.700e+01  8.400e+01  8.100e+01
  7.900e+01  7.600e+01  7.400e+01  7.200e+01  7.000e+01  6.900e+01
  6.700e+01  6.600e+01  6.400e+01  6.300e+01  6.200e+01  6.100e+01
  6.000e+01  5.900e+01  5.800e+01  5.700e+01  5.600e+01  5.500e+01
  5.400e+01  5.400e+01  5.300e+01  5.200e+01  5.200e+01  5.100e+01
  5.100e+01  5.000e+01  5.000e+01  4.900e+01]
ak: [-0.000e+00 -1.000e+00 -2.000e+00 -2.000e+00 -3.000e+00 -3.000e+00
 -4.000e+00 -5.000e+00 -5.000e+00 -6.000e+00 -7.000e+00 -8.000e+00
 -8.000e+00 -9.000e+00 -1.000e+01 -1.100e+01 -1.200e+01 -1.300e+01
 -1.500e+01 -1.600e+01 -1.700e+01 -1.900e+01 -2.000e+01 -2.200e+01
 -2.400e+01 -2.600e+01 -2.800e+01 -3.000e+01 -3.300e+01 -3.600e+01
 -3.900e+01 -4.200e+01 -4.600e+01 -5.000e+01 -5.500e+01 -6.000e+01
 -6.600e+01 -7.400e+01 -8.200e+01 -9.200e+01 -1.050e+02 -1.200e+02
 -1.390e+02 -1.640e+02 -1.990e+02 -2.480e+02 -3.260e+02 -4.660e+02
 -7.930e+02 -2.426e+03  2.474e+03  8.410e+02  5.140e+02  3.740e+02
  2.960e+02  2.470e+02  2.130e+02  1.880e+02  1.680e+02  1.530e+02
  1.410e+02  1.310e+02  1.220e+02  1.150e+02  1.090e+02  1.030e+02
  9.800e+01  9.400e+01  9.000e+01  8.700e+01  8.400e+01  8.100e+01
  7.900e+01  7.600e+01  7.400e+01  7.200e+01  7.000e+01  6.900e+01
  6.700e+01  6.600e+01  6.400e+01  6.300e+01  6.200e+01  6.100e+01
  6.000e+01  5.900e+01  5.800e+01  5.700e+01  5.600e+01  5.500e+01
  5.400e+01  5.400e+01  5.300e+01  5.200e+01  5.200e+01  5.100e+01
  5.100e+01  5.000e+01  5.000e+01  4.900e+01]
stress-tess commented 1 year ago

wow, so these only differ in one spot. Index 36

Numpy has -67 and arkouda has -66

stress-tess commented 1 year ago

The fraction represented by the linspace at position 36 is -6/11, so this ends up being 36 // (-6/11) which seems like it would divide evenly to 66, but the floor division gives 67. Python and numpy agree on this

>>> 36 / (-6/11)
-66.0

>>> 36 // (-6/11)
-67.0

>>> 36 // ak.array([-0.5454545454545454])
array([-66])

So it makes sense we are getting this wrong because we do https://github.com/Bears-R-Us/arkouda/blob/e574712ed9058f4ef51549aba06748c8de0bee4c/src/BinOp.chpl#L29

stress-tess commented 1 year ago

This seems to be a numeric precision issue when dividing by an infinite decimal. I found this much simpler example which shows this:

>>> 1 // (-1/3)
-4.0

>>> 1 / (-1/3)
-3.0

I guess we just need to figure out what the c function used for integer division and use that

stress-tess commented 1 year ago

So according to this stack overflow answer, it's not really floor divide but round towards zero. I'm hoping updating the one line to be trunc(numerator/denom); will do the trick