man-group / ArcticDB

ArcticDB is a high performance, serverless DataFrame database built for the Python Data Science ecosystem.
http://arcticdb.io

Enhancement 1792: Use float64 as the result type for all division operations in the processing pipeline #1794

alexowens90 closed 1 month ago

alexowens90 commented 1 month ago

Reference Issues/PRs

Closes #1792 Fixes #1791

What does this implement or fix?

Previously, dividing integers by integers in the processing pipeline (e.g. with apply) performed integer division. This applied to all three operand combinations: column/column, column/value, and value/column. This differs from the behaviour of Pandas and Polars, which return floating-point results for the / operator. Worse, a zero denominator caused a floating point exception on Linux and an integer overflow exception (#1791) on Windows.

This change makes the result a float64 in all cases. If we want the previous behaviour to be accessible again, we should implement the integer division operator // in Python.
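
To illustrate the difference between the two division semantics, here is a minimal plain-Python sketch. It does not use ArcticDB; `true_divide` is a hypothetical helper written for this example, and the inf/NaN results for zero denominators assume IEEE 754 float64 semantics (as in Pandas), which this PR description does not spell out.

```python
import math


def true_divide(numerator: int, denominator: int) -> float:
    """Hypothetical helper: divide two integers as float64 values would be divided.

    Assumes IEEE 754 semantics for zero denominators (inf or NaN instead of
    an exception). This is an illustration, not ArcticDB's implementation.
    """
    num = float(numerator)
    den = float(denominator)
    if den == 0.0:
        if num == 0.0:
            return math.nan  # 0 / 0 is undefined
        return math.copysign(math.inf, num)  # sign follows the numerator
    return num / den


numerators = [7, -7, 1, 0]
denominators = [2, 2, 0, 0]

# Float result type: every element gets a value, even with a zero denominator.
float_results = [true_divide(a, b) for a, b in zip(numerators, denominators)]

# Integer division cannot represent x / 0, so it has to fail in some way.
# In Python the // operator raises ZeroDivisionError.
try:
    int_results = [a // b for a, b in zip(numerators, denominators)]
except ZeroDivisionError:
    int_results = None

print(float_results)
print(int_results)
```

Note that Python's // floors towards negative infinity, so it is only a stand-in here for the general idea of integer division; the point is that an integer result type has no value to return for a zero denominator, while a float64 result does.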