pandas-dev / pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
https://pandas.pydata.org
BSD 3-Clause "New" or "Revised" License

BUG: Rounding timedelta to 3 decimal places does not work correctly and is unstable. #58983

Closed madeinjapan closed 4 months ago

madeinjapan commented 4 months ago

Reproducible Example

from pandas import Timedelta

td = Timedelta('0 days 00:00:00.000500')
td
# Timedelta('0 days 00:00:00.000500')

td.round('ms')
# Timedelta('0 days 00:00:00') <- the round is incorrect

td_new = Timedelta('0 days 00:00:00.000501')    # added .000001
td_new.round('ms')
# Timedelta('0 days 00:00:00.001000')

# With builtin round
round(0.000500, 3)
# 0.001

# Another strange situation with pandas
td_strange = Timedelta('0 days 00:00:00.001500')    # <- change 000500 to 001500
td_strange.round('ms')
# Timedelta('0 days 00:00:00.002000') <- looks OK

# but with
td_strange1 = Timedelta('0 days 00:00:00.002500')    # <- change 001500 to 002500
td_strange1.round('ms')
# Timedelta('0 days 00:00:00.002000') <- looks incorrect

# and with builtin round
round(1.000500, 3)
# 1.0
round(2.000500, 3)
# 2.001

Issue Description

Rounding does not work consistently for values ending in 5. Under the usual rounding rules, a trailing 5 rounds up. pandas does this in some cases and not in others, so the result looks unstable.

Expected Behavior

Correct application of rounding rules, or an alternative solution for fast rounding of large data (> 1 million).

Installed Versions

INSTALLED VERSIONS
------------------
commit : d9cdd2ee5a58015ef6f4d15c7226110c9aab8140
python : 3.11.6.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.22631
machine : AMD64
processor : Intel64 Family 6 Model 186 Stepping 3, GenuineIntel
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : English_United States.1252
pandas : 2.2.2
numpy : 1.25.2
pytz : 2023.3.post1
dateutil : 2.8.2
setuptools : 65.5.0
pip : 23.2.1
Cython : None
pytest : 7.4.2
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : 8.16.1
pandas_datareader : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : None
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : 3.7.2
numba : None
numexpr : 2.10.0
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyreadstat : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : 1.11.1
sqlalchemy : None
tables : 3.9.2
tabulate : None
xarray : None
xlrd : None
zstandard : None
tzdata : 2023.3
qtpy : None
pyqt5 : None

mroeschke commented 4 months ago

Thanks for the issue, but this is the expected behavior. pandas, like Python's built-in round, implements round-half-to-even (a.k.a. banker's rounding): a value exactly halfway between two candidates goes to the even one. Closing as expected behavior.
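
To make the tie rule concrete, here is a small sketch that compares Timedelta.round with the built-in round and with the decimal module on exact halfway values. The names ONE_MS, ties_us, pandas_ms, builtin, half_even and half_up are illustrative only and do not come from the thread.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

from pandas import Timedelta

ONE_MS = Timedelta(milliseconds=1)

# Exact ties, in microseconds: each sits halfway between two whole milliseconds.
ties_us = [500, 1500, 2500, 3500]

# Timedelta.round resolves each tie toward the even millisecond.
pandas_ms = [Timedelta(microseconds=us).round("ms") // ONE_MS for us in ties_us]
print(pandas_ms)  # [0, 2, 2, 4]

# Built-in round does the same for ties that floats represent exactly.
builtin = [round(x) for x in (0.5, 1.5, 2.5, 3.5)]
print(builtin)  # [0, 2, 2, 4]

# Decimal makes the two tie rules explicit for 0.0025 rounded to 3 places.
half_even = Decimal("0.0025").quantize(Decimal("0.001"), rounding=ROUND_HALF_EVEN)
half_up = Decimal("0.0025").quantize(Decimal("0.001"), rounding=ROUND_HALF_UP)
print(half_even, half_up)  # 0.002 0.003
```

Timedelta stores an integer count of nanoseconds, so its ties are exact. Most decimal fractions such as 1.0005 have no exact binary float representation, so the built-in round on those values sees a number slightly above or below the tie, which is why the float results in the report look inconsistent.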

madeinjapan commented 4 months ago

I did not know this. I learned something again. Thanks
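
For readers who do need ties to round up on large data, one possible vectorized workaround is to add half a unit and then floor. This is only a sketch under stated assumptions: round_half_up is a hypothetical helper, not a pandas API, and for negative durations it sends ties toward positive infinity rather than away from zero.

```python
import pandas as pd


def round_half_up(values: pd.Series, freq: str = "ms") -> pd.Series:
    """Round a timedelta Series to `freq`, sending exact ties upward.

    Hypothetical helper: adds half of one `freq` unit, then floors.
    """
    half = pd.Timedelta(1, unit=freq) / 2
    return (values + half).dt.floor(freq)


durations = pd.Series(pd.to_timedelta([499, 500, 501, 1500, 2500], unit="us"))

# Whole milliseconds after rounding, as plain integers.
half_up_ms = (round_half_up(durations) // pd.Timedelta(milliseconds=1)).tolist()
print(half_up_ms)  # [0, 1, 1, 2, 3]

# The default behavior, for comparison: ties go to the even millisecond.
half_even_ms = (durations.dt.round("ms") // pd.Timedelta(milliseconds=1)).tolist()
print(half_even_ms)  # [0, 0, 1, 2, 2]
```

Both paths operate on the whole Series at once, so the helper should scale to millions of rows in the same way as the built-in rounding.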