Pandas version checks
[X] I have checked that this issue has not already been reported.
[X] I have confirmed this bug exists on the latest version of pandas.
[ ] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd

x = pd.Timestamp("20220101")  # second resolution, stored as datetime64[s]
y = pd.Timestamp("2022-01-01T00:00:00.000000000")  # nanosecond resolution, datetime64[ns]
df = pd.DataFrame({"x": x, "y": y}, index=list(range(5)))

# Round-trip the frame through an HDF5 store in table format
store = pd.HDFStore("store.h5", "w")
store.put("data", df, format="table")
df2 = store.get("data")
store.close()

print(df.equals(df2))  # False on pandas 2.2.2; expected True
Issue Description
df.equals(df2) returns False; it should return True.
When a DataFrame containing a datetime64[s] column (here, column "x") is saved to an HDFStore and read back, the values change.
Specifically, the retrieved values are 10^9 times smaller than the originals.
It looks like HDFStore assumes that all datetime data is in nanoseconds (datetime64[ns]), so the stored second counts are read back as nanosecond counts.
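The factor of 10^9 is what you would get if a count of seconds were reinterpreted as a count of nanoseconds. The following is a minimal sketch of that arithmetic using NumPy only; it does not touch HDFStore and is not taken from the pandas source, so treat it as an illustration of the suspected misread and not as a confirmed root cause.

```python
import numpy as np

# One timestamp at second resolution, like column "x" in the example.
seconds = np.array(["2022-01-01T00:00:00"], dtype="datetime64[s]")

# The underlying int64 payload is the number of seconds since the epoch.
raw = seconds.view("int64")
print(raw[0])  # 1640995200

# Reinterpret the same integers as nanoseconds (the suspected misread).
misread = raw.view("datetime64[ns]")
print(misread[0])  # about 1.64 seconds after 1970-01-01

# A unit-aware conversion keeps the instant and rescales the integer.
correct = seconds.astype("datetime64[ns]")
print(correct[0])

# The two integer payloads differ by exactly 10**9.
ratio = int(correct.view("int64")[0] // misread.view("int64")[0])
print(ratio)  # 1000000000
```

The misread value lands in January 1970, which matches the "10^9 times smaller" symptom described above.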
Expected Behavior
df.equals(df2) returns True.
The same example works as expected with pandas 1.5.3.
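If the nanosecond assumption above is right, one possible workaround is to cast every datetime column to datetime64[ns] before writing. This is an assumption and has not been verified against HDFStore in this report; the names `datetime_cols` and `df_ns` are introduced here for illustration only.

```python
import pandas as pd

x = pd.Timestamp("20220101")
y = pd.Timestamp("2022-01-01T00:00:00.000000000")
df = pd.DataFrame({"x": x, "y": y}, index=list(range(5)))

# Collect the timezone-naive datetime columns, whatever their resolution.
datetime_cols = [c for c in df.columns if pd.api.types.is_datetime64_dtype(df[c])]

# Hypothetical workaround: normalize them to nanosecond resolution
# before handing the frame to store.put(...).
df_ns = df.astype({c: "datetime64[ns]" for c in datetime_cols})

print(df_ns.dtypes)
```

With this approach the round-trip comparison would be made against `df_ns`, since `df` itself may still hold a datetime64[s] column and `equals` also compares dtypes.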
Installed Versions
INSTALLED VERSIONS
commit : d9cdd2ee5a58015ef6f4d15c7226110c9aab8140
python : 3.10.11.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.22631
machine : AMD64
processor : Intel64 Family 6 Model 140 Stepping 1, GenuineIntel
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : English_United States.1252
pandas : 2.2.2
numpy : 1.26.4
pytz : 2024.1
dateutil : 2.9.0.post0
setuptools : 70.1.1
pip : 24.1
Cython : None
pytest : 8.2.2
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.1.4
IPython : 8.25.0
pandas_datareader : None
adbc-driver-postgresql : None
adbc-driver-sqlite : None
bs4 : 4.12.3
bottleneck : 1.4.0
dataframe-api-compat : None
fastparquet : 2024.5.0
fsspec : 2024.6.0
gcsfs : None
matplotlib : None
numba : None
numexpr : 2.10.1
odfpy : None
openpyxl : 3.1.0
pandas_gbq : None
pyarrow : 16.1.0
pyreadstat : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : 3.9.2
tabulate : None
xarray : None
xlrd : 2.0.1
zstandard : None
tzdata : 2024.1
qtpy : None
pyqt5 : None