Open asishm opened 4 days ago
Looking at it a bit more, isinstance(pd.Timestamp(...), datetime) returns True, and the same goes for isinstance(pd.Timedelta(...), timedelta), so maybe it's moot.
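A quick way to check the subclass relationship mentioned above. This is only a sketch with arbitrary example values; it shows that the isinstance checks pass while the exact types still differ, which is what a caller inspecting type(...) would notice.

```python
from datetime import datetime, timedelta

import pandas as pd

# Arbitrary example values, chosen only for illustration.
ts = pd.Timestamp("2024-03-10 12:00:00")
td = pd.Timedelta(hours=1, minutes=30)

# pd.Timestamp subclasses datetime.datetime and pd.Timedelta subclasses
# datetime.timedelta, so isinstance checks against the stdlib types pass.
is_datetime = isinstance(ts, datetime)
is_timedelta = isinstance(td, timedelta)

# The exact types are still different, so code comparing type(x) directly
# sees a difference between engines.
same_datetime_type = type(ts) is datetime
same_timedelta_type = type(td) is timedelta

print(is_datetime, is_timedelta)
print(same_datetime_type, same_timedelta_type)
```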
I don't see a very strong reason to prefer one over the other. When doing inference for constructors, we store these as datetime.date objects. E.g.
import datetime
import pandas as pd
print(type(pd.Series(datetime.date(2024, 3, 10)).iloc[0]))
# <class 'datetime.date'>
In addition, openpyxl has been around much longer than calamine. Both of these suggest to me we should return the Python objects.
Pandas version checks
[X] I have checked that this issue has not already been reported.
[X] I have confirmed this bug exists on the latest version of pandas.
[X] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
Issue Description
When reading an Excel file that has mixed formats with the calamine engine, the result is an object dtype column whose values are pandas._libs.tslibs.timestamps.Timestamp and pandas._libs.tslibs.timedeltas.Timedelta, whereas with the openpyxl engine these values are datetime.datetime and datetime.timedelta instead.
The difference seems to be because the calamine implementation explicitly sets pd.Timestamp/pd.Timedelta data types (https://github.com/pandas-dev/pandas/blob/79067a76adc448d17210f2cf4a858b0eb853be4c/pandas/io/excel/_calamine.py#L109-L115), whereas the openpyxl implementation uses whatever the library returns.
Expected Behavior
The individual item data types should match between the two engines.
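Until the engines agree, a caller could normalize the cell values themselves. The sketch below is not what either engine does internally: to_python_temporal is a hypothetical helper name, and the cells list is made-up data standing in for the values of a mixed object column. It relies only on Timestamp.to_pydatetime() and Timedelta.to_pytimedelta().

```python
from datetime import datetime, timedelta

import pandas as pd


def to_python_temporal(value):
    """Hypothetical helper: convert pandas scalars to stdlib equivalents."""
    if isinstance(value, pd.Timestamp):
        return value.to_pydatetime()
    if isinstance(value, pd.Timedelta):
        return value.to_pytimedelta()
    # Anything else (strings, numbers, ...) is passed through unchanged.
    return value


# Made-up stand-in for the values of a mixed-format object column.
cells = [pd.Timestamp("2024-03-10 08:30:00"), pd.Timedelta(hours=1), "text", 1.5]

normalized = [to_python_temporal(v) for v in cells]

for before, after in zip(cells, normalized):
    print(type(before).__name__, "->", type(after).__name__)
```

Values with nanosecond precision lose those nanoseconds in the conversion, since the stdlib types only go down to microseconds.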
Installed Versions