Open benmatwil opened 4 years ago
A slightly tweaked version of the original report:
import pandas as pd
s1 = pd.Series([], dtype='datetime64[ns, UTC]')
df1 = pd.concat([s1, s1], axis=1) # empty dataframe of two columns of datetime dtype
# want the minimum of the two datetime columns element wise
min_dates_1 = df1.min(axis=1) # output series is of dtype float64
print(min_dates_1.dtype)
s2 = pd.Series(pd.to_datetime([f'2020-01-0{i}' for i in range(1, 10)], utc=True))
df2 = pd.concat([s2, s2], axis=1) # non-empty dataframe of two columns of datetime dtype
min_dates_2 = df2.min(axis=1) # output series is of dtype datetime64[ns, UTC]
print(min_dates_2.dtype)
Also - I still get this on the master version of pandas.
Code Sample, a copy-pastable example if possible
Problem description
When the dataframe is empty, calling .min(axis=1) outputs a series of dtype float64 even though all columns are of dtype datetime64. Shouldn't the dtypes be consistent, so that min_dates is output with dtype datetime64[ns, UTC]?
This is only an issue when the dataframe is empty. If the dataframe is not empty, the dtype is conserved and min_dates is output with dtype datetime64[ns, UTC].
Output of pd.show_versions()
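
Until the empty-frame case from the problem description is handled consistently, a possible workaround is sketched below. It assumes every column shares a single dtype, and the helper name rowwise_min_keep_dtype is hypothetical, not part of pandas. For an empty frame it skips the reduction and builds an empty series of the common column dtype directly.

```python
import pandas as pd


def rowwise_min_keep_dtype(df):
    """Row-wise minimum that keeps the common column dtype for empty frames.

    Hypothetical helper, not a pandas API. Assumes all columns share one dtype.
    """
    if df.empty and df.dtypes.nunique() == 1:
        # Nothing to reduce: return an empty series with the columns' dtype
        # instead of relying on the dtype that the reduction would pick.
        return pd.Series([], index=df.index, dtype=df.dtypes.iloc[0])
    return df.min(axis=1)


# Empty frame of two tz-aware datetime columns, as in the report.
s1 = pd.Series([], dtype='datetime64[ns, UTC]')
df1 = pd.concat([s1, s1], axis=1)
empty_result = rowwise_min_keep_dtype(df1)
print(empty_result.dtype)

# Non-empty frame: falls through to the ordinary reduction.
s2 = pd.Series(pd.to_datetime([f'2020-01-0{i}' for i in range(1, 10)], utc=True))
df2 = pd.concat([s2, s2], axis=1)
full_result = rowwise_min_keep_dtype(df2)
print(full_result.dtype)
```

With this helper, both the empty and the non-empty frame give a result whose dtype matches the input columns. It only works around the inconsistency and does not change what .min(axis=1) itself returns.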