pandas-dev / pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
https://pandas.pydata.org
BSD 3-Clause "New" or "Revised" License

BUG: `DatetimeIndex.union` gives wrong result with "datetime64[us]" #59036

Closed tehunter closed 1 week ago

tehunter commented 2 weeks ago

Reproducible Example

import pandas as pd

l1 = pd.DatetimeIndex(['2024-05-11', '2024-05-12'], dtype='datetime64[us]', name='Date', freq='D')
l2 = pd.DatetimeIndex(['2024-05-13'], dtype='datetime64[us]', name='Date', freq='D')

print(l1.union(l2))
# Returns DatetimeIndex(['2024-05-11', '2024-05-13', '2027-02-05'], dtype='datetime64[us]', name='Date', freq='D')

Issue Description

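DatetimeIndex.union returns an incorrect result for two "datetime64[us]" indexes created with freq='D': in the example above, the result drops '2024-05-12' and contains a spurious '2027-02-05'. Since this method is used by MultiIndex.concat, it leads to unexpected errors when combining several MultiIndex DataFrames/Series that have a datetime64[us] level.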
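
One observation on the reported output, offered as a hint rather than a confirmed diagnosis: the unexpected third value lies exactly 1000 days after the first element of l1. That would be consistent with a one-day step being scaled by the nanosecond-to-microsecond factor of 1000 somewhere in the union, but nothing in this report confirms where. The check below uses only the standard library; the variable names are illustrative.

```python
from datetime import date, timedelta

first = date(2024, 5, 11)     # first element of l1 in the report
spurious = date(2027, 2, 5)   # unexpected third element of the reported result

# Distance between the two dates, in whole days.
gap_days = (spurious - first).days
print(gap_days)  # 1000

# 1 microsecond = 1000 nanoseconds, so a daily step mis-scaled by that
# factor would land 1000 days after the start (hypothesis, not verified here).
print(first + timedelta(days=1000) == spurious)  # True
```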

Expected Behavior

DatetimeIndex(['2024-05-11', '2024-05-12', '2024-05-13'], dtype='datetime64[us]', name='Date', freq=None)
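
As a cross-check of the expected values, the sketch below builds the same two indexes without the freq argument and takes their union. It assumes, without confirmation from this report, that the wrong result only appears when freq='D' is set on both inputs; the names a, b, combined, and days are illustrative.

```python
import pandas as pd

# Same data as the reproducible example, but with no freq passed
# (assumption: the incorrect result is tied to freq being set).
a = pd.DatetimeIndex(['2024-05-11', '2024-05-12'], dtype='datetime64[us]', name='Date')
b = pd.DatetimeIndex(['2024-05-13'], dtype='datetime64[us]', name='Date')

combined = a.union(b)

# Compare calendar dates only, so the check does not depend on the freq of the result.
days = [str(ts.date()) for ts in combined]
print(days)
```

If the three consecutive dates are printed here while the freq='D' version still misbehaves, that narrows the problem to the code path taken when a frequency is attached.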

Installed Versions

INSTALLED VERSIONS
------------------
commit : d9cdd2ee5a58015ef6f4d15c7226110c9aab8140
python : 3.11.9.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.22631
machine : AMD64
processor : Intel64 Family 6 Model 141 Stepping 1, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : English_United States.1252

pandas : 2.2.2
numpy : 1.26.4
pytz : 2024.1
dateutil : 2.9.0
setuptools : 69.5.1
pip : 24.0
Cython : None
pytest : 7.4.2
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : 3.2.0
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.1.4
IPython : 8.25.0
pandas_datareader : None
adbc-driver-sqlite : None
bs4 : None
bottleneck : 1.3.8
dataframe-api-compat : None
fastparquet : None
fsspec : 2024.6.0
gcsfs : None
matplotlib : None
numba : 0.59.1
numexpr : 2.10.0
odfpy : None
openpyxl : 3.1.3
pandas_gbq : None
pyarrow : 16.1.0
pyreadstat : 1.2.7
python-calamine : None
pyxlsb : 1.0.10
s3fs : None
scipy : None
sqlalchemy : 2.0.30
tables : None
tabulate : 0.9.0
xarray : None
xlrd : 2.0.1
zstandard : None
tzdata : 2024.1
qtpy : None
pyqt5 : None