Open dalejung opened 1 year ago
cc @jorisvandenbossche suggested recently changing find_common_type behavior here. I think I'd be OK with that.
+1 on using find_common_type or a similar function. One complication is that the function would need to check whether the timestamps in the less precise datetime array fit within the min/max time limits of the more precise datetime array.
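To make the bounds concern concrete, here is a minimal sketch of such a check. It is not pandas code: `fits_in_finer_unit` and `NANOS_PER_UNIT` are hypothetical names, and it assumes the values are plain int64 epoch counts in the coarser unit.

```python
# Hypothetical helper, not part of pandas: checks whether integer epoch counts
# in a coarse unit can be re-expressed in a finer unit without int64 overflow.
INT64_MAX = 2**63 - 1
INT64_MIN = -(2**63)  # NumPy and pandas reserve this value for NaT

# Number of nanoseconds per tick of each unit considered here.
NANOS_PER_UNIT = {"s": 10**9, "ms": 10**6, "us": 10**3, "ns": 1}


def fits_in_finer_unit(values, coarse, fine):
    """Return True if every epoch count in `coarse` units fits in int64 `fine` units."""
    factor, remainder = divmod(NANOS_PER_UNIT[coarse], NANOS_PER_UNIT[fine])
    if remainder or factor < 1:
        raise ValueError(f"{fine!r} is not finer than or equal to {coarse!r}")
    # Strictly greater than INT64_MIN because that value is reserved for NaT.
    return all(INT64_MIN < v * factor <= INT64_MAX for v in values)


# A timestamp in 2020 survives the cast from seconds to nanoseconds ...
print(fits_in_finer_unit([1_577_836_800], "s", "ns"))
# ... but one around the year 3000 does not, although it still fits in milliseconds.
print(fits_in_finer_unit([32_503_680_000], "s", "ns"))
print(fits_in_finer_unit([32_503_680_000], "s", "ms"))
```

A real implementation would presumably work on the underlying int64 arrays and skip NaT entries rather than loop in Python; the sketch only shows the arithmetic involved.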
This also occurs with mismatched timezones at the same resolution. Is that the same bug?
That is intentional, but there has been discussion of changing/deprecating it to become UTC
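For anyone hitting the mismatched-timezone case in the meantime, one way to avoid the object result is to convert everything to a single zone before concatenating. This is only a sketch of a user-side workaround, not a description of what pandas will eventually do; the zone names and timestamps are made up for illustration.

```python
import pandas as pd

# Two tz-aware series at the same resolution but in different zones.
berlin = pd.Series(
    pd.to_datetime(["2023-01-01 12:00"]).tz_localize("Europe/Berlin")
)
new_york = pd.Series(
    pd.to_datetime(["2023-01-01 12:00"]).tz_localize("America/New_York")
)

# Converting both to one zone first keeps the result tz-aware.
combined = pd.concat(
    [berlin.dt.tz_convert("UTC"), new_york.dt.tz_convert("UTC")],
    ignore_index=True,
)
print(combined.dtype)
```

Converting to UTC keeps the instants unchanged, so no information is lost apart from the original zone labels.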
I think that behavior is new, perhaps it should raise a warning?
Discussed a couple of dev calls ago, agreed to have this cast to the higher resolution. I don't recall if we said the change needed a deprecation cycle.
Just stumbled across this issue. Totally agree on casting to the higher resolution, and I would strongly advocate treating this as a BUG and not running a deprecation cycle.
The principle of least surprise says that concatenating dates results in dates. That one source has (say) ms resolution and the other ns does not make a conversion to object dtype any less surprising.
Note that this bug also occurs when concatenating data from a single timezone, at a constant resolution, if that timezone has multiple offsets from UTC (i.e. a daylight saving transition occurred during the series' timespan).
I think that brings the sum of addressable issues to:
BUG
INTENTIONAL (but may change)
TL;DR: I think this is fixed in 2.2.0.
I came across this issue while searching for the issue I was having (https://github.com/pandas-dev/pandas/issues/53640). It appears the fix for that issue in 2.2.0 (https://github.com/pandas-dev/pandas/pull/53641) has also resolved this issue as I can't recreate it in 2.2.0, while I can in 2.1.4.
Pandas version checks
[X] I have checked that this issue has not already been reported.
[X] I have confirmed this bug exists on the latest version of pandas.
[X] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
Issue Description
Combining datetime64 columns with different resolutions returns an object dtype. This has been around since 2.0 but it just recently triggered for me because of df['datecol'] = ts now using the Timestamp resolution to set the column dtype instead of defaulting to ns.
Expected Behavior
Not 100% sure. I would assume the combined dtype should be the most precise of the provided datetime64 resolutions.
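As a rough illustration of that expectation (not the original reproducer, which is not shown here), NumPy already promotes mixed datetime64 units to the finer one, and casting explicitly before concatenating gives the same result in pandas regardless of version. The dates and variable names below are made up for the example.

```python
import numpy as np
import pandas as pd

# NumPy's own promotion rule picks the more precise unit.
promoted = np.promote_types("datetime64[s]", "datetime64[ns]")
print(promoted)

sec = pd.Series(np.array(["2023-01-01"], dtype="datetime64[s]"))
nano = pd.Series(np.array(["2023-01-02"], dtype="datetime64[ns]"))

# Casting explicitly before concatenating sidesteps the version-dependent
# promotion discussed in this thread.
combined = pd.concat([sec.astype("datetime64[ns]"), nano], ignore_index=True)
print(combined.dtype)
```

The explicit cast is subject to the nanosecond range limits, so it only works as a workaround when all values fit in that range.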
Installed Versions