databricks / koalas

Koalas: pandas API on Apache Spark
Apache License 2.0

Timezone-aware datetimes are no longer supported #2175

Open ashwin153 opened 3 years ago

ashwin153 commented 3 years ago

I am trying to upgrade Koalas from 1.5.0 to 1.8.0, and my unit tests are failing with "TypeError: Type datetime64[ns, Timezone('UTC')] was not understood." on a non-index column. Here's a quick reproduction. It works on 1.5.0 and fails on 1.6.0 and 1.8.0 (pyspark 3.0.2, pandas 1.1.5).

from databricks import koalas
import datetime
import pandas

df = pandas.DataFrame({
    "time": [datetime.datetime.now(tz=datetime.timezone.utc)],
})

koalas.from_pandas(df)
# TypeError: Type datetime64[ns, UTC] was not understood.

Originally posted by @ashwin153 in https://github.com/databricks/koalas/issues/2102#issuecomment-864146498

ashwin153 commented 3 years ago

IMO this feels more like a regression than an enhancement, since this worked in previous versions.

ashwin153 commented 3 years ago

Any update on this?

HyukjinKwon commented 3 years ago

The previous implementation was not really correct, so this is disabled for now. To mimic the previous behaviour, you can manually convert the UTC values to your local timezone, drop the timezone information so that the column is timezone-naive, and use that for now.
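
A minimal sketch of the workaround described above, using only pandas. The helper name `to_naive_local` and the zone `America/Los_Angeles` are assumptions made for illustration and do not come from the thread; substitute whatever zone your Spark session uses. The idea is to convert the timezone-aware column to the local zone with `Series.dt.tz_convert`, then strip the timezone with `Series.dt.tz_localize(None)`.

```python
import pandas as pd


def to_naive_local(series: pd.Series, local_tz: str) -> pd.Series:
    """Return tz-naive wall-clock times in `local_tz` (hypothetical helper)."""
    # tz_convert shifts the wall-clock time into the target zone;
    # tz_localize(None) then drops the timezone, keeping that wall-clock time.
    return series.dt.tz_convert(local_tz).dt.tz_localize(None)


# A timezone-aware UTC column, similar to the one in the reproduction.
df = pd.DataFrame({"time": pd.to_datetime(["2021-06-18 12:00:00"], utc=True)})

# "America/Los_Angeles" is an assumed local zone, chosen only for the example.
df["time"] = to_naive_local(df["time"], "America/Los_Angeles")

print(df["time"].dt.tz)  # None: the column is now timezone-naive
```

After this conversion the column no longer carries a timezone, so passing the frame to `koalas.from_pandas(df)` should avoid the `TypeError` above. That last step is not run here because it needs a Spark session. Note that the original UTC offset is lost, so any code that reads the values back has to know which zone they represent.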