Closed billylanchantin closed 12 hours ago
Turns out this branch fixes more than I thought. main is not encoding :nanosecond datetimes correctly:
MIX_ENV=test iex -S mix # test env to get access to non-UTC timezones
iex> dt = %DateTime{year: 2017, month: 11, day: 7, zone_abbr: "CET",
...> hour: 11, minute: 45, second: 18, microsecond: {123456, 6},
...> utc_offset: 3600, std_offset: 0, time_zone: "Europe/Paris"}
#DateTime<2017-11-07 11:45:18.123456+01:00 CET Europe/Paris>
# This branch (correct)
iex> [dt] |> Explorer.Series.from_list(dtype: {:datetime, :nanosecond, dt.time_zone})
#Explorer.Series<
Polars[1]
datetime[ns, Europe/Paris] [2017-11-07 11:45:18.123456+01:00 CET Europe/Paris]
>
# main (way off)
iex> [dt] |> Explorer.Series.from_list(dtype: {:datetime, :nanosecond, dt.time_zone})
#Explorer.Series<
Polars[1]
datetime[ns, Europe/Paris] [1970-01-18 11:27:35.118123+01:00 CET Europe/Paris]
>
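The 1970-01-18 date on main is consistent with a microsecond count being read as nanoseconds somewhere along the way. A minimal sketch of that arithmetic, in plain Rust with hypothetical names (this is not the Explorer code, and the constant was worked out by hand from the example datetime):

```rust
/// Microseconds since the Unix epoch for 2017-11-07 10:45:18.123456 UTC,
/// i.e. the example datetime above shifted from +01:00 to UTC.
/// (Hand-computed for illustration; not taken from the PR.)
pub const EXAMPLE_MICROS: i64 = 1_510_051_518_123_456;

/// Whole days since the epoch when `count` is interpreted as microseconds.
pub fn days_if_read_as_micros(count: i64) -> i64 {
    count / 1_000_000 / 86_400
}

/// Whole days since the epoch when the same `count` is interpreted as nanoseconds.
pub fn days_if_read_as_nanos(count: i64) -> i64 {
    count / 1_000_000_000 / 86_400
}
```

Read as microseconds, the count is 17,477 days past the epoch (November 2017). Read as nanoseconds, it is only 17 days past the epoch, which lands on 1970-01-18.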
Fixes one of the #1014 issues with printing dataframes.
This one stems from how we encode datetimes. Example:
Panics with:
The culprit is how we encode datetimes: we always turn them into microseconds first. This is usually fine, but with certain operations like printing, especially when the datetime is far from the Unix epoch, it can result in overflow.
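One way such an overflow can arise, sketched in plain Rust with hypothetical names (an illustration of the i64 limits, not the actual code path that panics): a microsecond count that fits comfortably in an i64 may no longer fit once it is scaled to nanoseconds.

```rust
/// Seconds since the Unix epoch for 3000-01-01 00:00:00 UTC
/// (an arbitrary "far from the epoch" example, not from the PR).
pub const YEAR_3000_SECS: i64 = 32_503_680_000;

/// Scale a microsecond count to nanoseconds.
/// Returns `None` instead of overflowing when the result does not fit in an i64.
pub fn micros_to_nanos(micros: i64) -> Option<i64> {
    micros.checked_mul(1_000)
}
```

An i64 nanosecond count only covers roughly the years 1677 to 2262, so the year-3000 value above is representable in microseconds but not in nanoseconds.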
My proposed fix is to build the i64 representation directly from the (%DateTime{}, time_unit) pair using (mostly) Polars functions. This works, and I added some properties to be confident, but the drawback is that we can no longer support impl From<ExNaiveDateTime> for i64, since we need to know which time unit the user wants to represent the datetime with. We didn't use that impl much, so I think the drawback is acceptable. We're also now handling more explicitly the case where a datetime can't be represented at nanosecond precision.
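A rough sketch of the idea, assuming a hypothetical TimeUnit enum and encode function (the real change uses Polars functions and the ExNaiveDateTime struct, neither of which is reproduced here). The point is that the target time unit is an input, and an unrepresentable value is surfaced as None rather than overflowing:

```rust
/// Hypothetical stand-in for the time unit the user asked for.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum TimeUnit {
    Millisecond,
    Microsecond,
    Nanosecond,
}

/// Encode a datetime, given as whole seconds since the Unix epoch plus a
/// microsecond fraction (0..=999_999), directly in the requested unit.
/// Returns `None` when the value does not fit in an i64 for that unit.
pub fn encode(secs: i64, micros_part: u32, unit: TimeUnit) -> Option<i64> {
    let frac = i64::from(micros_part);
    // Ticks per second, and the sub-second part expressed in the target unit.
    let (per_sec, sub) = match unit {
        TimeUnit::Millisecond => (1_000, frac / 1_000),
        TimeUnit::Microsecond => (1_000_000, frac),
        TimeUnit::Nanosecond => (1_000_000_000, frac * 1_000),
    };
    secs.checked_mul(per_sec)?.checked_add(sub)
}
```

Because the unit is a required argument, a conversion like From, which only receives the datetime, has no way to choose it. That is the trade-off described above.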