Open mcrumiller opened 2 weeks ago
Thanks for reporting - the validation currently happens later:
In [9]: pl.Series([datetime(2020, 1, 1)], dtype=pl.Datetime(time_zone='cabbage'))
---------------------------------------------------------------------------
ComputeError Traceback (most recent call last)
Cell In[9], line 1
----> 1 pl.Series([datetime(2020, 1, 1)], dtype=pl.Datetime(time_zone='cabbage'))
File ~/scratch/.venv/lib/python3.11/site-packages/polars/series/series.py:312, in Series.__init__(self, name, values, dtype, strict, nan_to_null, dtype_if_empty)
309 raise TypeError(msg)
311 if isinstance(values, Sequence):
--> 312 self._s = sequence_to_pyseries(
313 name,
314 values,
315 dtype=dtype,
316 strict=strict,
317 nan_to_null=nan_to_null,
318 )
320 elif values is None:
321 self._s = sequence_to_pyseries(name, [], dtype=dtype)
File ~/scratch/.venv/lib/python3.11/site-packages/polars/_utils/construction/series.py:235, in sequence_to_pyseries(name, values, dtype, strict, nan_to_null)
225 if values_tz != "UTC" and dtype_tz is None:
226 warnings.warn(
227 "Constructing a Series with time-zone-aware "
228 "datetimes results in a Series with UTC time zone. "
(...)
233 stacklevel=find_stacklevel(),
234 )
--> 235 return s.dt.replace_time_zone(dtype_tz or "UTC")._s
236 return s._s
238 elif (
239 _check_for_numpy(value)
240 and isinstance(value, np.ndarray)
241 and len(value.shape) == 1
242 ):
File ~/scratch/.venv/lib/python3.11/site-packages/polars/series/utils.py:107, in call_expr.<locals>.wrapper(self, *args, **kwargs)
105 expr = getattr(expr, namespace)
106 f = getattr(expr, func.__name__)
--> 107 return s.to_frame().select_seq(f(*args, **kwargs)).to_series()
File ~/scratch/.venv/lib/python3.11/site-packages/polars/dataframe/frame.py:7906, in DataFrame.select_seq(self, *exprs, **named_exprs)
7883 def select_seq(
7884 self, *exprs: IntoExpr | Iterable[IntoExpr], **named_exprs: IntoExpr
7885 ) -> DataFrame:
7886 """
7887 Select columns from this DataFrame.
7888
(...)
7904 select
7905 """
-> 7906 return self.lazy().select_seq(*exprs, **named_exprs).collect(_eager=True)
File ~/scratch/.venv/lib/python3.11/site-packages/polars/lazyframe/frame.py:1810, in LazyFrame.collect(self, type_coercion, predicate_pushdown, projection_pushdown, simplify_expression, slice_pushdown, comm_subplan_elim, comm_subexpr_elim, no_optimization, streaming, background, _eager)
1807 if background:
1808 return InProcessQuery(ldf.collect_concurrently())
-> 1810 return wrap_df(ldf.collect())
ComputeError: unable to parse time zone: 'cabbage'. Please check the Time Zone Database for a list of available time zones
Hi @MarcoGorelli, should we add some checks here, similar to what's done here? Or should we leave the error as it is? Do you have any thoughts on this? If possible, I'd like to improve it.
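For discussion, here is a rough sketch of what an eager check could look like. It is not the polars implementation: validate_time_zone is a hypothetical helper name, and it leans on the standard-library zoneinfo module, whose set of accepted names is only assumed to be close to what polars accepts (fixed offsets, for example, are not covered).

```python
from __future__ import annotations

from zoneinfo import ZoneInfo, ZoneInfoNotFoundError


def validate_time_zone(time_zone: object) -> str | None:
    """Hypothetical helper: return time_zone if it is None or a known IANA name."""
    if time_zone is None:
        # No time zone means a naive datetime dtype; nothing to validate.
        return None
    if not isinstance(time_zone, str):
        # Reject arbitrary Python objects up front instead of failing later.
        msg = f"time_zone must be a str or None, got {type(time_zone).__name__}"
        raise TypeError(msg)
    try:
        # ZoneInfo looks the key up in the system (or tzdata) time zone database.
        ZoneInfo(time_zone)
    except (ZoneInfoNotFoundError, ValueError) as exc:
        msg = f"unable to parse time zone: {time_zone!r}"
        raise ValueError(msg) from exc
    return time_zone
```

With something like this called from the dtype constructor, pl.Datetime(time_zone='cabbage') would fail at construction time rather than inside collect().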
Using Pyright/MyPy, you'll get a static type-checking warning if you try to pass object() to time_zone here. The only way this could perhaps be built on (from a type-checking perspective) is by maintaining a literal of all possible time zones and using that instead of str.
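To make the literal idea concrete, a minimal sketch follows. The alias TimeZoneName, the three names in it, and both functions are illustrative assumptions, not anything polars defines; a real list would have to track the whole time zone database.

```python
from __future__ import annotations

from typing import Literal, get_args

# Deliberately tiny, illustrative subset of time zone names.
TimeZoneName = Literal["UTC", "Europe/London", "Asia/Kathmandu"]


def make_datetime_dtype(time_zone: TimeZoneName | None = None) -> str:
    """Hypothetical stand-in for a dtype constructor with a literal-typed parameter."""
    # A static type checker flags make_datetime_dtype("cabbage") at the call site,
    # but nothing here stops that call at runtime.
    return f"Datetime(time_zone={time_zone!r})"


def is_known_time_zone(value: object) -> bool:
    """Runtime counterpart: check membership in the same literal."""
    return value in get_args(TimeZoneName)
```

The trade-off is maintenance: the literal has to be regenerated whenever the database changes, and a runtime check is still needed for callers that do not run a type checker.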
Checks
Issue Description
The time zone parameter in pl.Datetime can be an invalid string, and can even be any type of Python object. We should probably ensure that the time zone is a valid time zone when constructing a pl.Datetime data type.
Installed versions