The unknown type leads to an exception in the following rolling + group_by aggregation:
import numpy as np
import polars as pl

frame = pl.from_dict({
    "date": ["2001-01-01", "2001-01-02", "2001-01-03"] * 2,
    "group": ["A"] * 3 + ["B"] * 3,
    "value": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
}).with_columns(pl.col("date").str.to_date())

# Rolling aggregation per group; the NumPy ufunc is wrapped in a when/then conditional.
result = frame.lazy().rolling(
    index_column="date",
    group_by="group",
    period="2d",
).agg([
    pl.when(pl.col("value").is_not_null().all()).then(np.expm1(pl.col("value").log1p().sum())).alias(f"{agg}")
    for agg in ["foo", "bar", "egg"]
])
result.collect_schema()
>>>thread '' panicked at py-polars/src/conversion/mod.rs:241:39:
called `Result::unwrap()` on an `Err` value: PyErr { type: <class 'TypeError'>, value: TypeError("cannot parse input of type 'Unknown' into Polars data type: Unknown"), traceback: Some(<traceback object at 0x175f19840>) }
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Traceback (most recent call last):
File ".../frame.py", line 118, in <module>
result.collect_schema()
File ".../lib/python3.10/site-packages/polars/lazyframe/frame.py", line 2146, in collect_schema
return Schema(self._ldf.collect_schema())
pyo3_runtime.PanicException: called `Result::unwrap()` on an `Err` value: PyErr { type: <class 'TypeError'>, value: TypeError("cannot parse input of type 'Unknown' into Polars data type: Unknown"), traceback: Some(<traceback object at 0x175f19840>) }
When I remove the conditional from the aggregation, the error is gone, i.e. the following works:
result = frame.lazy().rolling(
    index_column="date",
    group_by="group",
    period="2d",
).agg([
    np.expm1(pl.col("value").log1p().sum()).alias(f"{agg}")
    for agg in ["foo", "bar", "egg"]
])
result.collect_schema()
So the absence of type information in collect_schema() when using a numpy ufunc looks relatively benign at first glance, but in the above example, it leads to an exception.
I wasn't able to simplify the example further. I'd be glad if someone more knowledgeable could look into it.
Thanks a lot for the great library and congratulations on the 1.0.0 release.
Expected behavior
I'd have expected to get Float64 in both cases.
Installed versions