Open antonioalegria opened 7 months ago
Can reproduce the error.
On 0.20.15
I get this:
import polars as pl

df1 = pl.DataFrame({
    "a": [1, 2, 3, 4, 5],
    "b": [[{"a": 1}], [{"a": 1}, {"a": 2}], [{"a": 1}, {"a": 2}, {"a": 3}], [], None]
})
pl.__version__
df1.with_columns(
    pl.when((pl.col("b").is_not_null()) & (pl.col("b").list.len() > 0))
    .then(pl.col("b").list.to_struct("max_width", lambda x: f"{x}", 100))
)
# '0.20.15'
# shape: (5, 2)
# ┌─────┬────────────────────────┐
# │ a   ┆ b                      │
# │ --- ┆ ---                    │
# │ i64 ┆ struct[3]              │
# ╞═════╪════════════════════════╡
# │ 1   ┆ {{1},{null},{null}}    │
# │ 2   ┆ {{1},{2},{null}}       │
# │ 3   ┆ {{1},{2},{3}}          │
# │ 4   ┆ {{null},{null},{null}} │
# │ 5   ┆ {{null},{null},{null}} │
# └─────┴────────────────────────┘
Does the .when() actually do anything in this case?
df1.with_columns(
    pl.col("b").list.to_struct("max_width", lambda x: f"{x}", 100)
)
# shape: (5, 2)
# ┌─────┬────────────────────────┐
# │ a   ┆ b                      │
# │ --- ┆ ---                    │
# │ i64 ┆ struct[3]              │
# ╞═════╪════════════════════════╡
# │ 1   ┆ {{1},{null},{null}}    │
# │ 2   ┆ {{1},{2},{null}}       │
# │ 3   ┆ {{1},{2},{3}}          │
# │ 4   ┆ {{null},{null},{null}} │
# │ 5   ┆ {{null},{null},{null}} │
# └─────┴────────────────────────┘
Thanks @antonioalegria and @cmdlineluser. This has probably been an issue for some time, but it was only revealed when type_coercion for when-then-otherwise was changed to strict_cast in 0.20.16. But yes, we should fix this.
After some discussion, I think this should be fixed once we enable outer validity for StructChunked; see #3462.
Until then, you may need to set type_coercion=False as a workaround.
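To make "outer validity" a bit more concrete, here is a toy model in plain Python. It is not how StructChunked is implemented: the ToyStructColumn class, its attributes, the integer field values, and the field names "0", "1", "2" are all made up for illustration. It only sketches why a struct column that tracks nulls per field, but not per row, cannot keep the rows that when/then masked out as null.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ToyStructColumn:
    """Hypothetical, simplified struct column: one list of values per field."""

    fields: Dict[str, List[Optional[int]]]
    # Row-level ("outer") validity. None means the column does not track it,
    # so a null struct can only be stored as a struct whose fields are all null.
    validity: Optional[List[bool]] = None

    def row(self, i: int) -> Optional[Dict[str, Optional[int]]]:
        if self.validity is not None and not self.validity[i]:
            return None  # the struct itself is null
        return {name: values[i] for name, values in self.fields.items()}


# Simplified stand-in for the struct[3] column in the outputs above.
fields = {
    "0": [1, 1, 1, None, None],
    "1": [None, 2, 2, None, None],
    "2": [None, None, 3, None, None],
}
# Rows where the when(...) condition holds (non-null and non-empty list).
mask = [True, True, True, False, False]

without_outer = ToyStructColumn(fields)
with_outer = ToyStructColumn(fields, validity=mask)

for i in range(5):
    print(i + 1, without_outer.row(i), with_outer.row(i))
```

In this sketch, rows 4 and 5 come back as a struct of nulls when no outer validity is kept, which matches the identical outputs with and without .when() above; with the mask stored, they come back as None.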
Where should I set type_coercion=False?
Checks
Reproducible example
Log output
Issue description
In 0.20.15 this ran without any issues; now it raises this exception.
Expected behavior
It should run as in 0.20.15, unless I need to migrate some code, printing the following:
Installed versions