Open rhshadrach-8451 opened 1 week ago
Can reproduce.
It seems it may be rechunk related - forcing a rechunk in the concat (or using .rechunk().filter()) avoids the panic.
import polars as pl

df1 = pl.DataFrame({"A": ["x", "x"]})
df2 = pl.DataFrame({"A": ["y"]})
# rechunk=True merges the two inputs into a single chunk, which avoids the panic
df = pl.concat([df1, df2], rechunk=True).with_columns(
    B=pl.col("A").min().over("A"),
    C=0,
)
df.filter(~pl.col("A").eq("y") | ~pl.col("A").is_in(['y'])).filter(pl.col("A").gt(pl.lit("0")))
# shape: (2, 3)
# ┌─────┬─────┬─────┐
# │ A   ┆ B   ┆ C   │
# │ --- ┆ --- ┆ --- │
# │ str ┆ str ┆ i32 │
# ╞═════╪═════╪═════╡
# │ x   ┆ x   ┆ 0   │
# │ x   ┆ x   ┆ 0   │
# └─────┴─────┴─────┘
I forgot to mention this worked fine in 1.12.0; I've updated the OP.
Checks
Reproducible example
Log output
Issue description
This first appeared in 1.13.0. 1.12.0 gives the expected result.
The above code is from a heavily reduced computation, so it appears nonsensical. I believe it to be minimal in that the correct result appears if I remove any one of the following elements:
- df1 on L1.
- pl.DataFrame({"A": ["x", "x", "y"]}) instead of df1 and df2.
- B.
- C.
- .is_in(['y']) with .eq('y') on L7.
- filter on L7.
- filter on L7.
- filter into a single call to filter on L7.

The nature of this looks similar to https://github.com/pola-rs/polars/issues/16830
Expected behavior
Installed versions