pola-rs / polars

Dataframes powered by a multithreaded, vectorized query engine, written in Rust
https://docs.pola.rs

`.struct.field` + `.filter` PanicException instead of ColumnNotFoundError #18787

Open cmdlineluser opened 2 weeks ago

cmdlineluser commented 2 weeks ago

Reproducible example

import polars as pl

df = pl.DataFrame({"a": [1], "b": [2]})

(df.select(pl.struct(pl.all()))
   .select(pl.first().struct.field("a", "b").filter(pl.col("foo") == 1))
)

# thread '<unnamed>' panicked at crates/polars-plan/src/utils.rs:360:79:
# called `Result::unwrap()` on an `Err` value: ColumnNotFound(ErrString("foo"))
# PanicException: called `Result::unwrap()` on an `Err` value: ColumnNotFound(ErrString("foo"))

Log output

No response

Issue description

The panic only occurs when multiple fields are selected.

With a single field, the error is raised as expected:

(df.select(pl.struct(pl.all()))
   .select(pl.first().struct.field("a").filter(pl.col("foo") == 1))
)
# ColumnNotFoundError: "foo" not found

Expected behavior

No panic: a ColumnNotFoundError should be raised, as in the single-field case.
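
For anyone who wants to check which of the two failure modes a given Polars build produces, here is a minimal sketch. The helper names `failure_kind` and `multi_field_repro` are made up for illustration and are not part of Polars. It assumes the panic surfaces as a `BaseException` subclass rather than an `Exception` subclass, which is why it catches `BaseException`.

```python
def failure_kind(fn):
    """Run fn() and return the name of the exception class it raises, or None.

    Hypothetical helper, not part of Polars. It catches BaseException on the
    assumption that the panic does not derive from Exception, so a plain
    `except Exception` would miss it.
    """
    try:
        fn()
    except BaseException as exc:
        return type(exc).__name__
    return None


def multi_field_repro():
    # Imported lazily so the helper above has no third-party dependency.
    import polars as pl

    df = pl.DataFrame({"a": [1], "b": [2]})
    # Same query as in the report: two struct fields plus a filter on a
    # column ("foo") that does not exist.
    return df.select(pl.struct(pl.all())).select(
        pl.first().struct.field("a", "b").filter(pl.col("foo") == 1)
    )


if __name__ == "__main__":
    # Prints the exception class name for the installed Polars version.
    print(failure_kind(multi_field_repro))
```

Going by the report above, 1.7.1 should print the panic class name, and a build with the fix should print `ColumnNotFoundError`. Because the helper catches `BaseException`, it also swallows `KeyboardInterrupt` and `SystemExit`, so it is only suitable for a quick check like this one.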

Installed versions

```
--------Version info---------
Polars: 1.7.1
Index type: UInt32
Platform: macOS-13.6.1-arm64-arm-64bit
Python: 3.12.2 (main, Feb 6 2024, 20:19:44) [Clang 15.0.0 (clang-1500.1.0.2.5)]

----Optional dependencies----
adbc_driver_manager:
cloudpickle:
connectorx:
deltalake:
fastexcel:
fsspec:
gevent:
great_tables:
hvplot:
matplotlib:
nest_asyncio:
numpy: 1.26.4
openpyxl:
pandas: 2.2.1
pyarrow: 15.0.2
pydantic:
pyiceberg:
sqlalchemy:
torch:
xlsx2csv:
xlsxwriter:
```

ritchie46 commented 1 week ago

This should be validated at IR conversion. @coastalwhite is picking this up IIRC.