pola-rs / polars

Dataframes powered by a multithreaded, vectorized query engine, written in Rust
https://docs.pola.rs

Exception thrown when converting an Arrow Table with struct and dictionary columns to a Polars DataFrame #16024

Closed reductionnist closed 2 weeks ago

reductionnist commented 2 weeks ago

Hi, pl.from_arrow() raises polars.exceptions.ColumnNotFoundError when an Arrow table contains both dictionary and struct columns. This appears to be caused by the logic in arrow_to_pydf, which omits the dictionary columns from the df being constructed whenever there are any struct columns. If that interpretation is correct, then something like

if len(dictionary_cols) > 0 or len(struct_cols) > 0:
    df = wrap_df(pydf)
    # Re-add the dictionary and struct columns together in one pass.
    df = df.with_columns(
        [
            F.lit(s).alias(s.name)
            for s in itertools.chain(dictionary_cols.values(), struct_cols.values())
        ]
    )
    reset_order = True

may fix the issue.