pola-rs / polars

Dataframes powered by a multithreaded, vectorized query engine, written in Rust
https://docs.pola.rs

`struct.with_fields` #16082

Closed. KDruzhkin closed this 1 month ago

KDruzhkin commented 1 month ago

Description

TL;DR

DataFrames and LazyFrames have a very convenient method, `with_columns`. It would be nice for structs to have a similar method, e.g. `with_fields`.

Motivation

When working with top-level columns, we can use `with_columns` to transform just one column, leaving all the other columns intact.

However, when working with nested structures (structs inside lists, or lists inside structs), we have to rebuild the whole structure by hand, re-listing every field we did not want to change.

```python
from polars import DataFrame, col, struct, element

nested_df = DataFrame(
    {
        "coords": [
            [{"x": 1, "y": 4}, {"x": 4, "y": 9}, {"x": 9, "y": 16}],
            [{"x": 4, "y": 25}, {"x": 25, "y": 36}, {"x": 36, "y": 49}],
        ]
    }
)

nested_df.with_columns(
    col("coords").list.eval(
        struct(
            element().struct.field("x").sqrt(),  # Here is the transformation I want.
            element().struct.field("y"),  # Here is the context I have to manually restore.
        )
    )
)
```

ritchie46 commented 1 month ago

Added in #16305