Open DrMaphuse opened 1 year ago
It's because you're using lambda in a loop. The name value is only looked up when the lambda is actually called, so every lambda ends up seeing the same value (in this case "pears").
You can pass it as a named argument with a default, as shown in the FAQ example:
import polars as pl

cols = ["a", "b"]
data = pl.DataFrame([["many"], ["no"]], schema=cols).lazy()
values_to_append = {
    "a": "apples",
    "b": "pears",
}
for col, value in values_to_append.items():
    data = data.with_columns(
        # value=value stores the current loop value as a default when the lambda is created
        pl.col(col).map_elements(lambda x, value=value: f"{x}_{value}").alias(col),
    )
print(data.collect())
# shape: (1, 2)
# ┌─────────────┬──────────┐
# │ a           ┆ b        │
# │ ---         ┆ ---      │
# │ str         ┆ str      │
# ╞═════════════╪══════════╡
# │ many_apples ┆ no_pears │
# └─────────────┴──────────┘
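The same late-binding behaviour can be reproduced without Polars. Below is a minimal sketch in plain Python; the names late, bound, partials and append_suffix are illustrative and do not come from the issue. It compares a plain lambda created in a loop with two ways of capturing the value at creation time: a default argument (as in the snippet above) and functools.partial.

```python
from functools import partial

suffixes = {"a": "apples", "b": "pears"}

# Late binding: each lambda looks up `value` when it is called,
# not when it is created.
late = {}
for col, value in suffixes.items():
    late[col] = lambda x: f"{x}_{value}"

# Default argument: the current `value` is evaluated and stored
# at the moment the lambda is defined.
bound = {}
for col, value in suffixes.items():
    bound[col] = lambda x, value=value: f"{x}_{value}"


# functools.partial: same effect, with the value frozen as a keyword argument.
def append_suffix(x, value):
    return f"{x}_{value}"


partials = {col: partial(append_suffix, value=value) for col, value in suffixes.items()}

print(late["a"]("many"))      # many_pears (the loop has finished, value == "pears")
print(bound["a"]("many"))     # many_apples
print(partials["a"]("many"))  # many_apples
```

Either capturing approach should work with map_elements, since it only needs a callable that takes the element as its first argument.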
Interesting. In eager mode the lambda is called immediately, while value still holds the current loop item, so the problem does not appear there. Lazy mode seems to have behaved differently in earlier versions of Polars, since this code worked until a few versions ago.
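The eager versus lazy difference can be illustrated with a plain-Python analogy. This is only a sketch of the timing, not of how Polars works internally, and the names eager_results and deferred are made up for the example. Calling the lambda inside the loop uses the current value; storing it and calling it after the loop, which is roughly what happens when a lazy query only runs at collect(), uses the last one.

```python
suffixes = {"a": "apples", "b": "pears"}

eager_results = []
deferred = []
for col, value in suffixes.items():
    fn = lambda x: f"{x}_{value}"
    eager_results.append(fn("many"))  # called now: sees the current value
    deferred.append(fn)               # called later: will see the final value

# All stored lambdas run only after the loop has finished.
deferred_results = [fn("many") for fn in deferred]

print(eager_results)     # ['many_apples', 'many_pears']
print(deferred_results)  # ['many_pears', 'many_pears']
```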
Checks
[X] I have checked that this issue has not already been reported.
[X] I have confirmed this bug exists on the latest version of Polars.
Reproducible example
Note: I know this can be done in a million other and better ways. My actual problem looks different; this example is just the simplest way I could think of to show the issue.
Log output
No response
Issue description
When using the map_elements function in lazy mode, the columns seem to get mixed up.
Expected behavior
The expected result is a DataFrame where “apples” is appended to all elements in column “a” and “pears” is appended to all elements in column “b”. Instead, "pears" is appended in both columns.
Installed versions