Open lucianolorenti opened 3 weeks ago
Can reproduce.
import polars as pl
import polars.selectors as cs

df = pl.LazyFrame({"A": [1]})

df.select(
    pl.when(pl.col.A < 1).then(pl.col.A).otherwise(2)  # <- OK: int / int
).select(cs.numeric()).collect_schema()
# Schema([('A', Int64)])

df.select(
    pl.when(pl.col.A < 1).then(pl.col.A).otherwise(2.0)  # <- NOT OK: int / float
).select(cs.numeric()).collect_schema()
# Schema()
The Schema says it's a Float64:
df.select(
    pl.when(pl.col.A < 1).then(pl.col.A).otherwise(2.0)
).collect_schema()
# Schema([('A', Float64)])
But even a regular dtype selection does not work:
df.select(
    pl.when(pl.col.A < 1).then(pl.col.A).otherwise(2.0)
).select(pl.col(pl.Float64)).collect()
# shape: (0, 0)
# ┌┐
# ╞╡
# └┘
Ah, selectors don't recognize dynamic numerics I think.
Checks
Reproducible example
Log output
No response
Issue description
After performing a column transformation on a LazyFrame, selecting the numeric columns returns a different number of columns than the same operations on an eager DataFrame. I expect 2 numeric columns in both cases.
It seems that column B is not inferred as numeric in the query plan.
Expected behavior
After performing the column transformation, I expected to get the same columns from both the eager and the lazy DataFrame.
Maybe the current behaviour is the intended one, but I am not sure. I looked for a similar report but did not find one.
Installed versions