Open stevenlis opened 3 months ago
Can reproduce. (Perhaps this was supposed to be labelled as a bug?)
It seems they're only broken in a group_by context?
Although in a "working" case, the return type is str rather than enum:
>>> df.select(pl.col('degree').min())
# shape: (1, 1)
# ┌────────┐
# │ degree │
# │ ---    │
# │ str    │ # <- ???
# ╞════════╡
# │ low    │
# └────────┘
>>> df.group_by('id').agg(pl.col('degree').min())
# shape: (3, 2)
# ┌─────┬────────┐
# │ id  ┆ degree │
# │ --- ┆ ---    │
# │ str ┆ enum   │
# ╞═════╪════════╡
# │ a   ┆ null   │
# │ c   ┆ null   │
# │ b   ┆ null   │
# └─────┴────────┘
After https://github.com/pola-rs/polars/issues/19269 the dtype is now correct in a select context.
df.select(pl.col('degree').min())
# shape: (1, 1)
# ┌────────┐
# │ degree │
# │ ---    │
# │ enum   │
# ╞════════╡
# │ low    │
# └────────┘
The group_by still returns nulls.
df.group_by("id").min()
# shape: (3, 3)
# ┌─────┬────────┬───────────────┐
# │ id  ┆ degree ┆ lowest_degree │
# │ --- ┆ ---    ┆ ---           │
# │ str ┆ enum   ┆ enum          │
# ╞═════╪════════╪═══════════════╡
# │ b   ┆ null   ┆ null          │
# │ a   ┆ null   ┆ null          │
# │ c   ┆ null   ┆ null          │
# └─────┴────────┴───────────────┘
It seems this looks for agg_min() / agg_max() functions, which don't seem to be implemented for CategoricalChunked?
https://github.com/pola-rs/polars/tree/main/crates/polars-core/src/series/implementations
Description
As of polars 1.5.0
expecting:
At this point, one has to use .to_physical() for the comparison. By the way, some expressions such as .first() would work.