Open david-waterworth opened 5 months ago
Hi,
I encountered the same problem. I noticed this issue has been open for over two months without any updates. Could you provide an update on the current status or any potential timeline for addressing this issue?
Thank you!
@matt-netiz Yes, it looks like this particular issue slipped through unfortunately.
If it helps, I believe upsample is essentially syntax sugar for a range explosion and a join:
(
    df.group_by("groups", maintain_order=True)
    .agg(pl.datetime_range(pl.col("time").min(), pl.col("time").max(), interval="1d"))
    .explode("time")
    .join(df, on=["groups", "time"], how="left")
)
# shape: (5, 3)
# ┌────────┬─────────────────────┬────────┐
# │ groups ┆ time ┆ values │
# │ --- ┆ --- ┆ --- │
# │ str ┆ datetime[μs] ┆ i64 │
# ╞════════╪═════════════════════╪════════╡
# │ A ┆ 2021-02-01 00:00:00 ┆ 0 │
# │ A ┆ 2021-02-02 00:00:00 ┆ null │
# │ A ┆ 2021-02-03 00:00:00 ┆ 1 │
# │ A ┆ 2021-02-04 00:00:00 ┆ 2 │
# │ A ┆ 2021-02-05 00:00:00 ┆ 3 │
# └────────┴─────────────────────┴────────┘
I think the problem may be that the group_by column isn't included in the join.
That solves my issue, thanks for the quick response!
Perhaps @MarcoGorelli can take a look and confirm when they are free.
thanks for the report - I think I agree, the groups should also be filled in
Is this a duplicate of https://github.com/pola-rs/polars/issues/14131 ?
I don't think so.
Checks
Reproducible example
Log output
No response
Issue description
The time column has been filled, but the groups column is null. I expected that the group_by column would be set to the value of the group, i.e. the following does what I want (I had to manually interpolate the group; I'm not 100% sure this works for edge cases though).
Expected behavior
When you upsample over a group, the new records should include non-null values for both the time_column and group_by column.
Installed versions