Closed kdpsingh closed 1 year ago
As of 2e3b5cbb93943b98fd98da58fb1721be7d0e7280, `@group_by` accepts tidy expressions as in the example above.
> I need to better understand how ungrouping works in DataFrames.jl and which operations remove the grouping versus which ones do not.
There is an `ungroup` keyword argument that lets you choose, in every operation, whether to ungroup or not. By default it is `true`.
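For anyone reading along, here is a minimal sketch of what that keyword does in DataFrames.jl. The column names and data are made up for illustration.

```julia
using DataFrames

# Illustrative data; the column names are assumptions, not from this issue.
df = DataFrame(g = ["a", "a", "b"], x = [1, 2, 3])
gdf = groupby(df, :g)

# Default (ungroup = true): the result is a plain DataFrame.
flat = transform(gdf, :x => sum => :x_sum)

# ungroup = false: the result stays a GroupedDataFrame with the same keys.
kept = transform(gdf, :x => sum => :x_sum; ungroup = false)
```

The same keyword is accepted by `combine` and `select` when they are called on a `GroupedDataFrame`.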
Awesome, that's very straightforward. I will incorporate this in the next version.
One note I'll leave here in case someone else decides to work on this issue is that `summarize()` should remove one "layer" of grouping if grouped by multiple columns, whereas other functions should leave the data grouped as-is.
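A rough sketch of that rule using plain DataFrames.jl, in case it helps. `summarize_drop_last` is a hypothetical helper name, not a function in any package, and this is not necessarily how the macro will end up implementing it.

```julia
using DataFrames

# Hypothetical helper: summarize a grouped data frame, then regroup by all
# but the last grouping column, which is the behavior described for dplyr.
function summarize_drop_last(gdf::GroupedDataFrame, args...)
    out = combine(gdf, args...)            # default ungroup = true, so a DataFrame
    remaining = groupcols(gdf)[1:end-1]    # drop the final grouping layer
    return isempty(remaining) ? out : groupby(out, remaining)
end

# Illustrative data; the column names are assumptions.
df = DataFrame(a = [1, 1, 2, 2], b = ["x", "y", "x", "x"], v = [10, 20, 30, 40])

res = summarize_drop_last(groupby(df, [:a, :b]), :v => sum => :v_sum)
single = summarize_drop_last(groupby(df, :a), :v => sum => :v_sum)
```

Here `res` is still grouped by `:a` only, while `single` comes back as an ungrouped `DataFrame` because its only grouping layer was removed.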
I'll open a separate issue to add `.by` in the future. Since this is a relatively new feature, it's not in widespread use just yet. Otherwise, the items in this list are completed.
There are several enhancements needed to `@group_by()`, some of which depend on first addressing #8:

- [x] Implementing `@ungroup()`
- [x] The ability to define new columns inside of a `@group_by()`, such as `@group_by(avg_sbp = (sbp1 + sbp2) / 2)`. This would be implemented by first parsing the expressions using `parse_r()` from #8, running `transform()` to create the new columns, and then running `groupby()`.
- [x] Mimicking the ungrouping behavior of `dplyr`. In `dplyr`, the final layer of grouping is automatically removed after each `summarize()` operation but not after any other operation. I need to better understand how ungrouping works in DataFrames.jl and which operations remove the grouping versus which ones do not. If all operations remove grouping, then we could manually regroup using `dplyr` rules.
- [ ] Add `.by` parameter to other macros to allow for in-line grouping
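
The `transform()` then `groupby()` steps from the second item could look roughly like the sketch below, written directly against DataFrames.jl. The `parse_r()` step is left out because its output is not shown in this issue, so the transformation is written by hand here; the data values are invented for illustration.

```julia
using DataFrames

# Illustrative data; the values are assumptions.
df = DataFrame(sbp1 = [120, 140, 120], sbp2 = [130, 150, 130])

# Step 1: create the new column. ByRow applies the function to each row.
with_avg = transform(df, [:sbp1, :sbp2] => ByRow((a, b) -> (a + b) / 2) => :avg_sbp)

# Step 2: group by the newly created column.
gdf = groupby(with_avg, :avg_sbp)
```

With this data, rows 1 and 3 share the same `avg_sbp` and land in one group, so `gdf` has two groups.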