syvwlch / Data-Ignota

A data-driven exploration of Ada Palmer's Terra Ignota series
https://syvwlch.github.io/Data-Ignota/
MIT License

[Clarification] Should Ungroup after Summarize? #44

Closed syvwlch closed 2 years ago

syvwlch commented 2 years ago

What needs to be clarified

Read up on best practice for group_by() + summarize(). Apparently it's a good idea to ungroup() at the end?

syvwlch commented 2 years ago

Maybe also check whether a second group_by() is needed after a summarize()?

syvwlch commented 2 years ago

It is best practice to ungroup() once you are done with a group_by(), to avoid errors later, e.g. when trying to mutate one of the grouping columns.

Note that summarize() drops the last level of grouping by default, so there is no need to ungroup() after grouping by a single column.
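
A minimal sketch of both points, assuming dplyr 1.0 or later. The `words` table and its column names are made up for illustration and are not taken from the project's data.

```r
library(dplyr)

# Toy data, invented for this example.
words <- tibble(
  book    = c("A", "A", "B", "B"),
  chapter = c(1, 2, 1, 2),
  n_words = c(100, 200, 150, 250)
)

# Grouped by one column: summarize() peels off that single level,
# so the result is already ungrouped.
by_book <- words %>%
  group_by(book) %>%
  summarize(total = sum(n_words))
group_vars(by_book)           # character(0)

# Grouped by two columns: only the last level (chapter) is dropped,
# so the result is still grouped by book.
by_book_chapter <- words %>%
  group_by(book, chapter) %>%
  summarize(total = sum(n_words))
group_vars(by_book_chapter)   # "book"

# Modifying a column that is still a grouping variable is an error...
attempt <- try(
  by_book_chapter %>% mutate(book = toupper(book)),
  silent = TRUE
)
inherits(attempt, "try-error")   # TRUE

# ...but it works once the groups are removed.
fixed <- by_book_chapter %>%
  ungroup() %>%
  mutate(book = tolower(book))
```

`group_vars()` is a quick way to check what grouping, if any, is left on a table at any point in a pipeline.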

syvwlch commented 2 years ago

Similarly, it should not be necessary to regroup after a summarize() if the grouping column is still there, but I will test to confirm.

syvwlch commented 2 years ago

Ok, tried it with the post I'm currently working on and yeah, every time, the summarize step dropped the groups, so there is no need to ungroup.

On the other hand, this means that it is necessary to group again after a summarize if a later step needs groups.
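
A small sketch of why the second group_by() matters when the summarize step leaves no groups behind. It assumes dplyr 1.0 or later and forces the no-groups outcome with `.groups = "drop"`; the `words` table and its columns are invented for illustration.

```r
library(dplyr)

# Toy data, invented for this example: several rows per chapter.
words <- tibble(
  book    = c("A", "A", "A", "B", "B"),
  chapter = c(1, 1, 2, 1, 2),
  n_words = c(40, 60, 300, 150, 50)
)

# Summarize to one row per (book, chapter) and leave no groups behind.
chapter_totals <- words %>%
  group_by(book, chapter) %>%
  summarize(total = sum(n_words), .groups = "drop")

# Without regrouping, sum(total) runs over the whole table,
# so each share is a fraction of all books combined.
overall <- chapter_totals %>%
  mutate(share = total / sum(total))

# With a second group_by(), sum(total) runs within each book,
# so each share is a fraction of that book only.
per_book <- chapter_totals %>%
  group_by(book) %>%
  mutate(share = total / sum(total)) %>%
  ungroup()

per_book$share   # 0.25 0.75 0.75 0.25
```

Both pipelines run without error, which is why a missing group_by() is easy to overlook: the only symptom is that the numbers mean something different.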

syvwlch commented 2 years ago

LOL

I'm slow sometimes. The reason my code was always dropping all the groups was that I was telling it to. Funny how that works.

The current version of summarize() prints a message to the console when you don't specify a .groups argument. This prompted me to add a .groups = "drop" argument which, funnily enough, tells it to drop all grouping levels. The argument is relatively new, so a lot of the resources I was using did not mention it.
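
For comparison, a sketch of three of the values `.groups` accepts, assuming dplyr 1.0 or later. The `words` table is made up for illustration.

```r
library(dplyr)

# Toy data, invented for this example.
words <- tibble(
  book    = c("A", "A", "B", "B"),
  chapter = c(1, 2, 1, 2),
  n_words = c(100, 200, 150, 250)
)

grouped <- words %>% group_by(book, chapter)

# "drop_last": remove only the last grouping level (chapter).
drop_last <- grouped %>%
  summarize(total = sum(n_words), .groups = "drop_last")

# "drop": remove every grouping level.
dropped <- grouped %>%
  summarize(total = sum(n_words), .groups = "drop")

# "keep": leave the grouping exactly as it was.
kept <- grouped %>%
  summarize(total = sum(n_words), .groups = "keep")

group_vars(drop_last)   # "book"
group_vars(dropped)     # character(0)
group_vars(kept)        # "book" "chapter"
```

Spelling out `.groups` also silences the console message, since the choice is no longer implicit.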

TL;DR: it is best practice to drop groups when done with them, and the current way to do this is via the .groups argument, which is what I've been doing. Yay for console messages, boo for old training resources.