tidyverts / tsibble

Tidy Temporal Data Frames and Tools
https://tsibble.tidyverts.org
GNU General Public License v3.0

dplyr::summarise leaves grouped_ts when dplyr::group_by'ing over multiple variables #274

Open psarka opened 2 years ago

psarka commented 2 years ago

Consider the example from the tsibble index_by reference and its modified version:

tourism %>%
  index_by(Year = ~ year(.)) %>%
  group_by(Region, State) %>%
  summarise(Total = sum(Trips))
# A tsibble: 1,520 x 4 [1Y]
# Key:       Region, State [76]
# Groups:    Region [76]
   Region   State            Year Total
   <chr>    <chr>           <dbl> <dbl>
 1 Adelaide South Australia  1998 2226.
 2 Adelaide South Australia  1999 2218.
 3 Adelaide South Australia  2000 2418.

tourism %>%
  index_by(Year = ~ year(.)) %>%
  group_by(Region) %>%
  summarise(Total = sum(Trips))
# A tsibble: 1,520 x 3 [1Y]
# Key:       Region [76]
   Region    Year Total
   <chr>    <dbl> <dbl>
 1 Adelaide  1998 2226.
 2 Adelaide  1999 2218.
 3 Adelaide  2000 2418.

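In the first case, the result is still a grouped_ts (note the "# Groups: Region [76]" line in the output).

In the second case, the result is a plain tbl_ts with no groups.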

This is not very intuitive, as summarise seems to be "dropping" 2 (?) grouping variables. I expected it to either drop only Year, resulting in a grouped_ts in both cases, or to "drop" all variables, resulting in a tbl_ts in both cases. I'm not sure if this is a bug, but even if it isn't, I would have liked it to be mentioned somewhere (together with dplyr::ungroup, which I guess I need to use in the first case).
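
A minimal sketch of the ungroup workaround mentioned above, assuming the tsibble, dplyr and lubridate packages are installed. The variable names by_two and flat are made up for illustration, and whether the intermediate result is still grouped may depend on the installed tsibble version, so the sketch inspects the class instead of assuming it:

```r
library(tsibble)    # provides the tourism data, index_by() and the tbl_ts class
library(dplyr)      # provides group_by(), summarise(), ungroup(), group_vars()
library(lubridate)  # provides year()

# Same pipeline as the first case in this report
by_two <- tourism %>%
  index_by(Year = ~ year(.)) %>%
  group_by(Region, State) %>%
  summarise(Total = sum(Trips))

# Inspect what summarise left behind (reported here: still grouped by Region)
print(group_vars(by_two))
print(inherits(by_two, "grouped_ts"))

# Explicitly drop any leftover grouping so later verbs see an ungrouped tsibble
flat <- ungroup(by_two)
print(class(flat))
```

Calling ungroup() is harmless when the result is already ungrouped, so it can be appended after summarise in both cases to get the same class back regardless of how many variables were passed to group_by.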