SebKrantz / collapse

Advanced and Fast Data Transformation in R
https://sebkrantz.github.io/collapse/

Should `collapse::fgroup_by` output a 'grouped_df'? #645

Open NicChr opened 1 month ago

NicChr commented 1 month ago

Hello, I noticed that the class of an object created by fgroup_by() includes "grouped_df". This causes issues for users who have almost any tidyverse package loaded, because many dplyr functions have methods for this class that don't work well with "GRP_df" objects.

It might be safer to remove the "grouped_df" subclass, making the class c("GRP_df", "data.frame"), though I'm not sure how much work that would entail for functions that depend on "GRP_df".

Please see the example below.

```r
library(dplyr)
#> Warning: package 'dplyr' was built under R version 4.4.1
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(collapse)
#> collapse 2.0.16, see ?`collapse-package` or ?`collapse-documentation`
#> 
#> Attaching package: 'collapse'
#> The following object is masked from 'package:stats':
#> 
#>     D
df <- fgroup_by(iris, Species)

names(df) <- names(iris)
#> Error in `group_data()`:
#> ! `.data` must be a valid <grouped_df> object.
#> Caused by error in `validate_grouped_df()`:
#> ! The `groups` attribute must be a data frame.
ss(df, 1:5)
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1          5.1         3.5          1.4         0.2  setosa
#> 2          4.9         3.0          1.4         0.2  setosa
#> 3          4.7         3.2          1.3         0.2  setosa
#> 4          4.6         3.1          1.5         0.2  setosa
#> 5          5.0         3.6          1.4         0.2  setosa
#> 
#> Grouped by:  Species  [3 | 50 (0)]
```

Created on 2024-10-19 with reprex v2.1.1

SebKrantz commented 1 month ago

Thanks, but this is by design, allowing collapse to also handle grouped data frames created with dplyr. I wasn't aware of the names issue and have never encountered it myself. In that case you may have to use `attr(iris, "names") <- ...`.
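For anyone landing here, a minimal sketch of that workaround applied to the grouped object from the example above. It assumes that setting the attribute through `attr<-` does not go through S3 dispatch on `names<-`, so dplyr's method for "grouped_df" should not be reached. The replacement names in `new_names` are hypothetical and chosen only for illustration; the grouping column keeps its name so the stored grouping information still matches.

```r
library(collapse)

# Grouped data frame as in the reprex above
df <- fgroup_by(iris, Species)

# Hypothetical new column names (illustration only);
# the grouping column "Species" is deliberately left unchanged
new_names <- c("sepal_len", "sepal_wid", "petal_len", "petal_wid", "Species")

# Set the names attribute directly instead of calling names(df) <- ...
attr(df, "names") <- new_names

names(df)
inherits(df, "GRP_df")
```

This only sidesteps the `names<-` error; other dplyr methods for "grouped_df" may still not work well with objects created by `fgroup_by()`.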