thomasp85 / tidygraph

A tidy API for graph manipulation
https://tidygraph.data-imaginist.com

.by surprisingly doesn't work in mutate() #194

Open hughjonesd opened 5 months ago

hughjonesd commented 5 months ago

In dplyr 1.1.0 and later, the following groups the data by Species for the duration of the mutate() call:

library(dplyr)
iris |> mutate(.by = Species, n_in_species = n())
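For comparison, here is a minimal sketch of what .by is expected to do on a plain data frame: it should give the same per-group counts as an explicit group_by() followed by ungroup(). It assumes dplyr 1.1.0 or later is installed; the variable names by_arg and grouped are only for illustration.

```r
library(dplyr)

# Per-call grouping via .by (dplyr >= 1.1.0)
by_arg <- iris |> mutate(n_in_species = n(), .by = Species)

# The older, persistent-grouping spelling of the same operation
grouped <- iris |>
  group_by(Species) |>
  mutate(n_in_species = n()) |>
  ungroup()

# The per-row group sizes agree between the two approaches
all.equal(by_arg$n_in_species, grouped$n_in_species)
```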

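I expected the same to work in tidygraph, but in fact it behaves like the pre-1.1.0 interface: .by = AM is treated as an ordinary name-value pair, so it creates a new column called .by and no grouping happens: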

library(tidygraph)
data(whigs, package = "ggraph")
whigs <- as_tbl_graph(whigs) |> activate(nodes) |> mutate(AM = grepl("^[A-M]", name))

whigs |> mutate(.by = AM,
  n_in_group = n()
)
...
   type  name               AM    .by   n_in_group
   <lgl> <chr>              <lgl> <lgl>      <int>
 1 FALSE John Adams         TRUE  TRUE         261
 2 FALSE Samuel Adams       FALSE FALSE        261

where the numbers should be more like:

whigs |> 
  group_by(AM) |> 
  mutate(.by = AM,
    n_in_group = n()
  ) |> 
  ungroup()
...
   type  name               AM    .by   n_in_group
   <lgl> <chr>              <lgl> <lgl>      <int>
 1 FALSE John Adams         TRUE  TRUE         152
 2 FALSE Samuel Adams       FALSE FALSE        109

This is a silent failure. Until tidygraph supports dplyr's .by, it might be worth just throwing a warning when .by is passed to one of its dplyr verbs.
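As a rough user-side sketch of that suggestion, a wrapper could check the argument names before delegating to mutate(). The helper name mutate_warn_by is hypothetical and is not part of tidygraph or dplyr; the sketch assumes R 4.1 or later (for ...names() and |>) and only covers mutate(), not the other verbs.

```r
library(dplyr)
library(tidygraph)

# Hypothetical helper (not part of tidygraph): warn when `.by` is passed
# for a tbl_graph, then delegate to mutate() with the arguments unchanged.
mutate_warn_by <- function(.data, ...) {
  if (inherits(.data, "tbl_graph") && ".by" %in% ...names()) {
    warning(
      "`.by` was passed to mutate() on a tbl_graph; ",
      "it may be treated as a new column rather than a grouping. ",
      "Consider group_by() instead.",
      call. = FALSE
    )
  }
  mutate(.data, ...)
}

# Toy graph for illustration only: a 4-node ring with a grouping column
g <- create_ring(4) |> mutate(grp = c("a", "a", "b", "b"))

# Emits the warning, then returns whatever mutate() itself returns
res <- suppressWarnings(mutate_warn_by(g, .by = grp, n_in_group = n()))
```

The wrapper does not change what mutate() computes. It only makes the situation visible, so grouped results still need an explicit group_by() and ungroup().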