Mutating grouped variables using the .by argument returns a data frame that is not exactly identical to the one produced by the group_by() |> mutate() |> ungroup() pattern. This matters for tests such as testthat::expect_known_hash(), which returns a different hash for the two patterns.
Specifically, using .by seems to return a slightly malformed data frame (its attributes are in a different order), which is corrected when the result is run through tibble(). In the attributes() output, $class appears second or third when it should be the first attribute.
library(tidyverse)
# dplyr 1.1.4
iris_mutated_with_by <- iris |>
mutate(.by = Species)
iris_mutated <- iris |>
group_by(Species) |>
mutate() |>
ungroup()
# FALSE, but should be true.
identical(iris_mutated, iris_mutated_with_by, attrib.as.set = FALSE)
# These return different hashes as a result.
iris_mutated |> testthat::expect_known_hash("679a41dc2c")
iris_mutated_with_by |> testthat::expect_known_hash("d3c5d07100")
# Running the .by mutated dataframe through tibble() cleans things up:
iris_mutated_with_by |> tibble() |> testthat::expect_known_hash("679a41dc2c")
I would argue that expect_known_hash() is the wrong tool to use here. There is no guarantee on attribute ordering, only that the attributes exist, and I don't think you should rely on them being in a specific order.
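To illustrate the point about attribute order, here is a minimal base R sketch. It does not go through dplyr at all: the reordering is done by hand on a made-up data frame (`df` and `reordered` are hypothetical names), on the assumption that this mimics the kind of difference described above. It compares the two objects once with attributes treated as an unordered set, which is the default for identical(), and once with attribute order taken into account.

```r
# Base R only; the attribute reordering is done by hand, not by dplyr.
df <- data.frame(x = 1:3, g = c("a", "b", "a"))

# Copy df and reassign the same attributes in reverse order.
reordered <- df
attributes(reordered) <- rev(attributes(df))

# Default comparison: attributes are treated as an unordered set.
same_as_set <- identical(df, reordered)

# Strict comparison: attributes must also appear in the same order.
same_in_order <- identical(df, reordered, attrib.as.set = FALSE)

print(c(as_set = same_as_set, in_order = same_in_order))
```

If the two results differ, the objects hold the same attributes in a different order, which is the kind of difference a hash of the serialized object would pick up while an ordinary equality check would not.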