Open github-actions[bot] opened 4 years ago
Actually, it is only by accident that check.names does anything in a transform() call - it is not a documented argument, so only "works" because transform() internally calls data.frame().
This in turn means that certain variable names are not usable (check.names, fix.empty.names, etc.) because they have a special meaning.
There is an internal split between "new" and "old" variables, specifically
if (any(matched)) {
`_data`[inx[matched]] <- e[matched]
`_data` <- data.frame(`_data`)
}
and the call to data.frame() here is not going to see check.names= so will clean up names.
This is very old code and data frame semantics have changed considerably in the meantime. In particular, I don't know whether the call to data.frame is still needed. The code suggests that the previous assignment might have turned _data into something that wasn't a data frame. Similarly, the split between matched and non-matched computed values might not be required anymore, since it now actually works to do things like
airquality[c("Ozone", "foo", "bar")] <- list(Ozone=log(airquality$Ozone), foo=1, bar=2)
with recycling and all.
It might be worth rewriting transform() at some point, but it is an old tool, mainly used interactively, and you are probably better off using within():
names(within(df, {B=B+5; `C-3`=21:22}))
[1] "A-1" "B" "C-3"
NA
Yes, within() is nicer about names than transform(). We noticed this issue because at least one CRAN package relied on passing check.names=FALSE to transform:
./rapportools/R/tables.R: tbl <- transform(tbl, %
= N / base::sum(N) * 100, check.names = FALSE) # add percentage
./rapportools/R/tables.R: tbl <- transform(tbl, Cumul. N
= cumsum(N), Cumul. %
= cumsum(%
), check.names = FALSE) # add cumulatives
NA
I don't think the effect of check.names=FALSE should depend on whether transform() replaces a column in or just adds columns to the data.frame.
df <- data.frame(check.names=FALSE,
A-1
=11:12, B=21:22) names(df)[1] "A-1" "B"
names(transform(df, check.names=FALSE, B=B+5,
C-3
=21:22))[1] "A.1" "B" "C-3"
names(transform(df, check.names=FALSE,
C-3
=21:22))[1] "A-1" "B" "C-3"
R version 4.0.2 (2020-06-22) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows 10 x64 (build 16299)
Matrix products: default
locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C LC_TIME=English_United States.1252
attached base packages: [1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached): [1] compiler_4.0.2
METADATA