nathaneastwood / poorman

A poor man's dependency free grammar of data manipulation
https://nathaneastwood.github.io/poorman/

Passing expressions as columns to `pivot_longer` fails with error. #118

Open barryrowlingson opened 1 year ago

barryrowlingson commented 1 year ago

Describe the bug

Running pivot_longer with the column names supplied as a character vector, e.g. c("Y1","Y2"), works fine, but it fails when the same vector is produced by an expression written inside the call.

To Reproduce

Sample data:

d = data.frame(X=1:5, Y1=runif(5), Y2=runif(5))

I want to pivot longer on all vars except the first one. The names are therefore:

names(d)[-1]
[1] "Y1" "Y2"
nd = names(d)[-1]

and this works

pivot_longer(d, nd)
   X name     value
1  1   Y1 0.9821147
2  1   Y2 0.8756433
3  2   Y1 0.9908365

but this doesn't:

> pivot_longer(d, names(d)[-1])
Error in (function (.data, ..., .group_pos = FALSE)  : 
  Locations Y1 and Y2 don't exist. There are only 3 columns.

I could simply do pivot_longer(d, -1), but there may be other contexts where an expression gets passed to cols, e.g. a function that returns the columns:

> getY = function(){c("Y1","Y2")}
> pivot_longer(d, getY())
Error in (function (.data, ..., .group_pos = FALSE)  : 
  Locations Y1 and Y2 don't exist. There are only 4 columns.

Expected behavior

The two pivot_longer calls should return the same data frame.


nathaneastwood commented 1 year ago

@etiennebacher is this an issue in the new version?

etiennebacher commented 1 year ago

This is not specific to pivot_longer(); it comes from eval_select_pos(data, substitute(cols)). Here's an example showing the same error with rename_with():

suppressPackageStartupMessages(library(poorman))

d = data.frame(X=1:5, Y1=runif(5), Y2=runif(5))

nd = names(d)[-1]

rename_with(d, toupper, nd)
#>   X         Y1        Y2
#> 1 1 0.89690835 0.7336885
#> 2 2 0.60126522 0.8331964
#> 3 3 0.03632217 0.9804623
#> 4 4 0.09652256 0.2674726
#> 5 5 0.51158470 0.5940056

rename_with(d, toupper, names(d)[-1])
#> Error in (function (.data, ..., .group_pos = FALSE) : Locations Y1 and Y2 don't exist. There are only 3 columns.

Created on 2022-11-02 with reprex v2.0.2
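
For anyone following along, here is a minimal base-R sketch of why the two calls can end up on different code paths. It only shows what substitute() captures; `capture_cols` is a hypothetical helper written for illustration and is not poorman's actual implementation.

```r
# Hypothetical helper (not part of poorman): report what substitute() sees
# for the `cols` argument, without evaluating it.
capture_cols <- function(data, cols) {
  expr <- substitute(cols)
  list(expr = expr, type = class(expr))
}

d <- data.frame(X = 1:5, Y1 = runif(5), Y2 = runif(5))
nd <- names(d)[-1]

# A bare variable is captured as a symbol ("name") ...
as_symbol <- capture_cols(d, nd)$type

# ... while an inline expression is captured as an unevaluated "call".
as_call <- capture_cols(d, names(d)[-1])$type

print(as_symbol)
print(as_call)
```

So a selection function that dispatches on the type of the captured expression may handle `nd` and `names(d)[-1]` differently, even though both evaluate to the same character vector.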

etiennebacher commented 1 year ago

Can be fixed by adding

  if (is.character(pos)) {
    pos <- which(data_names %in% pos)
  }

just after this line: https://github.com/nathaneastwood/poorman/blob/e910fd7c67f309d19003278d815aadff785007e9/R/select_positions.R#L38
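
A small standalone sketch of what that conversion does, assuming `pos` holds the evaluated selection and `data_names` holds the column names of the data. The wrapper `as_positions` is hypothetical and only exists to make the snippet runnable on its own.

```r
# Hypothetical wrapper (not part of poorman) around the suggested check:
# turn a character selection into integer column positions.
as_positions <- function(pos, data_names) {
  if (is.character(pos)) {
    pos <- which(data_names %in% pos)
  }
  pos
}

data_names <- c("X", "Y1", "Y2")

# Character input is mapped to positions in the data.
from_names <- as_positions(c("Y1", "Y2"), data_names)

# Non-character input (e.g. a negative index) passes through unchanged.
from_index <- as_positions(-1, data_names)

print(from_names)
print(from_index)
```

Note that which(data_names %in% pos) returns positions in the order of the data's columns, not in the order the names were requested; match(pos, data_names) would keep the requested order but gives NA for names that don't exist.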

nathaneastwood commented 1 year ago

Looks like an env issue with eval_select_pos() then. I probably need to capture the expression and evaluate it in the correct environment. Possibly here:

https://github.com/nathaneastwood/poorman/blob/master/R/select_positions.R#L183-L185

etiennebacher commented 1 year ago

Maybe related:

suppressPackageStartupMessages(library(poorman))
iris <- head(iris, n = 1)

# Works
for (i in names(iris)) {
  print(select(iris, all_of(i)))
}
#>   Sepal.Length
#> 1          5.1
#>   Sepal.Width
#> 1         3.5
#>   Petal.Length
#> 1          1.4
#>   Petal.Width
#> 1         0.2
#>   Species
#> 1  setosa

# Doesn't work
lapply(names(iris), function(x) {
  select(iris, all_of(x))
})
#> Error in x %in% vars: object 'x' not found

Created on 2022-11-21 with reprex v2.0.2
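
If this is the same kind of environment problem, the failure can be reproduced in base R without poorman. The sketch below is only an illustration under that assumption: `pick_fixed_env` and `pick_caller_env` are hypothetical helpers, and the fixed environment (baseenv()) is chosen just to make the lookup fail in the same way.

```r
# Hypothetical helper: evaluates the captured expression in a fixed
# environment, so variables local to the caller are not visible.
pick_fixed_env <- function(data, cols) {
  expr <- substitute(cols)
  data[, eval(expr, envir = baseenv()), drop = FALSE]
}

# Hypothetical helper: evaluates the captured expression in the frame
# that called the function, where `x` from lapply() is defined.
pick_caller_env <- function(data, cols) {
  expr <- substitute(cols)
  env <- parent.frame()
  data[, eval(expr, envir = env), drop = FALSE]
}

df <- head(iris, n = 1)

# Works: `x` is found in the anonymous function's frame.
res_ok <- lapply(names(df), function(x) pick_caller_env(df, x))

# Fails: `x` does not exist in the fixed environment.
res_err <- tryCatch(
  lapply(names(df), function(x) pick_fixed_env(df, x)),
  error = function(e) conditionMessage(e)
)

print(names(res_ok[[1]]))
print(res_err)
```

The for loop version works in both cases when run at top level only because `i` is then a global variable, which would be consistent with the behaviour reported above.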