mislav0207 closed this issue 7 years ago
Please create a reprex using the reprex package, as described in the issue template.
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
df <- structure(list(subjecttaxnoid = c("22661187010", "10346575807",
"22439110996", "63510438612", "85267957976", "40178118040", "51246665873",
"66803849969", "45813719599", "26979059418", "11240408751"),
reportyear = c(2014L, 2014L, 2014L, 2008L, 2008L, 2008L,
2008L, 2013L, 2013L, 2013L, 2013L), b001 = c(0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0), b002 = c(0, 3.43884233571018e-07, 7.24705810574303e-08,
1.41222784374111e-07, 1.62917712565032e-05, 0, 4.53310814208705e-07,
7.63856039195011e-06, 0, 0, 0)), .Names = c("subjecttaxnoid",
"reportyear", "b001", "b002"), row.names = c(1L, 2L, 3L, 200000L,
200001L, 200002L, 200003L, 40000L, 40001L, 40002L, 40003L), class = "data.frame")
x <- c("b001", "b002")
my_list <- list()
for (i in seq_along(x)){
  my_list[[i]] <- df %>% group_by(reportyear) %>% top_n(2, wt = x[i])
}
#> Error in eval(substitute(expr), envir, enclos): Unsupported use of matrix or array for column indexing
I suppose this is what you mean by "creating a reprex". Sorry, I have never done this before :)
That's a great first step. The next step is to make the reprex as small as possible so I can understand it more easily. For example, you could make the data frame simpler, and create it with data.frame()
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
df <- data.frame(reportyear = c(2014L, 2014L, 2014L, 2013L, 2013L, 2013L),
b001 = c(10:15),
b002 = c(1:6))
x <- c("b001", "b002")
my_list <- list()
for (i in seq_along(x)){
  my_list[[i]] <- df %>% group_by(reportyear) %>% top_n(2, wt = x[i])
}
#> Error in eval(substitute(expr), envir, enclos): Unsupported use of matrix or array for column indexing
Better?
If I understand correctly, @mislav0207 wants to pass top_n a column name as a character string, but top_n currently expects a bare column name. I can work around the issue by creating a temporary column, but there must be a more elegant solution. Here is the workaround, with lapply in place of the for loop.
my_list <- lapply(x, function(col) {
df$tempcol <- df[[col]]
df %>% group_by(reportyear) %>% top_n(2, wt = tempcol) %>% select(-tempcol)
})
With the output:
> my_list
[[1]]
Source: local data frame [4 x 3]
Groups: reportyear [2]
reportyear b001 b002
<int> <int> <int>
1 2014 11 2
2 2014 12 3
3 2013 14 5
4 2013 15 6
[[2]]
Source: local data frame [4 x 3]
Groups: reportyear [2]
reportyear b001 b002
<int> <int> <int>
1 2014 11 2
2 2014 12 3
3 2013 14 5
4 2013 15 6
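To make the bare-name versus string distinction concrete, here is a minimal sketch that runs the bare-name call next to the temporary-column workaround for a single column. It reuses the small df from the reprex; the names by_bare, by_temp, and tmp are illustrative only and are not from the thread.

```r
library(dplyr)

df <- data.frame(reportyear = c(2014L, 2014L, 2014L, 2013L, 2013L, 2013L),
                 b001 = 10:15,
                 b002 = 1:6)

# Bare column name: top_n() looks up b001 inside each group of df.
by_bare <- df %>% group_by(reportyear) %>% top_n(2, wt = b001)

# Temporary-column workaround for a name held in a string
# (tmp and tempcol are hypothetical helper names).
col <- "b001"
tmp <- df
tmp$tempcol <- tmp[[col]]
by_temp <- tmp %>%
  group_by(reportyear) %>%
  top_n(2, wt = tempcol) %>%
  select(-tempcol)

# Compare the selected values of the ranking column.
all(sort(by_bare$b001) == sort(by_temp$b001))
```

Both calls should pick the same rows, since the temporary column is just a copy of the column named by the string.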
@lionel-: Is there a nicer way to do this in the new tidyeval framework?
Yes, I've ported all these functions to tidyeval in a branch that is yet to be pushed.
You'll do it like this:
my_list <- list()
for (i in seq_along(x)){
  my_list[[i]] <- df %>% group_by(reportyear) %>% top_n(2, wt = !! sym(x[i]))
}
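As a self-contained sketch of the same idea, the loop can also be written with lapply. This assumes a dplyr release with tidyeval support (0.7.0 or later) and calls sym() through the rlang namespace explicitly; the naming of the list elements is an addition for illustration, not part of the answer above.

```r
library(dplyr)

df <- data.frame(reportyear = c(2014L, 2014L, 2014L, 2013L, 2013L, 2013L),
                 b001 = 10:15,
                 b002 = 1:6)
x <- c("b001", "b002")

# For each column name in x, turn the string into a symbol with sym()
# and unquote it with !! so top_n() sees it as a bare column name.
my_list <- lapply(x, function(col) {
  df %>%
    group_by(reportyear) %>%
    top_n(2, wt = !!rlang::sym(col))
})

# Label each result with the column it was ranked by (illustrative).
names(my_list) <- x
```

Each element of my_list then holds the top two rows per reportyear, ranked by the corresponding column, without a temporary column.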
Let's say I have a data frame like this:
and a vector that contains the names of two columns of df:
x <- c("b001", "b002")
I would like to use the elements of x instead of column names in dplyr:
This returns an error:
Could you please help with this issue?