Closed dcaldwellphd closed 1 year ago
Hi David,
It looks like a programming error where your code is iterating over columns of the data frame (whose name is ‘data’) rather than iterating over nested data frames. So I think the error message arises because you’re asking it to turn a single column of the data frame into a survey design object. Hope that helps!
Ben
Thanks for this, Ben.
I'm not sure that is the issue. I get the same error if I give the column containing nested data frames a different key, like this:
...
nested_distr <- input |>
nest_by(across(any_of(by)), .key = "nested_data") |>
mutate(
design_list = map(
nested_data,
as_survey_design,
ids = {{ ids }},
probs = {{ probs }},
strata = {{ strata }},
fpc = {{ fpc }},
weights = {{ weights }},
nest = nest
)
)
...
Or have I completely missed your point?
Thanks again for your help with this!
David
You might find it helpful to use the debug()
function, which will show that in the code above, map()
is operating over columns of a dataframe. So it's trying to use as_survey_design(.data = x)
, where x
is a numeric vector rather than a data frame.
Here's a helpful video on how to use debug()
within RStudio.
https://www.youtube.com/watch?v=QQxTf3o07NU&ab_channel=FreddyDrennan
Here's what I see when I use debug(as_survey_design)
and then try to run the code you provided:
debug(as_survey_design)
out <- toydata |>
myfunc(
value = att_val,
by = "att_name",
weights = weight
)
# debugging in: .f(.x[[i]], ...)
# debug: {
# UseMethod("as_survey_design")
# }
# Browse[2]> class(.data)
# "numeric"
# Browse[2]> head(.data)
# [1] 3 NA 6 5 3 10
This shows us that map()
is iterating over columns of a data frame, and using a given column for the .data
argument of as_survey_design()
, which is why as_survey_design()
is throwing that error message: it's being given a numeric vector instead of a data frame.
I suspect the confusion arises because nest_by()
returns a rowwise data frame, which is supposed to automatically vectorize functions over rows of the data frame (see: https://dplyr.tidyverse.org/articles/rowwise.html). You can either use the special vectorization caused by working with a rowwise data frame, or you can call ungroup()
on your data frame (i.e., convert it from a rowwise data frame to a regular data frame) and then use map()
to iterate over rows of your data frame.
library(srvyr)
library(dplyr)
library(purrr)
toydata <- tibble::tibble(
att5val = sample(c(seq(1, 5, 1), NA), 1000, replace = TRUE),
att10val = sample(c(seq(1, 10, 1), NA), 1000, replace = TRUE),
weight = runif(1000, min = 0.1, max = 2),
) |>
tidyr::pivot_longer(
cols = c(contains("att")),
names_to = "att_name",
values_to = "att_val"
)
nested_df <- toydata |>
nest_by(across(any_of(c("att_name"))),
.key = "nested_data")
print(nested_df)
#> # A tibble: 2 × 2
#> # Rowwise: att_name
#> att_name nested_data
#> <chr> <list<tibble[,2]>>
#> 1 att10val [1,000 × 2]
#> 2 att5val [1,000 × 2]
# Use rowwise operations
nested_designs <- nested_df |>
mutate(
design_object = as_survey_design(
.data = nested_data,
ids = 1,
probs = NULL,
strata = NULL,
fpc = NULL,
weights = 1
) |> list()
) |>
ungroup()
print(nested_designs)
#> # A tibble: 2 × 3
#> att_name nested_data design_object
#> <chr> <list<tibble[,2]>> <list>
#> 1 att10val [1,000 × 2] <tbl_svy[,2]>
#> 2 att5val [1,000 × 2] <tbl_svy[,2]>
# Or explicitly ungroup and then use map()
nested_df |>
ungroup() |>
mutate(
design_object = map(
.x = nested_data,
.f = function(df) {
as_survey_design(
.data = df,
ids = 1
)
}
)
)
#> # A tibble: 2 × 3
#> att_name nested_data design_object
#> <chr> <list<tibble[,2]>> <list>
#> 1 att10val [1,000 × 2] <tbl_svy[,2]>
#> 2 att5val [1,000 × 2] <tbl_svy[,2]>
Created on 2023-08-21 with reprex v2.0.2
Thanks for taking the time to explain this to me! That's all really helpful and solves the issue that I was having.
Hi -
Thanks for all your work on this really helpful package!
I am having some difficulty iterating as_survey_design over nested data in a function that I am writing. The following code throws an error saying
no applicable method for 'as_survey_design' applied to an object of class "c('double', 'numeric')"
. Do you have any idea what the problem is?