Is it possible to generate values in a multi-select environment? This could apply to survey research (select all that apply), or graphs (edges between nodes of a certain type).
Below is a hacky function that demonstrates this use case with hard-coded values:
## helper function: obviously not a production-quality function
build_multi = function(ids) {
  df_data = data.frame()
  ## seq_along() is safe when ids is empty (1:length(ids) would yield c(1, 0))
  for (i in seq_along(ids)) {
    ## randomize how many choices are made (1 or 2)
    n_obs = sample(x = 1:2, size = 1, prob = c(.75, .25))
    ## what are the choices available in the multi-select
    cvals = c("BIZ", "ARTS", "SCIENCE", "HEALTH", "OTHER")
    ## draw without replacement so a respondent cannot pick the same option twice
    vals = sample(x = cvals,
                  replace = FALSE,
                  prob = c(.4, .2, .15, .2, .05),
                  size = n_obs)
    ## one row per (id, selected value) pair
    tmp_df = data.frame(id = ids[i],
                        values = vals)
    df_data = dplyr::bind_rows(df_data, tmp_df)
  }
  return(df_data)
}
## generate a set of ids (r_data_frame() and id() come from the wakefield package)
library(wakefield)
my_df = r_data_frame(id = id, n = 100)
Then generate the data:
## return a long dataset of multiselect options for given probabilities (which I hardcoded)
my_long = build_multi(my_df$id)
The code above generates the dataset in the structure I need, but I wasn't sure whether something like this already exists in the current package.
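For reference, here is a rough sketch of a more general version of the same idea, with the choices and probabilities passed as arguments instead of hard-coded. The function name `build_multi_select` and its arguments are hypothetical and not part of any package; it uses base R only, and the ids are made up with `sprintf()` so the example runs on its own.

```r
## Hypothetical helper (not from any package): one row per (id, selected value).
build_multi_select <- function(ids, choices, choice_prob,
                               n_select = 1:2, n_prob = c(.75, .25)) {
  stopifnot(length(choices) == length(choice_prob),
            length(n_select) == length(n_prob),
            max(n_select) <= length(choices))
  ## draw how many options each id selects, all at once
  idx <- sample.int(length(n_select), size = length(ids),
                    replace = TRUE, prob = n_prob)
  n_obs <- n_select[idx]
  ## for each id, draw that many distinct options
  picks <- lapply(n_obs, function(k) {
    sample(choices, size = k, replace = FALSE, prob = choice_prob)
  })
  ## repeat each id once per selected option to get the long format
  data.frame(id = rep(ids, times = n_obs),
             values = unlist(picks),
             stringsAsFactors = FALSE)
}

set.seed(42)
ids <- sprintf("%03d", 1:100)  # made-up ids for illustration
my_long <- build_multi_select(
  ids,
  choices = c("BIZ", "ARTS", "SCIENCE", "HEALTH", "OTHER"),
  choice_prob = c(.4, .2, .15, .2, .05)
)
head(my_long)
```

This returns the same long structure as the function above, but it avoids growing a data frame inside a loop and lets the caller change the option set or the selection counts.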