Closed andrewpbray closed 1 year ago
Hey Andrew!
Note that integers are numerics:
is.numeric(1L)
#> [1] TRUE
so this isn’t actually an issue with integers vs floats.
It looks like the error you saw here is:
Error:
weight_by
must be ‘numeric vector with lengthnrow(.data)
= 5’, not ‘closure’.
This is because rep_slice_sample
doesn’t support tidyselect, so needs the actual weights vector df$id
rather than the tidyselect spec id
. Passing the actual weights works fine:
library(tidyverse)
library(infer)
df <- tibble(id = 1:5,
letter = factor(c("a", "b", "c", "d", "e")))
rep_slice_sample(df, n = 2, reps = 5,
weight_by = df$id)
#> # A tibble: 10 × 3
#> # Groups: replicate [5]
#> replicate id letter
#> <int> <int> <fct>
#> 1 1 4 d
#> 2 1 5 e
#> 3 2 5 e
#> 4 2 4 d
#> 5 3 3 c
#> 6 3 5 e
#> 7 4 5 e
#> 8 4 1 a
#> 9 5 4 d
#> 10 5 5 e
If we feel it's in scope for the package to support tidy selection here as well, I'm up for it!
Created on 2023-03-08 with reprex v2.0.2
Oh, I'm so glad you reviewed this! That's news to me that integers are numerics.
Since this function was adapted (from rep_sample_n()
) to mirror changes to dplyr
, to me makes sense to keep that parallel going in terms of tidyselect. Do you think it could just be implemented here or if we should really back propagate to any function where tidyselect could play a role? Seeing as how this is just one of the ancillary functions, maybe we could restrict the implementation of tidyselect to just this function?
It'd feel fine to me to just implement that change only to weight_by
in this function!
(Ope, I wrote tidyselect when I meant data-masking. Woopsies.😆) We already use something data-masking-esque in specify()
and now generate()
(via the variables
argument). generate()
uses a check_cols()
helper that makes this happen, so we can piggyback off of that machinery.
Sounds good to me!
Do you have time to do this? If not, I can take a swipe but it'd likely be awhile before I'd have the brain-space to reacquaint myself with the codebase.
Yup, sure thing!🥑
This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.
I'm trying my hand at teaching an intro course using purely
rep_slice_sample()
and an as yet to be writtenrep_shuffle_col()
or something of the sort. This morning I ran into an unexpected behavior ofrep_slice_sample()
, particular since it differs fromslice_sample()
. I will have time later today to look into a fix but just thought I'd put up an issue to track it.Shoutout to @mine-cetinkaya-rundel . That function that you wrote is still going strong like 8 years later!
Created on 2023-03-07 by the reprex package (v2.0.1)