Open MichaelChirico opened 1 year ago
This test is flaky:
https://github.com/topepo/caret/blob/5f4bd2069bf486ae92240979f9d65b5c138ca8d4/pkg/caret/tests/testthat/test_Dummies.R#L122-L139C3
It fails whenever some entry from 1:15 is missing from sample.int(15, size = 100, replace = TRUE, prob = rep(1 / 15, 15)).
That happens about (probably exactly? too lazy to do the math) 1.5% of the time:
mean(replicate(1e6, all(1:15 %in% sample.int(15, size = 100, replace = TRUE, prob = rep(1 / 15, 15)))))
# [1] 0.984922
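For what it's worth, the exact figure can be worked out with inclusion-exclusion instead of simulation. A minimal sketch, assuming the 100 draws are independent and uniform over 1:15 (which is what `prob = rep(1 / 15, 15)` specifies); the variable names are only illustrative:

```r
# Exact probability that 100 uniform draws from 1:15 cover all 15 values.
# Inclusion-exclusion over k, the number of values forced to be absent:
#   P(all present) = sum_{k=0}^{15} (-1)^k * choose(15, k) * (1 - k/15)^100
n_levels <- 15
n_draws <- 100
k <- 0:n_levels
p_all_present <- sum((-1)^k * choose(n_levels, k) * (1 - k / n_levels)^n_draws)

# Probability that at least one value is missing, i.e. that the test fails
p_flaky <- 1 - p_all_present
```

Here `p_all_present` should agree with the simulated 0.984922 to within Monte Carlo error, so `p_flaky` is roughly 1.5%.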
Observe:
# get an entry missing one of 1:15
repeat {
  entry <- sample.int(15, size = 100, replace = TRUE, prob = rep(1 / 15, 15))
  if (!all(1:15 %in% entry)) break
}
# now finish the test
data = data.frame(matrix(rep(as.factor(entry), 15), ncol = 15), stringsAsFactors = TRUE)
essai_dummyVars = caret::dummyVars(stats::as.formula(paste0("~ ", colnames(data), collapse = "+")), data)
exp_names_lvls <- apply(expand.grid(paste0("X",1:15), paste0(".",1:15)), 1, paste, collapse="")
res_names_lvls <- colnames(predict(essai_dummyVars, data))
all(exp_names_lvls %in% res_names_lvls)
# [1] FALSE
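One possible way to keep the input random while guaranteeing that every value in 1:15 shows up is to seed the vector with one copy of each value and only sample the rest. This is just a sketch using base R, not something taken from the caret repository, and the variable names are made up for illustration:

```r
# Hypothetical replacement for the test's input generation (an assumption,
# not the maintainers' fix): include each of 1:15 once, fill the remaining
# 85 slots with uniform random draws, then shuffle the whole vector.
n_levels <- 15
n_draws <- 100
filler <- sample.int(n_levels, size = n_draws - n_levels, replace = TRUE)
entry <- sample(c(seq_len(n_levels), filler))

# Every value is now present by construction
stopifnot(all(seq_len(n_levels) %in% entry), length(entry) == n_draws)
```

Calling `set.seed()` before the original `sample.int()` call would also make the test deterministic, but only for a seed that happens to produce all 15 values.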