Wrappers for discriminant analysis and naive Bayes models for use with the parsnip package
discrim_regularized tune error #19

royfrancis commented 3 years ago
library(tidymodels)
library(discrim)

df_split <- initial_split(iris, prop = 0.80, strata = Species)
df_train <- training(df_split)

recipe_rda <- df_train %>%
  recipe(Species ~ .) %>%
  step_zv(all_predictors())

spec_rda <- discrim_regularized(frac_common_cov = tune(), frac_identity = tune()) %>%
  set_mode("classification") %>%
  set_engine("klaR")

wf_rda <- workflow() %>%
  add_recipe(recipe_rda) %>%
  add_model(spec_rda)

grid_rda <- grid_regular(frac_common_cov(), frac_identity(), levels = 5)
#> Error: Element `id` should have unique values. Duplicates exist for item(s): 'threshold'

royfrancis commented 3 years ago

So it seems that both frac_common_cov() and frac_identity() create a column with the same name, threshold. Perhaps they shouldn't be used together?

And if I decide to tune only one parameter, I get a different error further down, at the tuning step.

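spec_rda <- discrim_regularized(frac_common_cov = tune()) %>%
  set_mode("classification") %>%
  set_engine("klaR")

wf_rda <- workflow() %>%
  add_recipe(recipe_rda) %>%
  add_model(spec_rda)

grid_rda <- grid_regular(frac_common_cov(), levels = 5)

# Resamples, defined the same way as in the later comment in this thread
df_train_cv <- vfold_cv(df_train, v = 6, repeats = 3, strata = Species)

tune_rda <- tune_grid(wf_rda, resamples = df_train_cv, grid = grid_rda)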

Error: The provided `grid` has the following parameter columns that have not been marked for tuning by `tune()`: 'threshold'.
Run `rlang::last_error()` to see where the error occurred.
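
This second error is a name mismatch: `tune_grid()` expects every grid column to be named after the id of a parameter marked with `tune()`, and that id defaults to the argument name (here `frac_common_cov`), whereas the grid built from `frac_common_cov()` has a column called `threshold`. A minimal sketch of one way around it, assuming a hand-built grid is acceptable, is to build the grid in base R with column names that match the argument names. The five evenly spaced values on [0, 1] are an assumption chosen to mirror `levels = 5`; `grid_manual` is a name made up for this illustration.

```r
# Hand-built regular grid. The column names are chosen to match the tune() ids,
# which default to the argument names of discrim_regularized().
vals <- seq(0, 1, length.out = 5)  # assumed range [0, 1], five levels

grid_manual <- expand.grid(
  frac_common_cov = vals,
  frac_identity   = vals
)

names(grid_manual)
nrow(grid_manual)
```

Such a data frame could then be passed as `grid = grid_manual` to `tune_grid()`, keeping only the columns for arguments that are actually marked with `tune()`.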
topepo commented 3 years ago

That's a bug in those parameter definitions. For example:

frac_common_cov <- function(range = c(0, 1), trans = NULL) {
  dials::new_quant_param(
    type = "double",
    range = range,
    inclusive = c(TRUE, TRUE),
    trans = trans,
    default = 0.5,
    label = c(threshold = "Fraction of the Common Covariance Matrix"), # <- should not be named threshold
    finalize = NULL
  )
}
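
For illustration only, a corrected definition might look like the sketch below. It assumes the fix is simply to name the label after the parameter itself; `frac_common_cov_fixed` is a hypothetical name used here to avoid masking the package function, and the `default` argument is left out because it is not needed to show the naming change. This is not necessarily the change that was made in the package.

```r
library(dials)

# Hypothetical corrected parameter definition: the label is named after the
# parameter instead of "threshold", so its default id no longer collides.
frac_common_cov_fixed <- function(range = c(0, 1), trans = NULL) {
  new_quant_param(
    type = "double",
    range = range,
    inclusive = c(TRUE, TRUE),
    trans = trans,
    label = c(frac_common_cov = "Fraction of the Common Covariance Matrix"),
    finalize = NULL
  )
}

# The label name is what dials falls back on as the parameter id.
names(frac_common_cov_fixed()$label)
```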

In the meantime, this should work (giving them IDs):


library(tidymodels)
library(discrim)

df_split <- initial_split(iris, prop = 0.80, strata = Species)
df_train <- training(df_split)

recipe_rda <- df_train %>%
   recipe(Species ~ .) %>%
   step_zv(all_predictors())

# Give each tuning parameter its own id
spec_rda <-
   discrim_regularized(frac_common_cov = tune("covar"),
                       frac_identity = tune("ident")) %>%
   set_mode("classification") %>%
   set_engine("klaR")

wf_rda <- workflow() %>%
   add_recipe(recipe_rda) %>%
   add_model(spec_rda)

# The list names must match the ids passed to tune()
grid_rda <-
   grid_regular(list(covar = frac_common_cov(), ident = frac_identity()),
                levels = 5)
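
The workaround relies on how dials assigns ids: unnamed parameter objects take their id from the name of their label, while a named list uses the list names instead. The sketch below shows this with dials alone. As an assumption for illustration, `dials::threshold()` stands in for the two discrim parameters, since both carried the label name `threshold` at the time of this issue; `p1`, `p2`, `collision`, and `grid_named` are names made up for the example.

```r
library(dials)

# threshold() is a stand-in: it has the same label name ("threshold") that
# both discrim parameters had when this issue was reported.
p1 <- threshold()
p2 <- threshold()

# Unnamed: both ids fall back to the label name, so they collide.
collision <- tryCatch(
  grid_regular(p1, p2, levels = 5),
  error = function(e) conditionMessage(e)
)

# Named list: the list names become the ids and the grid column names.
grid_named <- grid_regular(list(covar = p1, ident = p2), levels = 5)

names(grid_named)
dim(grid_named)
```

The column names of `grid_named` are what `tune_grid()` matches against the ids given in `tune("covar")` and `tune("ident")`.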
royfrancis commented 3 years ago

The tuning still seems to be failing.

library(tidymodels)
library(discrim)

df_split <- initial_split(iris, prop = 0.80, strata = Species)
df_train <- training(df_split)
df_train_cv <- vfold_cv(df_train, v = 6, repeats = 3, strata = Species)

recipe_rda <- df_train %>%
  recipe(Species ~ .) %>%
  step_zv(all_predictors())

spec_rda <-
  discrim_regularized(frac_common_cov = tune("covar"),
                      frac_identity = tune("ident")) %>%
  set_mode("classification") %>%
  set_engine("klaR")

wf_rda <- workflow() %>%
  add_recipe(recipe_rda) %>%
  add_model(spec_rda)

grid_rda <-
  grid_regular(list(covar = frac_common_cov(), ident = frac_identity()),
               levels = 5)

tune_rda <- tune_grid(wf_rda, resamples = df_train_cv, grid = grid_rda)