logistic regression with glmnet engine when penalty = 0 #591

Open GilHenriques opened 1 year ago

GilHenriques commented 1 year ago

The problem

When fitting a logistic regression with the glmnet engine, I encounter two issues that I believe are related. The reproducible example below demonstrates both. They arise when, as a reliability check, I set penalty = 0; the purpose of the check was to confirm that mixture has no effect when penalty = 0.

In short, the two issues, plus the glmnet behavior I compared them against, are:

  1. Even though a penalty value is explicitly provided in the model specification -- logistic_reg(penalty = 0, mixture = tune()) -- I get a "no_penalty()" error when tuning the workflow. The same error occurs for nonzero penalty values.
  2. To avoid this error, I set penalty = tune() and then include penalty = 0 in my tuning grid. The code then runs, but contrary to my expectation, mixture has an effect on accuracy and ROC AUC.
  3. When I fit a similar model directly with the glmnet package, I confirm that when lambda = 0 (no penalty), alpha (mixture) has no effect, whereas when lambda is larger than zero, alpha does have an effect. This appears inconsistent with point 2 above.
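
For reference, the expectation behind point 3 can be read off the elastic-net objective that glmnet documents for the binomial family. The notation below is added for illustration and is not part of the reprex: λ stands for `lambda` (parsnip's `penalty`), α for `alpha` (parsnip's `mixture`), N is the number of observations, y_i ∈ {0, 1} the responses, and x_i the predictor vectors.

```latex
\min_{\beta_0,\,\beta}\;
  -\frac{1}{N}\sum_{i=1}^{N}\Bigl[\, y_i\,(\beta_0 + x_i^{\top}\beta)
      - \log\bigl(1 + e^{\beta_0 + x_i^{\top}\beta}\bigr) \Bigr]
  \;+\; \lambda\Bigl[\, \tfrac{1-\alpha}{2}\,\lVert\beta\rVert_2^2
      + \alpha\,\lVert\beta\rVert_1 \Bigr]
```

With λ = 0 the bracketed penalty term is multiplied by zero, so α drops out of the objective and every value of mixture should give the same fit, up to solver tolerance.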

Reproducible example

``` r
library(tidymodels)  # attaches tibble, dplyr, tidyr, rsample, parsnip, recipes, workflows, tune

# Create an example data frame
df <- tibble(Y = sample(c(1, 0), 1000, replace = TRUE),
       X1 = rnorm(1000),
       X2 = rnorm(1000),
       X3 = rnorm(1000),
       X4 = rnorm(1000)) |> 
  mutate(Y = factor(Y))

# Initial split
splits <- initial_split(df)
train <- training(splits)
folds <- vfold_cv(train)

# Issue 1: No penalty error, even though a penalty is specified
model <- logistic_reg(penalty = 0, mixture = tune()) |> set_engine('glmnet')
rec <- recipe(Y ~ ., data = train)
wflow <- workflow() |> add_model(model) |> add_recipe(rec)

wflow |> tune_grid(folds)
#> Error in `no_penalty()`:
#> ! At least one penalty value is required for glmnet.

#> Backtrace:
#>      ▆
#>   1. ├─tune::tune_grid(wflow, folds)
#>   2. └─tune:::tune_grid.workflow(wflow, folds)
#>   3.   └─tune:::tune_grid_workflow(...)
#>   4.     └─tune:::tune_grid_loop(...)
#>   5.       └─tune (local) fn_tune_grid_loop(...)
#>   6.         └─tune:::tune_grid_loop_impl(...)
#>   7.           └─tune:::compute_grid_info(workflow, grid)
#>   8.             └─tune:::compute_grid_info_model(workflow, grid, parameters_model)
#>   9.               ├─generics::min_grid(spec, grid)
#>  10.               └─tune::min_grid.logistic_reg(spec, grid)
#>  11.                 └─tune:::no_penalty(grid, sub_nm)
#>  12.                   └─rlang::abort("At least one penalty value is required for glmnet.")

# Issue 2: If penalty = 0 in the tuning grid, mixture still has an effect
model <- logistic_reg(penalty = tune(), mixture = tune()) |> set_engine('glmnet')
rec <- recipe(Y ~ ., data = train)
wflow <- workflow() |> add_model(model) |> add_recipe(rec)

reg_grid <- expand_grid(penalty = 0, mixture = c(0.001, 0.01, 0.1, 0.25, 0.5, 0.6))

wflow |> tune_grid(folds, grid = reg_grid) |> 
  autoplot() # Parameter makes a difference even though penalty = 0

# Issue 3: When we use glmnet directly, if lambda = 0 alpha makes no difference
X <- df[1:500,-1] |> as.matrix()
Y <- df[1:500,] |> pull(Y)
fit1 <- glmnet::glmnet(X, Y, family = 'binomial', lambda = 0, alpha = 0.001)
fit2 <- glmnet::glmnet(X, Y, family = 'binomial', lambda = 0, alpha = 0.1)
fit3 <- glmnet::glmnet(X, Y, family = 'binomial', lambda = 0, alpha = 0.5)

pred1 <- predict(fit1, newx = as.matrix(df[500:1000,-1]), type = 'class') |> as.vector()
pred2 <- predict(fit2, newx = as.matrix(df[500:1000,-1]), type = 'class') |> as.vector()
pred3 <- predict(fit3, newx = as.matrix(df[500:1000,-1]), type = 'class') |> as.vector()

tibble((df[500:1000,1]), pred1, pred2, pred3) |> 
  mutate(Y = as.character(Y)) |> 
  summarize(accuracy1 = sum(pred1 == Y)/n(),
            accuracy2 = sum(pred2 == Y)/n(),
            accuracy3 = sum(pred3 == Y)/n())
#> # A tibble: 1 × 3
#>   accuracy1 accuracy2 accuracy3
#>       <dbl>     <dbl>     <dbl>
#> 1     0.507     0.507     0.507

# ... But if lambda > 0 alpha does make a difference
X <- df[1:500,-1] |> as.matrix()
Y <- df[1:500,] |> pull(Y)
fit1 <- glmnet::glmnet(X, Y, family = 'binomial', lambda = 0.1, alpha = 0.001)
fit2 <- glmnet::glmnet(X, Y, family = 'binomial', lambda = 0.1, alpha = 0.1)
fit3 <- glmnet::glmnet(X, Y, family = 'binomial', lambda = 0.1, alpha = 0.5)

pred1 <- predict(fit1, newx = as.matrix(df[500:1000,-1]), type = 'class') |> as.vector()
pred2 <- predict(fit2, newx = as.matrix(df[500:1000,-1]), type = 'class') |> as.vector()
pred3 <- predict(fit3, newx = as.matrix(df[500:1000,-1]), type = 'class') |> as.vector()

tibble((df[500:1000,1]), pred1, pred2, pred3) |> 
  mutate(Y = as.character(Y)) |> 
  summarize(accuracy1 = sum(pred1 == Y)/n(),
            accuracy2 = sum(pred2 == Y)/n(),
            accuracy3 = sum(pred3 == Y)/n())
#> # A tibble: 1 × 3
#>   accuracy1 accuracy2 accuracy3
#>       <dbl>     <dbl>     <dbl>
#> 1     0.509     0.489     0.491
```

Created on 2022-12-09 with reprex v2.0.2

simonpcouch commented 10 months ago

Thank you for the issue! Just wanted to let you know this hasn't fallen off our radar. Related to https://github.com/tidymodels/tune/issues/28 and https://github.com/tidymodels/tune/issues/45.

marcozanotti commented 8 months ago

+1. Thank you, @simonpcouch. In the meantime, is there a workaround?