kwartler / text_mining

This repo contains data from Ted Kwartler's "Text Mining in Practice With R" book.

glmnet intercept = FALSE #6

bradleyboehmke closed this issue 6 years ago

bradleyboehmke commented 6 years ago

Ted, in your book you set intercept = F in your glmnet models. You discuss all the other parameters, but you don't provide a justification for this one. I'm curious why you chose it in the headline clickbait case study.

kwartler commented 6 years ago

Good question. The intercept parameter controls whether the fitted model includes a y-intercept term. In this case I set it to F so no intercept is fit, leaving you to review the impacts of the coefficients themselves. In my opinion (I am sure there are sound contrary arguments, though) F is useful for this isolation effect. If you set it to T, the intercept gives you the "state of nature" without any x inputs (think of all x values being 0). In many cases F will decrease the accuracy of the model, since you are telling it to perform the fit with one fewer free term. Thus in practice you may want to build and evaluate two models, one with each setting of this parameter. Below is a simple example:

# Load libraries and set a seed for reproducibility
library(glmnet)
library(MLmetrics)
set.seed(123)

We will use fake data: an x matrix of 100 rows by 20 random inputs and a random y column of output.

# Fake data
x <- matrix(rnorm(100 * 20), 100, 20)
y <- rnorm(100)

Here we build two models, one with the default intercept and one with intercept=F.

# Build 2 models
With_Intercept <- glmnet(x, y)
Without_Intercept <- glmnet(x, y, intercept = FALSE)

This prints the coefficients of each model to your console. Keep in mind it's a specific lambda, not the optimized one. Compare the coefficient values and notice that the intercept is fixed at zero in the second model (the row is still printed, but nothing was fit for it).

# Extract coefficients at a single value of lambda
coef(With_Intercept, s = 0.01)
coef(Without_Intercept, s = 0.01)
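If you would rather check the intercept numerically than read it off the printout, here is a minimal sketch on the same kind of fake data as above. The object names (`fit_with`, `fit_without`, `b0_with`, `b0_without`) are mine, not from the book, and I am assuming that `coef()` keeps an `(Intercept)` row for the no-intercept fit with its value held at zero.

```r
library(glmnet)
set.seed(123)

# Same style of fake data as above: 100 rows, 20 random inputs, random output
x <- matrix(rnorm(100 * 20), 100, 20)
y <- rnorm(100)

fit_with    <- glmnet(x, y)
fit_without <- glmnet(x, y, intercept = FALSE)

# coef() returns a sparse one-column matrix; the first row is the intercept
b0_with    <- as.numeric(coef(fit_with, s = 0.01)[1, 1])
b0_without <- as.numeric(coef(fit_without, s = 0.01)[1, 1])

# The second value should be exactly zero; the first is whatever the fit estimated
print(c(with = b0_with, without = b0_without))
```

This only confirms what the parameter does mechanically; it says nothing about which setting predicts better.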

To illustrate my point that intercept=F sometimes, though not every time, evaluates less favorably, you can make predictions and compare the RMSE. Keep in mind that this isn't always the case, that the s value is just something I picked, and that the predictions here are made on rows that were also used to fit the models.

# Make predictions with a single lambda value
with_Preds <- predict(With_Intercept, newx = x[1:10, ], s = 0.01)
without_Preds <- predict(Without_Intercept, newx = x[1:10, ], s = 0.01)

# Simple evaluation
RMSE(with_Preds, y[1:10])
RMSE(without_Preds, y[1:10])
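For a somewhat fairer version of the "evaluate two model builds" idea, here is a rough sketch that holds out rows for testing and lets `cv.glmnet` choose lambda instead of using a hand-picked `s`. The 70/30 split, the shared fold assignment, and the `rmse` helper (a base R stand-in for `MLmetrics::RMSE`) are my assumptions for illustration, not something taken from the book.

```r
library(glmnet)
set.seed(123)

# Same style of fake data as above
x <- matrix(rnorm(100 * 20), 100, 20)
y <- rnorm(100)

# Hold out the last 30 rows so evaluation is not on training data
train <- 1:70
test  <- 71:100

# Reuse one fold assignment so both fits are cross-validated on identical splits
folds <- sample(rep(1:10, length.out = length(train)))

cv_with    <- cv.glmnet(x[train, ], y[train], foldid = folds)
cv_without <- cv.glmnet(x[train, ], y[train], foldid = folds, intercept = FALSE)

# Predict at the lambda that minimised cross-validated error for each fit
pred_with    <- predict(cv_with, newx = x[test, ], s = "lambda.min")
pred_without <- predict(cv_without, newx = x[test, ], s = "lambda.min")

# Hypothetical helper: root mean squared error in base R
rmse <- function(pred, actual) sqrt(mean((actual - as.numeric(pred))^2))

rmse_with    <- rmse(pred_with, y[test])
rmse_without <- rmse(pred_without, y[test])
print(c(with = rmse_with, without = rmse_without))
```

Because y here is pure noise, any gap between the two numbers is itself mostly noise. On real data the same pattern gives a more meaningful comparison of the two settings.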