sta-363-s20 / community

Discussion, Q&A, everything you want to say, formatted nicely

Ex 3 Lab 5 #67

Closed midupree closed 4 years ago

midupree commented 4 years ago

@LucyMcGowan I had a quick question on Number 3! I'm getting a little confused on the second part, which asks us to "fit age using a natural spline. Use tune() to decide how many degrees of freedom to use for the age variable." I think I got the first part, creating a recipe, but I'm having trouble understanding how to use the natural spline. I've been searching through the slides to no avail. Is there anywhere I should look for a reference to help me solve this problem? Here is what I have so far:

[Image: Screen Shot 2020-03-25 at 7.32.49 PM]

Thanks so much for all you do, Miriam

LucyMcGowan commented 4 years ago

The slides from class on Tuesday show you how to use the natural spline with step_ns. The code you have there is doing penalized regression; this should just be a regular linear model (using lm as the engine).

msherrick commented 4 years ago

Hi @LucyMcGowan,

I also have a question on this exercise. I used the following code to create my recipe:

wage_recipe <- Wage %>%
  recipe(wage ~ age, health_ins, jobclass, education, race) %>%
  step_ns(age, deg_free = tune())

wage_recipe

[Image: Screen Shot 2020-03-25 at 8.35.25 PM]

Is this output giving me the degrees of freedom, such that I would say that we would use 1 degree of freedom for age (the predictor row)?

LucyMcGowan commented 4 years ago

That is the correct recipe. Now you need to do cross validation and tune the model to get the degrees of freedom.
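
For readers following along, here is a minimal sketch of the two pieces mentioned so far (not from the original thread, and not the lab's official solution): a recipe with a tunable natural spline on age, and a linear model specification that uses lm as the engine. It assumes the Wage data come from the ISLR package and that all five variables are meant as predictors; R formulas join predictors with +, so the sketch writes the formula that way. The object name lm_spec is hypothetical.

```r
library(tidymodels)
library(ISLR)  # assumption: Wage is the data set shipped with ISLR

# Recipe: wage modeled from five predictors, with a natural spline on age.
# deg_free = tune() marks the spline's degrees of freedom as "to be tuned".
wage_recipe <- recipe(wage ~ age + health_ins + jobclass + education + race,
                      data = Wage) %>%
  step_ns(age, deg_free = tune())

# Ordinary linear regression (no penalty), fit with lm
lm_spec <- linear_reg() %>%
  set_engine("lm")

wage_recipe
```

Printing the recipe only lists variable roles and steps. It does not report a chosen number of degrees of freedom, because nothing has been tuned yet.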

midupree commented 4 years ago

@LucyMcGowan Okay, so I was able to create a recipe and have the same question. For cross validation, would we use the code from the cross-validation slides? I've attached it. Would this give us the degrees of freedom once we incorporate the tune function?

[Image: Screen Shot 2020-03-26 at 12.10.00 PM]

michaeljurgens commented 4 years ago

Hi @LucyMcGowan, I had exercise 3 working yesterday, but today when I try to run it I'm getting this error and I'm not sure what it means. I didn't change any of the code from when it was running.

[Image: Screen Shot 2020-03-26 at 12.52.33 PM]

LucyMcGowan commented 4 years ago

@midupree no, for cross validation you can use code similar to what you used in lab 4 (incorporating vfold_cv with tune_grid).
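
A hedged sketch of what combining vfold_cv with tune_grid could look like here (not from the original thread; the lab 4 code itself is not shown). It assumes Wage comes from the ISLR package, that predictors are joined with + in the formula, and it picks an arbitrary seed, 5 folds, and a candidate grid of 1 to 8 degrees of freedom. The names wage_folds, wage_wf, df_grid, and wage_tuned are hypothetical.

```r
library(tidymodels)
library(ISLR)  # assumption: Wage is the data set shipped with ISLR

set.seed(363)                         # arbitrary seed so the folds are reproducible
wage_folds <- vfold_cv(Wage, v = 5)   # 5-fold cross validation (v is an assumption)

# Recipe with a tunable natural spline on age
wage_recipe <- recipe(wage ~ age + health_ins + jobclass + education + race,
                      data = Wage) %>%
  step_ns(age, deg_free = tune())

# Regular linear model with lm as the engine
lm_spec <- linear_reg() %>%
  set_engine("lm")

# Bundle preprocessing and model so they are tuned together
wage_wf <- workflow() %>%
  add_recipe(wage_recipe) %>%
  add_model(lm_spec)

# Candidate values; the column name must match the tuned argument, deg_free
df_grid <- tibble(deg_free = 1:8)

# Fit every candidate on every fold and record held-out performance
wage_tuned <- tune_grid(wage_wf, resamples = wage_folds, grid = df_grid)

# One row per candidate and metric (rmse and rsq by default for regression)
collect_metrics(wage_tuned)
```

The recipe on its own never yields a degrees-of-freedom value. The value comes from comparing the cross-validated metrics across the grid.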

LucyMcGowan commented 4 years ago

@michaeljurgens hmmm, that is strange. Can you try restarting your R session? (In the menu bar, click Session > Restart R.)

midupree commented 4 years ago

So I noticed that this may actually be the code for question 4, but I am

  1. not sure whether I've actually answered question 3
  2. not sure how to fix this problem. I double-checked against the other thread and made all the same corrections they did to fix the model failure.

[Image: Screen Shot 2020-03-26 at 3.17.10 PM]

msherrick commented 4 years ago

I am also unsure if I have answered exercise 3 completely. In response to your last comment, are you just looking for the recipe in exercise 3, and then using tune to "decide how many degrees of freedom to use for the age variable" through cross validation and tuning the model all in exercise 4?

LucyMcGowan commented 4 years ago

midupree commented 4 years ago

@LucyMcGowan I think I may have finally figured it out. Would we take the lowest rmse and then report the degrees of freedom listed there?

[Images: Screen Shot 2020-03-26 at 6.30.51 PM, Screen Shot 2020-03-26 at 6.30.57 PM]

LucyMcGowan commented 4 years ago

Yes, exactly!
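
To make the "lowest rmse" rule concrete, here is a small base-R sketch of the same idea done by hand (not from the original thread). It is a stand-in, not the lab's code: it uses the built-in cars data instead of Wage, a single predictor, an arbitrary seed, 5 folds, and candidate degrees of freedom 1 to 4, all of which are assumptions made for illustration.

```r
library(splines)  # ships with R; provides ns()

set.seed(363)  # arbitrary seed so the fold assignment is reproducible
k <- 5
fold_id <- sample(rep(1:k, length.out = nrow(cars)))  # random fold labels
df_values <- 1:4                                      # candidate degrees of freedom

# For each candidate, average the held-out RMSE across the k folds
cv_rmse <- sapply(df_values, function(d) {
  fold_rmse <- sapply(1:k, function(f) {
    train <- cars[fold_id != f, ]
    test  <- cars[fold_id == f, ]
    fit   <- lm(dist ~ ns(speed, df = d), data = train)
    sqrt(mean((test$dist - predict(fit, newdata = test))^2))
  })
  mean(fold_rmse)
})

results <- data.frame(deg_free = df_values, mean_rmse = cv_rmse)
results

# Report the degrees of freedom on the row with the lowest mean RMSE
best_df <- results$deg_free[which.min(results$mean_rmse)]
best_df
```

With tune_grid the loop is handled for you; the metrics table it produces is read the same way, by finding the rmse row with the smallest mean and reporting its deg_free.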

midupree commented 4 years ago

@LucyMcGowan Is there a difference between the rsq and rmse? Or do I just need to choose the model with the lowest value that is labeled rmse?

LucyMcGowan commented 4 years ago

Yes, rsq is R-squared, rmse is the root mean squared error. You should use rmse for this.
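
A short base-R sketch of how the two metrics differ (not from the original thread). It uses the built-in cars data as a stand-in and computes both metrics on the training data for brevity, whereas tune_grid reports them on held-out folds. Computing rsq as the squared correlation between observed and predicted values is my understanding of how the tidymodels rsq metric is defined.

```r
# Simple linear fit on a built-in data set (stand-in for the lab's model)
fit  <- lm(dist ~ speed, data = cars)
obs  <- cars$dist
pred <- predict(fit)

# RMSE: typical size of a prediction error, in the units of the outcome.
# Lower is better.
rmse_val <- sqrt(mean((obs - pred)^2))

# R-squared: squared correlation between observed and predicted values.
# It lies between 0 and 1, and higher is better.
rsq_val <- cor(obs, pred)^2

c(rmse = rmse_val, rsq = rsq_val)
```

Because the two metrics point in opposite directions, "choose the lowest value" only applies to the rows labeled rmse.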