abjer / isds2020

Introduction to Social Data Science 2020 - a summer school course abjer.github.io/isds2020

Problem with 11.2.2: The calculated RMSEs are all the same for every value of lambda #39

Open Johan-Christensen opened 4 years ago

Johan-Christensen commented 4 years ago

Hi, I am trying to complete problem 11.2.2. However, when I run the code below, I get the same RMSE for every value of lambda.

image

(Note: X_train_ps has polynomial features added and is standardized to zero mean and unit standard deviation; the same goes for X_test_ps.)

This is my mse function: image

My compute_error function: image

jsr-p commented 4 years ago

Hi @Johan-Christensen, you should define a new Lasso model with the given lambda value in each iteration. From your code, it does not look like you use the lambdas for anything in your loop. Right now you are using your own hand-coded estimator. Use the Lasso from sklearn.linear_model :)
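For anyone landing here later, a minimal sketch of what "a new Lasso per lambda" could look like. The data below is synthetic stand-in data (an assumption, since the original code was only posted as screenshots), and the names lambda_ and rmse_test are made up for illustration. Swap in your own X_train_ps, X_test_ps, y_train and y_test.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.metrics import mean_squared_error

# Synthetic stand-in data (assumption): replace with your own
# X_train_ps, X_test_ps, y_train, y_test from the exercise.
rng = np.random.default_rng(0)
X_train_ps = rng.normal(size=(200, 10))
X_test_ps = rng.normal(size=(100, 10))
true_coef = np.array([3.0, -2.0, 0.0, 0.0, 1.5, 0.0, 0.0, 0.0, 0.0, 0.0])
y_train = X_train_ps @ true_coef + rng.normal(scale=0.5, size=200)
y_test = X_test_ps @ true_coef + rng.normal(scale=0.5, size=100)

lambdas = np.logspace(-4, 4, 20)  # 20 log-spaced penalty values

rmse_test = []
for lambda_ in lambdas:
    # A fresh model per iteration, so the penalty actually changes the fit.
    # sklearn calls the penalty strength `alpha`.
    model = Lasso(alpha=lambda_, max_iter=10_000)
    model.fit(X_train_ps, y_train)
    y_pred = model.predict(X_test_ps)
    rmse_test.append(np.sqrt(mean_squared_error(y_test, y_pred)))

for lambda_, rmse in zip(lambdas, rmse_test):
    print(f"lambda={lambda_:.4g}  test RMSE={rmse:.3f}")
```

The key point is that lambda_ is passed into the estimator inside the loop; if the estimator never sees the loop variable, every iteration fits the same model and the RMSE cannot change.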

Also, you are defining the list lambdas to contain 1000 values between 10^(-4) and 10^(4). Judging by the index of your dataframe, the difference between two consecutive lambdas is minuscule, so we would not expect a big difference in the RMSEs between two consecutive entries. Try limiting the list to only 20 values (for your 💻's sake).
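To make the spacing argument concrete, here is a small sketch comparing a 1000-point grid with a 20-point grid over the same range. It assumes the grid is log-spaced with np.logspace; the original list was only shown in a screenshot, so that is a guess. The names lambdas_fine and lambdas_coarse are made up for illustration.

```python
import numpy as np

# Assumption: log-spaced grids over 10^(-4) .. 10^(4).
lambdas_fine = np.logspace(-4, 4, 1000)
lambdas_coarse = np.logspace(-4, 4, 20)

# On a log-spaced grid, the ratio between neighbours is constant.
ratio_fine = lambdas_fine[1] / lambdas_fine[0]
ratio_coarse = lambdas_coarse[1] / lambdas_coarse[0]

print(f"1000 values: each lambda is {ratio_fine:.4f} times the previous one")
print(f"  20 values: each lambda is {ratio_coarse:.4f} times the previous one")
```

With 1000 values each step changes lambda by only about 2%, so neighbouring fits are nearly identical; with 20 values each step multiplies lambda by roughly 2.6, which still covers the whole range with far fewer model fits.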

Johan-Christensen commented 4 years ago

Thanks!