Problems with 1se lambda

facebookexperimental / Robyn

Robyn is an experimental, AI/ML-powered and open sourced Marketing Mix Modeling (MMM) package from Meta Marketing Science. Our mission is to democratise modeling knowledge, inspire the industry through innovation, reduce human bias in the modeling process & build a strong open source marketing science community.

https://facebookexperimental.github.io/Robyn/

MIT License

Problems with 1se lambda #178

Closed Ocean199 closed 2 years ago

Ocean199 commented 3 years ago

Hello, great job with Robyn.

I noticed some irregularities and over-regularization with some datasets. I was able to trace them back to extraordinarily high lambdas in the ridge regressions.

With lambda.1se, the average fit across all Pareto models is 0.25. With lambda.min, the fit is 0.95.

Decomp for 1se:

Decomp for min:

It would be nice to let the user choose between lambda.min and lambda.1se. A max lambda option would also be nice.

Thanks

gufengzhou commented 3 years ago

Hi, this is very valuable insight! Thanks for digging deep into Robyn!

As you've probably seen in glmnet's documentation, we're using lambda.1se because it's the recommended default setting. Are you using your own real data? The huge difference in fit surprises me, to be honest, but it's also very interesting to see.

I believe it makes sense to implement a custom regularization tuner as you recommended. What's your thinking behind the max lambda option? I'm considering a tuner between 0 and 1 that spans the range from lambda.min to lambda.1se.
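
For readers unfamiliar with the two choices, here is a minimal Python sketch of how lambda.min and lambda.1se are usually derived from cross-validation summaries. lambda.min is the lambda with the lowest mean CV error. lambda.1se is the largest lambda whose mean error is still within one standard error of that minimum. The function name `select_lambdas` and the example numbers are hypothetical and only illustrate the rule; this is not Robyn's or glmnet's code.

```python
def select_lambdas(lambdas, cv_mean, cv_se):
    """Return (lambda_min, lambda_1se) from per-lambda CV summaries.

    lambdas : candidate penalty values
    cv_mean : mean cross-validated error for each lambda
    cv_se   : standard error of that mean for each lambda
    """
    if not lambdas or not (len(lambdas) == len(cv_mean) == len(cv_se)):
        raise ValueError("inputs must be non-empty and of equal length")

    # Index of the lambda with the lowest mean CV error.
    i_min = min(range(len(lambdas)), key=lambda i: cv_mean[i])

    # One-standard-error threshold, measured at the minimum.
    threshold = cv_mean[i_min] + cv_se[i_min]

    # Largest (most regularized) lambda still within the threshold.
    lambda_1se = max(
        lam for lam, err in zip(lambdas, cv_mean) if err <= threshold
    )
    return lambdas[i_min], lambda_1se


# Hypothetical CV results, for illustration only.
result = select_lambdas(
    lambdas=[10.0, 1.0, 0.1, 0.01],
    cv_mean=[1.05, 0.98, 0.90, 0.95],
    cv_se=[0.1, 0.1, 0.1, 0.1],
)
print(result)  # (0.1, 1.0)
```

Because lambda.1se is never smaller than lambda.min, it always regularizes at least as strongly. When the CV error curve is flat or the standard errors are wide, the two can be far apart.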

Ocean199 commented 3 years ago

Hello, my only thought on a max lambda is that a user may know, from past models, a lambda they do not want to exceed. But I think a range from lambda.min to lambda.1se would be sufficient.

gufengzhou commented 3 years ago

I noticed that you're probably still using the older 2.0 version, right? FYI, Robyn was updated to 3.0 last week. Please check the new readme for details. However, 3.0 will also have this lambda issue.

Regarding lambda, Robyn currently just uses the lambdas that glmnet generates automatically by default. But there's actually a ridge_lambda() function that can generate the lambda sequence for self-cross-validation. We'll consider how best to do this, because we want to avoid unnecessary friction and potential sources of human bias during the modelling process. I'll let you know once we have a solution.
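
As a rough illustration of what generating a lambda sequence can look like, the Python sketch below builds a log-spaced grid that decreases from a given maximum. The function name `log_lambda_sequence` is hypothetical, and this is not the body of ridge_lambda(). The defaults only mirror glmnet's documented `nlambda = 100` and `lambda.min.ratio = 1e-4` (the ratio glmnet uses when there are more observations than variables). How the maximum lambda itself is derived from the data is left out here.

```python
import math


def log_lambda_sequence(lambda_max, n=100, min_ratio=1e-4):
    """Return n lambdas, log-spaced from lambda_max down to lambda_max * min_ratio."""
    if lambda_max <= 0:
        raise ValueError("lambda_max must be positive")
    if not 0 < min_ratio < 1:
        raise ValueError("min_ratio must be in (0, 1)")
    if n < 2:
        raise ValueError("n must be at least 2")

    log_max = math.log(lambda_max)
    log_min = math.log(lambda_max * min_ratio)
    step = (log_min - log_max) / (n - 1)  # negative: the grid decreases

    # Equal steps on the log scale give a constant ratio between neighbours.
    return [math.exp(log_max + i * step) for i in range(n)]


# Hypothetical usage: five values spanning four orders of magnitude.
grid = log_lambda_sequence(10.0, n=5, min_ratio=1e-4)
```

A log-spaced grid is a common choice because model fit tends to respond to relative rather than absolute changes in the penalty.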

gufengzhou commented 3 years ago

FYI, I've just implemented a new parameter lambda_control in the robyn_run function to tune between lambda.min and lambda.1se in the latest commit here: ef1b7aa802d65adee164aa55c04ba965295790d1. This is not the perfect way to do it because it introduces friction and potential human bias into the modelling process. But I agree that it's good for users to have the choice to do it themselves. I'll look into implementing lambda in the hyperparameter optimisation in the future to better accommodate our multiple objectives.
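
One simple way such a 0-to-1 control could map onto the lambda.min to lambda.1se range is linear interpolation, sketched below in Python. This is an assumption for illustration, not a transcription of the commit: the function name `lambda_from_control` is hypothetical, and so is the direction of the mapping (0 gives lambda.min, 1 gives lambda.1se).

```python
def lambda_from_control(lambda_min, lambda_1se, control):
    """Linearly interpolate between lambda_min (control=0) and lambda_1se (control=1)."""
    if not 0.0 <= control <= 1.0:
        raise ValueError("control must be between 0 and 1")
    # Assumed mapping: move a fraction `control` of the way from min to 1se.
    return lambda_min + (lambda_1se - lambda_min) * control


# Hypothetical values: halfway between the two CV choices.
midpoint = lambda_from_control(lambda_min=0.1, lambda_1se=1.0, control=0.5)
```

Under this assumed mapping, a control of 1 reproduces the lambda.1se behaviour and a control of 0 reproduces lambda.min, which is the setting that gave the much higher fit reported at the top of this thread.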

laresbernardo commented 2 years ago

Hi! From version 3.5 onwards, we pick the best-performing lambda as a new hyperparameter added to the nevergrad optimization. Please update and try this new feature; it should be enough to resolve this particular issue.

laresbernardo commented 2 years ago

Closing this ticket now. Let us know if you need further help.