signaux-faibles / predictsignauxfaibles

Repository for the Python code that produces the Signaux Faibles prediction lists.
MIT License

Use scikit-learn for GAMs #16

Open vviers opened 3 years ago

vviers commented 3 years ago

A huge step towards implementing GAMs in scikit-learn was taken 3 days ago when this PR was merged.

This will likely allow us to get rid of pyGAM (which is OK but not mature and stable enough imo, see for example #12) and to rely fully on the great scikit-learn ecosystem instead.

Is anyone interested in giving it a go?

lcrmorin commented 3 years ago

May I suggest trying glmnet? (It comes from the authors of The Elements of Statistical Learning.) It works within the sklearn framework. Basically, it offers logistic regression plus an iterative framework for regularisation (and hence variable selection) and hyperparameter tuning. The hyperparameter tuning is done on an information criterion, so it doesn't need an additional framework for tuning. The loss of splines (i.e. mostly smoothing) can largely be compensated for through expert-based feature selection and feature transformation.

Do you think it could be tried easily? Could it work?
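For comparison, the kind of model described above (logistic regression with regularisation-driven variable selection and built-in tuning) can be roughly sketched with scikit-learn alone. This is only an approximation under stated assumptions: it uses `LogisticRegressionCV` with an L1 penalty, which picks the penalty strength by cross-validation rather than by an information criterion, and the data come from `make_classification`, not from this project.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegressionCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumption): 20 features, only 4 of them informative.
X, y = make_classification(
    n_samples=400, n_features=20, n_informative=4,
    n_redundant=0, random_state=0,
)

# L1-penalised logistic regression; a grid of 10 penalty strengths is
# searched with 5-fold cross-validation inside the estimator itself.
model = make_pipeline(
    StandardScaler(),
    LogisticRegressionCV(Cs=10, penalty="l1", solver="liblinear", cv=5),
)
model.fit(X, y)

clf = model[-1]
best_C = clf.C_[0]                        # inverse penalty strength chosen by CV
selected = np.flatnonzero(clf.coef_[0])   # indices of features kept by the L1 penalty
```

Standardising the features first matters here, because the L1 penalty treats all coefficients on the same scale.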

vviers commented 3 years ago

That's a great point! If we end up using something other than GAMs, then this package would be worth looking into :)

However, looking at the GitHub repo for the package, it seems much less maintained and much less widely used than scikit-learn, so we would need a really good reason (like a feature that's critical to our work) to use it instead of scikit-learn.

Do you have a specific example of something that glmnet does and that we cannot get in scikit-learn?