SelfExplainML / PiML-Toolbox

PiML (Python Interpretable Machine Learning) toolbox for model development & diagnostics
https://selfexplainml.github.io/PiML-Toolbox
Apache License 2.0
912 stars · 109 forks

Parameter Tuning #17

Closed · xloffree closed this 1 year ago

xloffree commented 1 year ago

Hi,

Is there a way to tune the parameters for the built-in models in PiML? For example, the L1 and L2 penalties for the GLM model default to 0. Is there any built-in way to tune these parameters or would we need to use external methods?

Thank you!

ZebinYang commented 1 year ago

Hi @xloffree,

For the time being, built-in models, including GLM, GAM, and DT, cannot be tuned directly with HPO packages such as GridSearchCV in sklearn.

You may use the low-code interface to change their parameters and run them one by one, or write a script like the following:

```python
from piml.models import GLMRegressor

# `exp` is assumed to be an existing piml Experiment with data already prepared
for l1 in [0.01, 0.1, 1]:
    for l2 in [0.01, 0.1, 1]:
        exp.model_train(GLMRegressor(l1_regularzation=l1, l2_regularzation=l2),
                        name="glm" + str(l1) + str(l2))
```

ZebinYang commented 1 year ago

Hi @xloffree ,

Starting from v0.4.0, you can tune hyperparameters for all built-in PiML models using external HPO frameworks.

For example,

from piml.models import GLMRegressor
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import GridSearchCV, train_test_split

data = fetch_california_housing()
train_x, test_x, train_y, test_y = train_test_split(data.data, data.target, test_size=0.2)
grid = GridSearchCV(GLMRegressor(), param_grid={"l1_regularzation": [0, 0.1, 0.2, 0.3],
                                                "l2_regularzation": [0, 0.1, 0.2, 0.3]})
grid.fit(train_x, train_y)
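
After `grid.fit(...)`, the standard sklearn attributes (`best_params_`, `best_estimator_`) report the tuning outcome. A minimal, self-contained sketch of that inspection step, using sklearn's ElasticNet as a stand-in for `GLMRegressor` (an assumption made here so the snippet runs without PiML installed; both expose L1/L2-style penalties through the sklearn estimator API), on synthetic data so no download is needed:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic regression data so the example is fully reproducible offline
X, y = make_regression(n_samples=500, n_features=8, noise=5.0, random_state=0)
train_x, test_x, train_y, test_y = train_test_split(
    X, y, test_size=0.2, random_state=0)

# ElasticNet's alpha / l1_ratio play the role of the L1/L2 penalties above
grid = GridSearchCV(ElasticNet(), param_grid={"alpha": [0.01, 0.1, 1.0],
                                              "l1_ratio": [0.2, 0.5, 0.8]})
grid.fit(train_x, train_y)

# Best hyperparameters from 5-fold cross-validation (the sklearn default),
# and the refit model's R^2 on the held-out test split
print(grid.best_params_)
print(grid.best_estimator_.score(test_x, test_y))
```

The same `best_params_` / `best_estimator_` pattern applies unchanged once `GLMRegressor` is swapped back in, since PiML's built-in models follow the sklearn estimator interface as of v0.4.0.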