mljar / mljar-supervised

Python package for AutoML on Tabular Data with Feature Engineering, Hyper-Parameters Tuning, Explanations and Automatic Documentation
https://mljar.com
MIT License

Add Custom Models #430

Open huanvo88 opened 3 years ago

huanvo88 commented 3 years ago

Hello,

I was wondering whether there is a way to use custom models with the mljar framework. Suppose I develop a model that is compatible with sklearn Pipeline:

https://scikit-learn.org/stable/developers/develop.html

can I use that model in mljar for the Optuna mode?

If not, is there a framework to add custom models to mljar?

Another question: can we select which algorithms are run in Optuna mode? For example, I only want to run Xgboost, LightGBM, and CatBoost to save time.

Thanks

pplonski commented 3 years ago

Hi @huanvo88,

You can restrict Optuna mode to selected algorithms by setting the algorithms argument, for example:

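from supervised.automl import AutoML
# Tune only the listed algorithms; optuna_time_budget is in seconds, per algorithm
automl = AutoML(mode='Optuna', algorithms=['Xgboost', 'CatBoost', 'LightGBM'], optuna_time_budget=600)
automl.fit(X, y)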

In the example above, three algorithms will be tuned (Xgboost, CatBoost, and LightGBM). Optuna will tune each algorithm for 600 seconds, so the total tuning time will be 3 * 600 = 1800 seconds.
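As a rough sketch of the arithmetic above, the total Optuna tuning time can be estimated from the number of algorithms and the per-algorithm budget. The helper name `total_tuning_seconds` is made up for illustration and is not part of mljar-supervised. It only counts tuning time, not any other steps AutoML runs.

```python
def total_tuning_seconds(algorithms, optuna_time_budget):
    """Estimate total Optuna tuning time: one budget per selected algorithm."""
    return len(algorithms) * optuna_time_budget


algorithms = ["Xgboost", "CatBoost", "LightGBM"]
optuna_time_budget = 600  # seconds per algorithm, as in the example above

total = total_tuning_seconds(algorithms, optuna_time_budget)
print(f"Estimated tuning time: {total} s ({total / 60:.0f} min)")
```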

There is no framework for adding a custom algorithm to MLJAR (but it would be possible to implement one).

Do you have some repository with your algorithm implementation?

huanvo88 commented 3 years ago

Thanks @pplonski. We use two open-source Python implementations of GAM (generalized additive models):

Our main algorithm: https://github.com/dswah/pyGAM

Another one that we might be interested in is https://github.com/interpretml/interpret

It is not hard to wrap these models to make them compatible with sklearn Pipeline. However, I am not sure how to make them compatible with mljar.
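For readers unfamiliar with the wrapping mentioned above, here is a minimal sketch of an sklearn-style adapter. It assumes a hypothetical third-party model, `ThirdPartyModel`, whose `train`/`infer` methods are invented for illustration. The real pyGAM and interpret classes have their own interfaces and may already expose `fit`/`predict`. This only shows compatibility with sklearn Pipeline; it does not make a model usable inside mljar.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin, clone
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.utils.validation import check_array, check_is_fitted, check_X_y


class ThirdPartyModel:
    """Hypothetical stand-in for a library model without the sklearn API."""

    def __init__(self, shrink=0.0):
        self.shrink = shrink
        self.mean = None

    def train(self, X, y):
        # Toy "model": remember the (optionally shrunk) mean of the target.
        self.mean = (1.0 - self.shrink) * float(np.mean(y))

    def infer(self, X):
        return np.full(X.shape[0], self.mean)


class WrappedRegressor(RegressorMixin, BaseEstimator):
    """Adapter exposing ThirdPartyModel through sklearn's fit/predict."""

    def __init__(self, shrink=0.0):
        # sklearn convention: __init__ only stores hyperparameters, unchanged.
        self.shrink = shrink

    def fit(self, X, y):
        X, y = check_X_y(X, y)
        self.n_features_in_ = X.shape[1]
        # Attributes learned during fit end with an underscore.
        self.model_ = ThirdPartyModel(shrink=self.shrink)
        self.model_.train(X, y)
        return self

    def predict(self, X):
        check_is_fitted(self, "model_")
        X = check_array(X)
        return self.model_.infer(X)


# Toy data, purely illustrative.
X = np.array([[0.0, 1.0], [1.0, 3.0], [2.0, 5.0]])
y = np.array([1.0, 2.0, 3.0])

pipe = Pipeline([("scale", StandardScaler()), ("model", WrappedRegressor(shrink=0.0))])
pipe.fit(X, y)
predictions = pipe.predict(X)

# clone() works because hyperparameters are plain __init__ arguments.
cloned = clone(pipe)
```

Storing hyperparameters untouched in `__init__` and building the inner model only in `fit` is what lets `clone`, `get_params`, and `set_params` work, which is what sklearn tools such as Pipeline and grid search rely on.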

pplonski commented 3 years ago

@huanvo88 in the future I would like to refactor the code in the Optuna mode, and that would be a good time to create a framework for adding custom algorithms. However, it's hard to tell when that will happen. I have also started working on MLJAR notebook, with support for visual programming and easy deployments, and right now all my efforts go there.

This issue is connected with https://github.com/mljar/mljar-supervised/issues/414