produvia / kryptos

Kryptos AI is a virtual investment assistant that manages your cryptocurrency portfolio
http://twitter.com/kryptos_ai
MIT License

Automated Machine Learning #25

Open slavakurilyak opened 6 years ago

slavakurilyak commented 6 years ago

Goal

As a machine learning developer, I want to embed automated methods for machine learning, so that I can run systematic processes on raw data and select relevant models for processing data.

As a machine learning developer, I want to integrate automated machine learning, so that I do not have to manually select and tune parameters and hyperparameters for machine learning models.
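To make the second goal concrete, here is a minimal sketch of automated hyperparameter selection: a random search scored by k-fold cross-validation. It is a toy under stated assumptions, not how kryptos does or should implement this. The synthetic dataset, the tiny k-nearest-neighbors classifier, and every function name (`make_data`, `knn_predict`, `cross_val_accuracy`, `random_search`) are hypothetical and use only the Python standard library.

```python
import random
import statistics


def make_data(n=200, seed=0):
    """Synthetic 2-D points labeled 1 inside a circle, 0 outside (assumed toy data)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        data.append(((x, y), 1 if x * x + y * y < 0.5 else 0))
    return data


def knn_predict(train, point, k):
    """Majority vote among the k training points closest to `point`."""
    nearest = sorted(
        train,
        key=lambda row: (row[0][0] - point[0]) ** 2 + (row[0][1] - point[1]) ** 2,
    )[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0


def cross_val_accuracy(data, k, folds=5):
    """Mean accuracy over `folds` interleaved train/test splits."""
    scores = []
    for i in range(folds):
        test = data[i::folds]
        train = [row for j, row in enumerate(data) if j % folds != i]
        correct = sum(knn_predict(train, p, k) == label for p, label in test)
        scores.append(correct / len(test))
    return statistics.mean(scores)


def random_search(data, candidates, n_trials, seed=0):
    """Sample hyperparameter values at random and keep the best-scoring one."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        k = rng.choice(candidates)
        trials.append((cross_val_accuracy(data, k), k))
    best_score, best_k = max(trials)
    return best_k, best_score, trials


if __name__ == "__main__":
    best_k, best_score, _ = random_search(make_data(), [1, 3, 5, 7, 9, 15], n_trials=4)
    print(f"best k = {best_k}, cross-validated accuracy = {best_score:.3f}")
```

The developer supplies only the search space and a trial budget; the loop does the selection. Dedicated AutoML libraries replace the random sampling with smarter strategies and search over model families as well as parameters.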

Consider

Inspiration

Airbnb

Airbnb uses Automated Machine Learning (AML) to accomplish the following:

  1. Benchmarking

Unbiased presentation of challenger models: AML can quickly present a plethora of challenger models using the same training set as your incumbent model. This can aid the data scientist in choosing the best model family.

  2. Diagnostics And Exploration

Detecting Target Leakage: because AML builds candidate models extremely fast in an automated way, we can detect data leakage earlier in the modeling lifecycle.

Diagnostics: As mentioned earlier, canonical diagnostics can be automatically generated such as learning curves, partial dependence plots, feature importances, etc.

  3. Automation

Tasks like exploratory data analysis, pre-processing of data, hyper-parameter tuning, model selection and putting models into production can be automated to some extent with an Automated Machine Learning framework.
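The benchmarking point above can be sketched in a few lines: score an incumbent and several challenger model families on identical cross-validation folds, then rank them. This is an illustrative toy, not Airbnb's tooling. The synthetic data, the three model families, and all names (`fit_mean`, `fit_linear`, `fit_nearest`, `cv_mae`, `benchmark`) are assumptions made for the example, and it uses only the Python standard library.

```python
import random
import statistics


def make_data(n=150, seed=1):
    """Synthetic rows (x, y) with y roughly linear in x plus noise (assumed toy data)."""
    rng = random.Random(seed)
    return [(x, 2.0 * x + 1.0 + rng.gauss(0, 1.0))
            for x in (rng.uniform(0, 10) for _ in range(n))]


# Each model family is a fit function that returns a predict function.
def fit_mean(train):
    mean_y = statistics.mean(y for _, y in train)
    return lambda x: mean_y


def fit_linear(train):
    mx = statistics.mean(x for x, _ in train)
    my = statistics.mean(y for _, y in train)
    sxx = sum((x - mx) ** 2 for x, _ in train)
    sxy = sum((x - mx) * (y - my) for x, y in train)
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def fit_nearest(train):
    # Predict the target of the single closest training row.
    return lambda x: min(train, key=lambda row: abs(row[0] - x))[1]


def cv_mae(fit, rows, folds=5):
    """Mean absolute error over fixed interleaved folds, so every model sees the same splits."""
    errors = []
    for i in range(folds):
        test = rows[i::folds]
        train = [r for j, r in enumerate(rows) if j % folds != i]
        predict = fit(train)
        errors.extend(abs(predict(x) - y) for x, y in test)
    return statistics.mean(errors)


def benchmark(models, rows):
    """Return (name, error) pairs sorted from best to worst."""
    scores = {name: cv_mae(fit, rows) for name, fit in models.items()}
    return sorted(scores.items(), key=lambda item: item[1])


if __name__ == "__main__":
    models = {"incumbent_mean": fit_mean, "linear": fit_linear, "nearest": fit_nearest}
    for name, mae in benchmark(models, make_data()):
        print(f"{name:15s} MAE = {mae:.3f}")
```

Because the folds are fixed, differences in the leaderboard come from the models rather than from the split. A challenger that scores implausibly well on such a leaderboard is also a cheap early hint of target leakage worth investigating.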

DataRobot

Here is the standard machine learning process at a high level.

[Figure: the standard machine learning process, with the manual steps outlined in red]

When developing a model with the traditional process, as the figure above shows, the only automatic task is model training. Automated machine learning software automatically executes all the steps outlined in red: manual, tedious modeling tasks that used to require skilled data scientists. The traditional process often takes weeks or months, but with automated machine learning, it takes days at most for business professionals and data scientists to develop and compare dozens of models, find insights and predictions, and solve more business problems much faster. (DataRobot, 2018)

bukosabino commented 5 years ago

For hyperparameter optimization: https://github.com/HunterMcGushion/hyperparameter_hunter

slavakurilyak commented 5 years ago

@bukosabino For additional inspiration, check out the following three GitHub topics: automated-machine-learning, auto-ml, automl

slavakurilyak commented 5 years ago

Consider using jhfjhfj1's autokeras or automl's SMAC3 libraries as well.

slavakurilyak commented 5 years ago

Google releases AdaNet, which incorporates AutoML with TensorFlow.

slavakurilyak commented 5 years ago

Microsoft releases nni, an open source AutoML toolkit for neural architecture search and hyper-parameter tuning, with support for Keras, TensorFlow, and PyTorch.

slavakurilyak commented 5 years ago

Here's a list of open-source projects for AutoML: https://github.com/slavakurilyak/awesome-automl-papers#projects