intel-analytics / analytics-zoo

Distributed Tensorflow, Keras and PyTorch on Apache Spark/Flink & Ray
https://analytics-zoo.readthedocs.io/
Apache License 2.0

refactor recipe for automl models #592

Open shanyu-sys opened 3 years ago

shanyu-sys commented 3 years ago

Currently the automl recipes have several inconveniences.

  1. Each model has one or more recipes, and they all live in automl.config.recipe.py. Adding a new model means adding the model in automl.model and also adding a recipe to this shared recipe file.
  2. The search algorithm and scheduler should not be bound to a particular model.
  3. Model selection is inconvenient to support. The user has to specify a recipe for fit, and each recipe is designed for a specific model (LSTMGridSearchRecipe, MTNetRandomRecipe, ...), so the model is fixed by the recipe. Passing a list of recipes, one for each model the user wants to search, may not be a good idea either.

The initial refactor idea is:

  1. Each model estimator has a static method get_search_space.
  2. The user can call model.set_search_space to change the default search space of a given model.
  3. Add include_estimators/exclude_estimators to TimeSequencePredictor (for example in the constructor) so the user can select a subset of model estimators to search.
  4. Provide a better wrapper around the parameter search functions used in a search space. Currently we use tune param search functions like tune.grid_search..., and a log-scale param needs a special workaround.
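
A rough sketch of how the four ideas above could fit together, in plain Python with no Ray dependency. Everything here is hypothetical and only for discussion: the classes `Choice`, `LogUniform`, `BaseEstimator`, `LSTMEstimator` and `MTNetEstimator`, the `ALL_ESTIMATORS` registry, the helpers `select_estimators` and `sample_config`, and the hyperparameter names are all assumptions, not existing code. It also uses classmethods rather than a strict static method, so each estimator class can keep its own overrides.

```python
import math
import random
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Optional, Sequence, Type


@dataclass(frozen=True)
class Choice:
    """Hypothetical wrapper: pick one of the listed values."""
    values: Sequence[Any]

    def sample(self, rng: random.Random) -> Any:
        return rng.choice(list(self.values))


@dataclass(frozen=True)
class LogUniform:
    """Hypothetical wrapper: sample uniformly in log space (idea 4)."""
    low: float
    high: float

    def sample(self, rng: random.Random) -> float:
        return math.exp(rng.uniform(math.log(self.low), math.log(self.high)))


class BaseEstimator:
    """Hypothetical base class: each model owns its search space (ideas 1 and 2)."""
    default_search_space: Dict[str, Any] = {}
    _custom_search_space: Optional[Dict[str, Any]] = None

    @classmethod
    def get_search_space(cls) -> Dict[str, Any]:
        # Start from the defaults, then apply any user overrides.
        space = dict(cls.default_search_space)
        space.update(cls._custom_search_space or {})
        return space

    @classmethod
    def set_search_space(cls, **overrides: Any) -> None:
        # Assigning on cls affects only this estimator class, not its siblings.
        cls._custom_search_space = dict(overrides)


class LSTMEstimator(BaseEstimator):
    # Hyperparameter names are made up for illustration.
    default_search_space = {
        "lstm_units": Choice([16, 32, 64]),
        "lr": LogUniform(1e-4, 1e-1),
    }


class MTNetEstimator(BaseEstimator):
    default_search_space = {
        "ar_size": Choice([2, 4]),
        "lr": LogUniform(1e-4, 1e-2),
    }


# Hypothetical registry that a predictor could consult.
ALL_ESTIMATORS: Dict[str, Type[BaseEstimator]] = {
    "lstm": LSTMEstimator,
    "mtnet": MTNetEstimator,
}


def select_estimators(
    include: Optional[Iterable[str]] = None,
    exclude: Optional[Iterable[str]] = None,
) -> Dict[str, Type[BaseEstimator]]:
    """Resolve include/exclude lists to estimator classes (idea 3)."""
    names = list(include) if include is not None else list(ALL_ESTIMATORS)
    excluded = set(exclude or ())
    unknown = (set(names) | excluded) - set(ALL_ESTIMATORS)
    if unknown:
        raise ValueError(f"unknown estimators: {sorted(unknown)}")
    return {n: ALL_ESTIMATORS[n] for n in names if n not in excluded}


def sample_config(estimator: Type[BaseEstimator], rng: random.Random) -> Dict[str, Any]:
    """Draw one concrete config from an estimator's search space."""
    return {k: v.sample(rng) for k, v in estimator.get_search_space().items()}
```

With something like this, a user could call `LSTMEstimator.set_search_space(lr=LogUniform(1e-3, 1e-2))` to narrow one parameter and `select_estimators(exclude=["mtnet"])` to restrict the search, without touching a shared recipe file. The search algorithm and scheduler would only ever see the sampled config dicts, so they stay independent of the model.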
shanyu-sys commented 3 years ago

steps