Neuraxio / Neuraxle

The world's cleanest AutoML library ✨ - Do hyperparameter tuning with the right pipeline abstractions to write clean deep learning production pipelines. Let your pipeline steps have hyperparameter spaces. Design steps in your pipeline like components. Compatible with Scikit-Learn, TensorFlow, and most other libraries, frameworks and MLOps environments.
https://www.neuraxle.org/
Apache License 2.0

AutoMLSequentialWrapper #45

Closed: guillaume-chevalier closed this issue 4 years ago

guillaume-chevalier commented 5 years ago

Do something like this for meta_fit:

from copy import copy


class AutoMLSequentialWrapper:

    def __init__(self, wrapped_pipeline, auto_ml_strategy, validation_technique, score_function, hyperparams_repository, n_iters):
        self.wrapped_pipeline = wrapped_pipeline
        self.auto_ml_strategy = auto_ml_strategy
        self.validation_technique = validation_technique
        self.score_function = score_function
        self.hyperparams_repository = hyperparams_repository
        self.n_iters = n_iters

    def fit(self, di, eo):
        for i in range(self.n_iters):
            # Load the history of past trials: their hyperparameter samples and their scores.
            hps, scores = self.hyperparams_repository.load_all()  # List[HyperparameterSamples], List[float]

            auto_ml_strategy = self.auto_ml_strategy.fit(hps, scores)

            next_model_to_try_hps = auto_ml_strategy.guess_next_best_params(
                i, self.n_iters, self.wrapped_pipeline.get_hyperparams_space())
            self.hyperparams_repository.register_new_untrained_trial(next_model_to_try_hps)

            validation_wrapper = self.validation_technique(
                copy(self.wrapped_pipeline).set_hyperparams(next_model_to_try_hps))
            validation_wrapper, predicted_eo = validation_wrapper.fit_transform(di, eo)

            score = self.score_function(predicted_eo, eo)  # TODO: review order of arguments here.

            self.hyperparams_repository.set_score_for_trial(next_model_to_try_hps, score)

        return self

I'd like to validate the OOP object structure. For instance, what will we do when we run trials in parallel? This for loop is not enough; it would be more like a pool of workers that tries the N next best samples.

We also need a way to indicate that the trial crashed so that the auto_ml_strategy doesn't try that point again.
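
To make the pool-of-workers and crash-marking ideas a bit more concrete, here is a minimal standalone sketch with toy stand-in functions and threads only; none of this is Neuraxle API, and real trials would rather use processes or a distributed queue:

# Minimal sketch of a pool of workers trying the N next hyperparameter samples in parallel,
# and recording crashed trials so the strategy does not propose that exact point again.
# Toy stand-ins only; nothing here is Neuraxle API.
import random
from concurrent.futures import ThreadPoolExecutor, as_completed

def run_trial(hps):
    # Stand-in for: copy the pipeline, set hyperparams, fit with validation, score.
    if hps["lr"] > 0.09:
        raise RuntimeError("simulated crash")
    return {"hps": hps, "score": -abs(hps["lr"] - 0.01), "status": "success"}

def guess_next_candidates(n, history):
    # Stand-in for auto_ml_strategy.guess_next_best_params(); naive random search here.
    return [{"lr": random.uniform(0.0001, 0.1)} for _ in range(n)]

history = []  # stand-in for hyperparams_repository.load_all()
n_workers = 4
for _round in range(5):
    candidates = guess_next_candidates(n_workers, history)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = {pool.submit(run_trial, hps): hps for hps in candidates}
        for future in as_completed(futures):
            try:
                history.append(future.result())
            except Exception:
                # Mark the crashed trial so the strategy can avoid that point later.
                history.append({"hps": futures[future], "score": None, "status": "failed"})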

Any comments/suggestions on that @mlevesquedion @alexbrillant @Eric2Hamel?

Eric2Hamel commented 5 years ago

I looked at your first draft and it looks good for most AutoML algorithms like RandomSearch, TPE, GaussianProcess, Neuroevolution/GeneticAlgo, etc.

Just to make sure: when guessing the next best params (or when fitting), does the auto_ml_strategy have access to the probability distribution classes (pdf, rvs, etc.)? TPE needs the probability density function (pdf) and needs to sample from the probability distributions to guess the next best params, and other AutoML strategies will probably need at least rvs.
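
For example, rvs and pdf on a uniform distribution could look roughly like this (just a sketch with scipy-like semantics, not the current Distribution classes):

# Sketch of rvs/pdf on a hypothetical uniform hyperparameter distribution.
import random

class UniformSketch:
    def __init__(self, min_value: float, max_value: float):
        self.min_value = min_value
        self.max_value = max_value

    def rvs(self) -> float:
        # Draw one random sample from the distribution.
        return random.uniform(self.min_value, self.max_value)

    def pdf(self, x: float) -> float:
        # Probability density: constant inside the bounds, zero outside.
        if self.min_value <= x <= self.max_value:
            return 1.0 / (self.max_value - self.min_value)
        return 0.0

learning_rate_space = UniformSketch(0.0001, 0.1)
sample = learning_rate_space.rvs()
density = learning_rate_space.pdf(sample)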

As for trial crashes, I know that hyperopt uses some kind of status to know whether a trial succeeded or not. We could use some kind of success flag as well; it would come from the score_function or the validation_technique, which would need to output a status flag.
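
As a tiny sketch of that flag, in the spirit of hyperopt's STATUS_OK / STATUS_FAIL (the names below are made up, not Neuraxle API):

# Sketch of a trial status flag, similar in spirit to hyperopt's STATUS_OK / STATUS_FAIL.
from enum import Enum

class TrialStatus(Enum):
    SUCCESS = "success"
    FAILED = "failed"

def run_trial_safely(fit_and_score, hyperparams):
    # fit_and_score stands in for the validation_technique + score_function pair.
    try:
        score = fit_and_score(hyperparams)
        return {"hyperparams": hyperparams, "score": score, "status": TrialStatus.SUCCESS}
    except Exception as error:
        # The crash is recorded so the auto_ml_strategy can avoid that point.
        return {"hyperparams": hyperparams, "score": None, "status": TrialStatus.FAILED, "error": str(error)}

# Example: a division by zero inside the trial is caught and reported as FAILED.
result = run_trial_safely(lambda hps: 1.0 / hps["batch_size"], {"batch_size": 0})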

Most AutoML techniques can run in parallel. For instance, TPE needs the past history to suggest new points, but it is still faster to run multiple trials in parallel and update the hyperparams_repository less often than to wait for each trial to complete before starting the next one. As cited in this article: "The consequence of parallelization is that each proposal x∗ is based on less feedback. This makes search less efficient, though faster in terms of wall time". For parallelization, there should be a shared hyperparams_repository so that any Bayesian optimization technique (GaussianProcess, TPE, Spearmint, etc.), and probably genetic algorithms too, has access to the past history.

Only Hyperband is a bit particular. The algorithm speeds up the search by first allocating few resources (epochs) to unpromising training runs, and then allocating more and more resources to the more promising runs at each round. You will not be able to do Hyperband with this abstraction, but Hyperband is not the same kind of abstraction anyway.

So I think that this is a really good starting point.

guillaume-chevalier commented 5 years ago

@Eric2Hamel Thanks for the thoughts!

Good catch on the error status; it would for sure be a good thing to add a status to the hyperparams_repository, and perhaps adding a try-catch (or a timeout, when/if distributed?) would be interesting.

For the PDFs, I thought about it and I'd need to add those methods to each of our Distribution classes; I'm not really sure how to express/formulate those PDFs in the code of those distributions (help wanted). For now, the auto_ml_strategy already has access to those distribution classes because we pass it the result of calling wrapped_pipeline.get_hyperparams_space(), which is a HyperparameterSpace dict containing the distribution instances.
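
To illustrate with stand-in objects (not the real HyperparameterSpace class), the strategy can simply walk that dict and call rvs (and later pdf) on each distribution instance it receives:

# Stand-in for what the strategy receives from wrapped_pipeline.get_hyperparams_space():
# a dict-like space mapping hyperparameter names to distribution instances.
import random

class UniformStandIn:
    def __init__(self, low, high):
        self.low, self.high = low, high

    def rvs(self):
        return random.uniform(self.low, self.high)

space = {
    "SomeStep__learning_rate": UniformStandIn(0.0001, 0.1),
    "SomeStep__momentum": UniformStandIn(0.8, 0.99),
}
# A strategy can sample a full set of hyperparams by calling rvs() on each distribution:
next_hps = {name: dist.rvs() for name, dist in space.items()}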

Also, yeah: when suggesting the next trial, the strategy could perhaps also suggest a percentage of data to try on, or something like that, a bit like Hyperband does; that would be an interesting method to add to the auto_ml_strategy. However, if in Hyperband the next trials are the continuation of the first ones, then it would perhaps need a new class different from AutoMLSequentialWrapper. I'd like suggestions on this too.

You also remind me that I haven't yet included the wrapped_pipeline.inspect() method that I discussed at the conference, where after training it would be possible to save extra features of the trained model (e.g.: average and std of the neurons' weights, the training loss curve and the validation loss curve, etc.). That would be another thing to add to the hyperparams_repository.
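
As a rough sketch, the kind of dict such an inspect() could return after training might look like this (hypothetical, the method doesn't exist yet):

# Hypothetical shape of what wrapped_pipeline.inspect() could return after training,
# to be stored alongside the trial in the hyperparams_repository.
import statistics

def inspect_trained_model(weights, train_losses, validation_losses):
    return {
        "weights_mean": statistics.mean(weights),
        "weights_std": statistics.pstdev(weights),
        "train_loss_curve": list(train_losses),
        "validation_loss_curve": list(validation_losses),
    }

trial_extras = inspect_trained_model(
    weights=[0.1, -0.2, 0.05],
    train_losses=[1.2, 0.8, 0.5],
    validation_losses=[1.3, 0.9, 0.7],
)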

Eric2Hamel commented 5 years ago

Don't worry about the pdf, I will help you; it is local code, so with the time I have it is easier to implement.

Yeah, it could be a try/except, and a timeout could also be a good thing.

About Hyperband: it tries a lot of trials with few resources (for example 20 epochs), takes a percentage of the best (for example half), retries those best trials with more epochs (for example 40 epochs), and keeps taking a percentage of the best, on and on, until only one trial wins. This needs access to the resources, like the number of epochs, and needs to restart trials, which I think is not the same abstraction.
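
Here is a small standalone sketch of that successive-halving loop; the training function and the numbers are made up:

# Sketch of Hyperband-style successive halving: many cheap trials, keep the best half,
# give the survivors more epochs, and repeat until only one trial remains.
import random

def train_and_score(hyperparams, n_epochs):
    # Stand-in for resuming/continuing a real training run for n_epochs.
    return -abs(hyperparams["lr"] - 0.01) * n_epochs + random.random()

trials = [{"lr": random.uniform(0.0001, 0.1)} for _ in range(8)]
n_epochs = 20
while len(trials) > 1:
    scored = [(train_and_score(hps, n_epochs), hps) for hps in trials]
    scored.sort(key=lambda pair: pair[0], reverse=True)              # best scores first
    trials = [hps for _, hps in scored[: max(1, len(scored) // 2)]]  # keep the best half
    n_epochs *= 2                                                    # give survivors more resources
best_hyperparams = trials[0]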

guillaume-chevalier commented 5 years ago

@Eric2Hamel Nice! You can definitely add the pdf of the hyperparameter distributions. I'd suggest doing it for one distribution in a first PR for a first review, and then proceeding to all the other ones. I'm mostly unsure of how, for example, Hyperopt and other algorithms would query the PDF to use it (e.g.: min, max and mean? A continuous PDF function? A discrete array given a PDF resolution? etc.).
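
For the "discrete array given a PDF resolution" option, a sketch of what the query could look like (not a proposed API, just to make the question concrete):

# Sketch of querying a distribution's pdf as a discrete array at a given resolution.
def discretized_pdf(pdf, min_value, max_value, resolution):
    step = (max_value - min_value) / (resolution - 1)
    xs = [min_value + i * step for i in range(resolution)]
    return xs, [pdf(x) for x in xs]

# Example with a uniform pdf on [0, 1], queried at 5 evenly spaced points:
xs, densities = discretized_pdf(lambda x: 1.0 if 0.0 <= x <= 1.0 else 0.0, 0.0, 1.0, resolution=5)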

Also, it sounds more and more like Hyperband and Hyperband-like algorithms would be another meta wrapper step. I think that to continue training, it would need to use a kind of minibatch pipeline, limit the number of epochs, and resume from there with pipeline serialization between each step.
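
Here is a very rough sketch of the "limit the epochs, serialize, and resume" idea, with pickle standing in for whatever pipeline serialization we'd actually use:

# Sketch of pausing/resuming a trial between resource rounds by serializing its state.
# pickle stands in for the pipeline serialization; the training function is a toy stand-in.
import os
import pickle

def train_for_a_few_epochs(state, n_epochs):
    state["epochs_done"] += n_epochs  # stand-in for real minibatch training
    return state

checkpoint_path = "trial_42.pickle"
if os.path.exists(checkpoint_path):
    with open(checkpoint_path, "rb") as f:
        state = pickle.load(f)     # resume the trial where it stopped
else:
    state = {"epochs_done": 0}     # fresh trial

state = train_for_a_few_epochs(state, n_epochs=20)

with open(checkpoint_path, "wb") as f:
    pickle.dump(state, f)          # save so a later Hyperband round can continue it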