Neuraxio / Neuraxle

The world's cleanest AutoML library ✨ - Do hyperparameter tuning with the right pipeline abstractions to write clean deep learning production pipelines. Let your pipeline steps have hyperparameter spaces. Design steps in your pipeline like components. Compatible with Scikit-Learn, TensorFlow, and most other libraries, frameworks and MLOps environments.
https://www.neuraxle.org/
Apache License 2.0

Tree Parzen Estimator #387

Closed alexbrillant closed 4 years ago

alexbrillant commented 4 years ago

What it is

This pull request integrates the Tree Parzen Estimator (TPE) coded by @Eric2Hamel into Neuraxle!

How it works (TPE)

@Eric2Hamel coded it this way:

  1. Select the hyperparams of the initial trials with random search.
  2. Split the trials into good and bad ones using a quantile threshold.
  3. For each hyperparam, create one Gaussian mixture from the good trials and one from the bad trials.
  4. For each hyperparam, sample possible new values from the good trials' posterior.
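
To make the four steps above concrete, here is a minimal one-dimensional sketch in plain Python. It is not the code from this pull request: every function name, the fixed mixture bandwidth sigma, and the final ranking of candidates by the good/bad density ratio (the usual TPE selection rule) are assumptions made for illustration only.

```python
import math
import random


def mixture_pdf(x, centers, sigma):
    """Density at x of an equally weighted Gaussian mixture, one component per center."""
    if not centers:
        return 0.0
    norm = sigma * math.sqrt(2.0 * math.pi)
    total = sum(math.exp(-0.5 * ((x - c) / sigma) ** 2) for c in centers)
    return total / (norm * len(centers))


def split_trials(trials, quantile_threshold):
    """Step 2: split (hyperparam, score) pairs into (good, bad); a lower score is better."""
    ordered = sorted(trials, key=lambda trial: trial[1])
    n_good = max(1, math.ceil(quantile_threshold * len(ordered)))
    return ordered[:n_good], ordered[n_good:]


def suggest(trials, rng, quantile_threshold, n_candidates, sigma):
    """Steps 3 and 4: build both mixtures, sample from the good one, keep the best ratio."""
    good, bad = split_trials(trials, quantile_threshold)
    good_x = [x for x, _ in good]
    bad_x = [x for x, _ in bad]
    # Sampling from the good mixture: pick a component, then add Gaussian noise.
    candidates = [rng.gauss(rng.choice(good_x), sigma) for _ in range(n_candidates)]

    def ratio(x):
        # Assumed selection rule: favor points likely under "good" and unlikely under "bad".
        return mixture_pdf(x, good_x, sigma) / (mixture_pdf(x, bad_x, sigma) + 1e-12)

    return max(candidates, key=ratio)


def tpe_search(objective, low, high, n_trials=40, n_initial_random=10,
               quantile_threshold=0.3, n_candidates=100, sigma=0.5, seed=0):
    """Run the whole loop and return every (hyperparam, score) pair that was tried."""
    rng = random.Random(seed)
    trials = []
    for i in range(n_trials):
        if i < n_initial_random:
            x = rng.uniform(low, high)  # Step 1: random search warm-up.
        else:
            x = suggest(trials, rng, quantile_threshold, n_candidates, sigma)
            x = min(max(x, low), high)  # Keep the suggestion inside the search bounds.
        trials.append((x, objective(x)))
    return trials


trials = tpe_search(lambda x: (x - 2.0) ** 2, low=-5.0, high=5.0)
best_x, best_score = min(trials, key=lambda trial: trial[1])
print(f"best x = {best_x:.3f}, score = {best_score:.4f}")
```

The arguments n_initial_random, quantile_threshold and n_candidates loosely mirror number_of_initial_random_step, quantile_threshold and number_possible_hyperparams_candidates from the example usage; the mapping is only approximate.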

How scipy distributions work

Every hyperparameter distribution (scipy or not) needs to inherit from either DiscreteHyperparameterDistribution (which inherits from HyperparameterDistribution, sets is_continuous to False, and adds the probabilities() and values() methods) or HyperparameterDistribution (whose is_continuous property is True by default).
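
The contract described above can be pictured with the simplified stand-in classes below. They reuse the two class names from this pull request, but they are not Neuraxle's actual classes: the rvs() method, the Uniform and Choice examples, and all constructor signatures are assumptions for illustration.

```python
import random
from abc import ABC, abstractmethod


class HyperparameterDistribution(ABC):
    """Stand-in base class: continuous unless a subclass says otherwise."""
    is_continuous = True

    @abstractmethod
    def rvs(self, rng):
        """Draw one value using the given random.Random instance."""


class DiscreteHyperparameterDistribution(HyperparameterDistribution):
    """Stand-in discrete base class: exposes its support and the matching probabilities."""
    is_continuous = False

    @abstractmethod
    def values(self):
        """Return every value the hyperparameter can take."""

    @abstractmethod
    def probabilities(self):
        """Return one probability per entry of values(), in the same order."""

    def rvs(self, rng):
        return rng.choices(self.values(), weights=self.probabilities())[0]


class Uniform(HyperparameterDistribution):
    """Hypothetical continuous distribution over [low, high]."""

    def __init__(self, low, high):
        self.low = low
        self.high = high

    def rvs(self, rng):
        return rng.uniform(self.low, self.high)


class Choice(DiscreteHyperparameterDistribution):
    """Hypothetical discrete distribution with equally likely options."""

    def __init__(self, options):
        self.options = list(options)

    def values(self):
        return list(self.options)

    def probabilities(self):
        return [1.0 / len(self.options)] * len(self.options)


rng = random.Random(0)
for dist in (Uniform(0.0, 1.0), Choice(["relu", "tanh"])):
    kind = "continuous" if dist.is_continuous else "discrete"
    print(type(dist).__name__, kind, dist.rvs(rng))
```

An optimizer can then branch on is_continuous: fit a Gaussian mixture for continuous hyperparams, or reweight probabilities() over values() for discrete ones.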

class Gaussian(BaseCustomContinuousScipyDistribution):
    def __init__(self, min_included: int, max_included: int, null_default_value: float = None):
        self.max_included = max_included
        self.min_included = min_included

        BaseCustomContinuousScipyDistribution.__init__(
            self,
            name='gaussian',
            min_included=min_included,
            max_included=max_included,
            null_default_value=null_default_value
        )

    def _pdf(self, x):
        # np.exp (rather than math.exp) also accepts array inputs.
        return np.exp(-x ** 2 / 2.) / np.sqrt(2.0 * np.pi)

class Poisson(BaseCustomDiscreteScipyDistribution):
    def __init__(self, min_included: float, max_included: float, null_default_value: float = None, mu=0.6):
        super().__init__(
            min_included=min_included,
            max_included=max_included,
            name='poisson',
            null_default_value=null_default_value
        )
        self.mu = mu

    def _pmf(self, x):
        return math.exp(-self.mu) * self.mu ** x / factorial(x)
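
As a quick sanity check of the two formulas above, the standalone sketch below re-implements them as plain functions (gaussian_pdf and poisson_pmf are hypothetical helper names, not part of the pull request) and compares them with the Python standard library.

```python
import math
from statistics import NormalDist


def gaussian_pdf(x):
    """Standard normal density, same expression as Gaussian._pdf above."""
    return math.exp(-x ** 2 / 2.0) / math.sqrt(2.0 * math.pi)


def poisson_pmf(k, mu=0.6):
    """Poisson probability mass, same expression as Poisson._pmf above."""
    return math.exp(-mu) * mu ** k / math.factorial(k)


# The density should agree with the standard library's standard normal.
max_pdf_error = max(abs(gaussian_pdf(x) - NormalDist().pdf(x)) for x in (-1.5, 0.0, 2.0))

# The mass function should sum to 1 over its support (truncated far in the tail here).
poisson_total = sum(poisson_pmf(k) for k in range(50))

print(f"max pdf error: {max_pdf_error:.2e}, poisson total: {poisson_total:.12f}")
```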

Example usage

Here is how you can use this new code as an end user:

auto_ml = AutoML(
        pipeline=pipeline,
        hyperparams_optimizer=TreeParzenEstimatorHyperparameterSelectionStrategy(
            number_of_initial_random_step=10,
            quantile_threshold=0.3,
            number_good_trials_max_cap=25,
            number_possible_hyperparams_candidates=100,
            prior_weight=0.,
            use_linear_forgetting_weights=False,
            number_recent_trial_at_full_weights=25
        ),
        validation_splitter=ValidationSplitter(0.5),
        scoring_callback=ScoringCallback(mean_squared_error, higher_score_is_better=False),
        callbacks=[
            MetricCallback('mse', metric_function=mean_squared_error, higher_score_is_better=False),
        ],
        n_trials=n_trials,
        refit_trial=True,
        epochs=n_epochs,
        hyperparams_repository=hp_repository
)
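
The use_linear_forgetting_weights and number_recent_trial_at_full_weights arguments are not explained in this description. One common scheme in TPE implementations keeps the most recent trials at weight 1 and ramps the older ones up linearly from 1/N. The sketch below illustrates that scheme only; whether this pull request uses exactly this formula is an assumption, and the function name is hypothetical.

```python
def linear_forgetting_weights(n_trials, n_recent_at_full_weight):
    """Weights for trials ordered from oldest to newest (assumed scheme, see above)."""
    if n_trials <= n_recent_at_full_weight:
        # Not enough history yet: every trial counts fully.
        return [1.0] * n_trials

    n_ramp = n_trials - n_recent_at_full_weight
    start = 1.0 / n_trials
    if n_ramp == 1:
        ramp = [start]
    else:
        # Evenly spaced weights from 1/N up to 1 for the oldest trials.
        step = (1.0 - start) / (n_ramp - 1)
        ramp = [start + i * step for i in range(n_ramp)]

    # The most recent trials keep their full weight.
    return ramp + [1.0] * n_recent_at_full_weight


weights = linear_forgetting_weights(n_trials=30, n_recent_at_full_weight=25)
print([round(w, 3) for w in weights[:6]])
```

With the flag set to False, as in the example above, the equivalent behavior would presumably be a weight of 1 for every trial.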
cla-bot[bot] commented 4 years ago

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Éric Hamel. This is most likely caused by a git client misconfiguration; please make sure to:

  1. Check that your git client is configured with an email to sign commits: git config --list | grep email
  2. If not, set it up using: git config --global user.email email@example.com
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails