tensorflow / adanet

Fast and flexible AutoML with learning guarantees.
https://adanet.readthedocs.io
Apache License 2.0

What's the difference between `adanet.Estimator` and `adanet.AutoEnsembleEstimator` #134

Open zhiqwang opened 4 years ago

zhiqwang commented 4 years ago

Hi adanet team, I'm confused about the difference between the adanet.Estimator and adanet.AutoEnsembleEstimator APIs. I noticed that adanet.AutoEnsembleEstimator was released in version 0.4, but the tutorials provided here all use adanet.Estimator. Is there any guidance on how to choose between these two APIs?

jeffltc commented 4 years ago

In addition to that question: when I use adanet.AutoEnsembleEstimator to train a model from a candidate pool of four different DNN subnetworks, I always get a final network with only one subnetwork. I don't know whether I have done something wrong with AutoEnsembleEstimator or that is just how it works. From my understanding, adanet.AutoEnsembleEstimator can ensemble the subnetworks automatically. Does that mean I can only ensemble subnetworks with adanet.Estimator, the way the tutorial does?

Here is my code and its output.

# Lint as: python3
import numpy as np
import tensorflow as tf
from datetime import datetime

from absl import app
import adanet

def main(args):
  (x_train, y_train), (x_test, y_test) = (
      tf.keras.datasets.boston_housing.load_data())

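  # input_fn returns a closure that serves the chosen split as one full batch.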
  def input_fn(partition):

    def _input_fn():
      feat_tensor_dict = {}
      if partition == 'train':
        x = x_train.copy()
        y = y_train.copy()
      else:
        x = x_test.copy()
        y = y_test.copy()
      for i in range(0, np.size(x, 1)):
        feat_nam = ('feat' + str(i))
        feat_tensor_dict[feat_nam] = tf.convert_to_tensor(
            x[:, i], dtype=tf.float32)
      label_tensor = tf.convert_to_tensor(y, dtype=tf.float32)
      return (feat_tensor_dict, label_tensor)

    return _input_fn

  feat_nam_lst = ['feat' + str(i) for i in range(0, np.size(x_train, 1))]

  feature_columns = []
  for item in feat_nam_lst:
    feature_columns.append(tf.feature_column.numeric_column(item))

  head = tf.estimator.RegressionHead(1)

  lr_estimator = tf.estimator.LinearEstimator(
      head=head, feature_columns=feature_columns)

  dnn_estimator_1 = tf.estimator.DNNRegressor(
      feature_columns=feature_columns, hidden_units=[5])

  dnn_estimator_2 = tf.estimator.DNNRegressor(
      feature_columns=feature_columns, hidden_units=[5, 5])

  dnn_estimator_3 = tf.estimator.DNNRegressor(
      feature_columns=feature_columns, hidden_units=[100, 100])

  dnn_estimator_4 = tf.estimator.DNNRegressor(
      feature_columns=feature_columns, hidden_units=[50, 1500])

  folder_dir = "/Users/zhangjue/Desktop/autoensemble/"
  logdir_adanet = folder_dir + "adanet/" + datetime.now().strftime("%Y%m%d-%H%M%S")

  config = tf.estimator.RunConfig(model_dir=logdir_adanet)
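  # AutoEnsembleEstimator wraps each candidate estimator as an AdaNet subnetwork.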
  estimator = adanet.AutoEnsembleEstimator(
      head=head,
      candidate_pool=lambda config: {
          'dnn1': dnn_estimator_1,
          'dnn2': dnn_estimator_2,
          'dnn3': dnn_estimator_3,
          'dnn4': dnn_estimator_4
      },
      max_iteration_steps=5000,
      config=config)

  train_spec = tf.estimator.TrainSpec(
      input_fn=input_fn(partition='train'),
      max_steps=5000)
  eval_spec = tf.estimator.EvalSpec(
      input_fn=input_fn(partition='test'))

  result, _ = tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
  print(result)

if __name__ == "__main__":
  app.run(main)

Here is the result.

{'architecture/adanet/ensembles': b'\n/\n\x13architecture/adanetB\x0e\x08\x07\x12\x00B\x08| dnn4 |J\x08\n\x06\n\x04text', 'average_loss': 28.492062, 'best_ensemble_index_0': 3, 'iteration': 0, 'label/mean': 23.078432, 'loss': 28.492102, 'prediction/mean': 22.895338, 'global_step': 5000}

cweill commented 4 years ago

@jeffltc If you want to do multiple boosting rounds, make sure that max_steps > max_iteration_steps. For example, if max_steps == max_iteration_steps, you will only do one round (no boosting). If max_steps == 3 * max_iteration_steps, it will boost for three rounds.
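In other words, the number of AdaNet iterations is roughly max_steps divided by max_iteration_steps. A small illustration (the NUM_ROUNDS/MAX_ITERATION_STEPS names and values are arbitrary):

NUM_ROUNDS = 3
MAX_ITERATION_STEPS = 5000  # must match the estimator's max_iteration_steps

train_spec = tf.estimator.TrainSpec(
    input_fn=input_fn(partition='train'),
    # 15000 total steps / 5000 steps per iteration => 3 boosting rounds.
    max_steps=NUM_ROUNDS * MAX_ITERATION_STEPS)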

You should also create your estimators inside the candidate_pool lambda, use an adanet.Evaluator to evaluate candidate performance, and pass ensemble_strategies to try different ensembling techniques. For example:

# Lint as: python3
import numpy as np
import tensorflow as tf
from datetime import datetime

from absl import app
import adanet

def main(args):
  (x_train, y_train), (x_test, y_test) = (
      tf.keras.datasets.boston_housing.load_data())

  def input_fn(partition):

    def _input_fn():
      feat_tensor_dict = {}
      if partition == 'train':
        x = x_train.copy()
        y = y_train.copy()
      else:
        x = x_test.copy()
        y = y_test.copy()
      for i in range(0, np.size(x, 1)):
        feat_nam = ('feat' + str(i))
        feat_tensor_dict[feat_nam] = tf.convert_to_tensor(
            x[:, i], dtype=tf.float32)
      label_tensor = tf.convert_to_tensor(y, dtype=tf.float32)
      return (feat_tensor_dict, label_tensor)

    return _input_fn

  feat_nam_lst = ['feat' + str(i) for i in range(0, np.size(x_train, 1))]

  feature_columns = []
  for item in feat_nam_lst:
    feature_columns.append(tf.feature_column.numeric_column(item))

  head = tf.estimator.RegressionHead(1)

  folder_dir = "/Users/zhangjue/Desktop/autoensemble/"
  logdir_adanet = folder_dir + "adanet/" + datetime.now().strftime("%Y%m%d-%H%M%S")

  config = tf.estimator.RunConfig(model_dir=logdir_adanet)
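  # GrowStrategy grows the ensemble by one subnetwork per iteration; AllStrategy
  # also proposes ensembling all candidates together. The Evaluator ranks the
  # candidate ensembles on the given input_fn at the end of each iteration.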
  estimator = adanet.AutoEnsembleEstimator(
      head=head,
      ensemble_strategies=[
          adanet.ensemble.GrowStrategy(), 
          adanet.ensemble.AllStrategy(),
      ],
      candidate_pool=lambda config: {
          "lr": tf.estimator.LinearEstimator(
              head=head, feature_columns=feature_columns, config=config),
          "dnn1": tf.estimator.DNNRegressor(
              feature_columns=feature_columns, hidden_units=[5], config=config),
          "dnn2": tf.estimator.DNNRegressor(
              feature_columns=feature_columns, hidden_units=[5, 5], config=config),
          "dnn3": tf.estimator.DNNRegressor(
              feature_columns=feature_columns, hidden_units=[100, 100], config=config),
          "dnn4": tf.estimator.DNNRegressor(
              feature_columns=feature_columns, hidden_units=[50, 1500], config=config),
      },
      max_iteration_steps=5000,
      evaluator=adanet.Evaluator(input_fn=input_fn(partition='test')),
      config=config)

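  # max_steps = 3 * max_iteration_steps => three AdaNet iterations (boosting rounds).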
  train_spec = tf.estimator.TrainSpec(
      input_fn=input_fn(partition='train'),
      max_steps=5000 * 3)
  eval_spec = tf.estimator.EvalSpec(
      input_fn=input_fn(partition='test'))

  result, _ = tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
  print(result)

if __name__ == "__main__":
  app.run(main)

jeffltc commented 4 years ago

@cweill Thank you so much! Crystal clear! Could you also explain the difference between adanet.Estimator and adanet.AutoEnsembleEstimator? Sorry for tacking an extra question onto the original one.

cweill commented 4 years ago

@jeffltc I'm glad I could help!

@zhiqwang: AutoEnsembleEstimator and adanet.Estimator are very similar.

AutoEnsembleEstimator is a thin wrapper around adanet.Estimator: it converts tf.estimator.Estimator instances into adanet.subnetwork.Builders for adanet.Estimator to train and combine into ensembles.

If you already have tf.estimator.Estimators you want to ensemble, use AutoEnsembleEstimator. But if you want more control, or want to do something more sophisticated with the TensorFlow graph, use adanet.Estimator directly.
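For a concrete picture of what that extra control looks like, here is a rough, untested sketch of the adanet.Estimator route, loosely following the shape of the subnetwork API from the AdaNet tutorials; the _DNNBuilder and _DNNGenerator names are made up for illustration, and TF 1.x-style tf.compat.v1 APIs are assumed:

import adanet
import tensorflow as tf

class _DNNBuilder(adanet.subnetwork.Builder):
  """Defines one candidate subnetwork; you control its graph directly."""

  def __init__(self, num_layers, layer_size=64):
    self._num_layers = num_layers
    self._layer_size = layer_size

  def build_subnetwork(self, features, logits_dimension, training,
                       iteration_step, summary, previous_ensemble=None):
    # Arbitrary TensorFlow ops can go here: custom layers, weight sharing
    # with previous_ensemble, extra summaries, and so on.
    last_layer = tf.concat(
        [tf.reshape(tf.cast(v, tf.float32), [-1, 1]) for v in features.values()],
        axis=1)
    for _ in range(self._num_layers):
      last_layer = tf.compat.v1.layers.dense(
          last_layer, self._layer_size, activation=tf.nn.relu)
    logits = tf.compat.v1.layers.dense(last_layer, logits_dimension)
    return adanet.Subnetwork(
        last_layer=last_layer,
        logits=logits,
        # Complexity feeds AdaNet's learning guarantees; deeper = more complex.
        complexity=tf.sqrt(tf.cast(self._num_layers, tf.float32)),
        persisted_tensors={})

  def build_subnetwork_train_op(self, subnetwork, loss, var_list, labels,
                                iteration_step, summary, previous_ensemble=None):
    return tf.compat.v1.train.AdamOptimizer().minimize(loss, var_list=var_list)

  def build_mixture_weights_train_op(self, loss, var_list, logits, labels,
                                     iteration_step, summary):
    # Keep the default mixture weights rather than learning them.
    return tf.no_op()

  @property
  def name(self):
    return 'dnn_{}_layers'.format(self._num_layers)

class _DNNGenerator(adanet.subnetwork.Generator):
  """Proposes the candidate subnetworks to try at each AdaNet iteration."""

  def generate_candidates(self, previous_ensemble, iteration_number,
                          previous_ensemble_reports, all_reports):
    return [_DNNBuilder(num_layers=n) for n in (1, 2)]

estimator = adanet.Estimator(
    head=tf.estimator.RegressionHead(1),
    subnetwork_generator=_DNNGenerator(),
    max_iteration_steps=5000)

AutoEnsembleEstimator effectively generates builders like these for you from the candidate_pool, so you only need this route when the canned estimators are not flexible enough.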