tensorflow / adanet

Fast and flexible AutoML with learning guarantees.
https://adanet.readthedocs.io
Apache License 2.0

How-to serve #22

Open JimAva opened 6 years ago

JimAva commented 6 years ago

How-to question: I've been testing and learning with the "adanet_objective" sample. How do you use the recommended model to run sample predictions and eventually serve it for a live data feed?

cweill commented 6 years ago

If you want to test predictions on a few batches or a tf.data.Dataset you can use Estimator.predict.

To serve a live data feed, you first need to call Estimator.export_saved_model. Then you can serve the exported tf.SavedModel using TensorFlow Serving.

raijinspecial commented 6 years ago

Sorry for a dumb question, but if it will keep me and anyone else having this issue from banging their head against the keyboard, I will ask it.

I've been testing the tutorial model for the boston house prices and am also trying to get Estimator.predict to predict. I can export the model, but calling predict always gives one of two errors.

final_prediction = adanet.Estimator.predict(input_fn, test_features, checkpoint_path=checkpoint_path) yields final_prediction as a generator

calling next(final_prediction) results in

AttributeError                            Traceback (most recent call last)
<ipython-input-124-104586511cd7> in <module>
----> 1 next(final_prediction)

c:\users\raiji\anaconda3\envs\wat\lib\site-packages\tensorflow\python\estimator\estimator.py in predict(self, input_fn, predict_keys, hooks, checkpoint_path, yield_single_examples)
    570                      'initialization to predict.'.format(self._model_dir))
    571       with ops.Graph().as_default() as g:
--> 572         random_seed.set_random_seed(self._config.tf_random_seed)
    573         self._create_and_assert_global_step(g)
    574         features, input_hooks = self._get_features_from_input_fn(

AttributeError: 'function' object has no attribute '_config'

Giving input_fn with partition=test: adanet.Estimator.predict(input_fn=input_fn, checkpoint_path=checkpoint_path) gives this error

TypeError                                 Traceback (most recent call last)
<ipython-input-118-982e61286970> in <module>
----> 1 final_prediction = adanet.Estimator.predict(input_fn=input_fn, checkpoint_path=checkpoint_path)

TypeError: predict() missing 1 required positional argument: 'self'

I get the same errors calling tf.estimator.Estimator.predict.

I assume I'm missing something trivial. Thanks for the great work!

JimAva commented 6 years ago

It would be great if we could get the sample code on exporting/saving model and prediction. Having a difficult time figuring these steps out. Thank you in advance.

cweill commented 6 years ago

You need to call these methods on your estimator = adanet.Estimator() instance, not on the class. So change final_prediction = adanet.Estimator.predict(input_fn) to be final_prediction = estimator.predict(input_fn)
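
For anyone wondering why the two calls above fail differently, here is a minimal pure-Python sketch. ToyEstimator and ToyConfig are hypothetical stand-ins I made up for illustration, not the real TensorFlow or AdaNet classes; they only mimic the fact that predict is a generator method that reads self._config.

```python
class ToyConfig:
    """Hypothetical stand-in for a RunConfig."""
    tf_random_seed = 42


class ToyEstimator:
    """Hypothetical stand-in for an Estimator; not the real TensorFlow class."""

    def __init__(self):
        self._config = ToyConfig()

    def predict(self, input_fn, predict_keys=None, checkpoint_path=None):
        # This is a generator, so the body only runs on the first next() call.
        seed = self._config.tf_random_seed
        for features in input_fn():
            yield {"predictions": features, "seed": seed}


def input_fn():
    return [1.0, 2.0, 3.0]


first_error = None
second_error = None

# Called on the class with positional arguments: input_fn is bound to `self`.
# The call itself succeeds and returns a generator; it fails on next().
broken = ToyEstimator.predict(input_fn, [0.0])
try:
    next(broken)
except AttributeError as err:
    first_error = str(err)

# Called on the class with keyword arguments only: nothing is bound to `self`,
# so Python rejects the call immediately.
try:
    ToyEstimator.predict(input_fn=input_fn)
except TypeError as err:
    second_error = str(err)

# Called on an instance: `self` is supplied automatically.
estimator = ToyEstimator()
first_prediction = next(estimator.predict(input_fn))

print(first_error)
print(second_error)
print(first_prediction)
```

In the first case the function object ends up as self, which is why the traceback complains that a 'function' has no attribute '_config'.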

JimAva commented 6 years ago

raijinspecial - how did you save your model? Would you mind sharing the code for saving and predicting? Thank you guys.

cweill commented 6 years ago

The following test goes through the full lifecycle of the adanet.Estimator:

https://github.com/tensorflow/adanet/blob/v0.3.0/adanet/core/estimator_test.py#L555

You can use it as a reference.

JimAva commented 6 years ago

Thank you for the reference. I'm not a python guru so I'm still struggling to get the code working for prediction and saving. Any way to provide working code after running the "Adanet_objective.ipynb" code? It'd be greatly appreciated for such a great project.

Thank you in advance.

raijinspecial commented 6 years ago

Ok, so I figured out how to make predictions with the regression model. The tutorial example wraps the estimator in the train_and_evaluate function, so trying to call it from outside will obviously not work. Here is a simple example exposing estimator:

#@title AdaNet parameters
LEARNING_RATE = 0.0015  #@param {type:"number"}
TRAIN_STEPS = 100000  #@param {type:"integer"}
BATCH_SIZE = 32  #@param {type:"integer"}

LEARN_MIXTURE_WEIGHTS = True  #@param {type:"boolean"}
ADANET_LAMBDA = 0.018  #@param {type:"number"}
BOOSTING_ITERATIONS = 8  #@param {type:"integer"}

def train_and_evaluate(learn_mixture_weights=LEARN_MIXTURE_WEIGHTS,
                       adanet_lambda=ADANET_LAMBDA):
    return tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)

estimator = adanet.Estimator(
      # Since we are predicting housing prices, we'll use a regression
      # head that optimizes for MSE.
      head=tf.contrib.estimator.regression_head(
          loss_reduction=tf.losses.Reduction.SUM_OVER_BATCH_SIZE),

      # Define the generator, which defines our search space of subnetworks
      # to train as candidates to add to the final AdaNet model.
      subnetwork_generator=SimpleDNNGenerator(
          optimizer=tf.train.RMSPropOptimizer(learning_rate=LEARNING_RATE),
          learn_mixture_weights=LEARN_MIXTURE_WEIGHTS,
          seed=RANDOM_SEED),

      # Lambda is the strength of complexity regularization. A larger
      # value will penalize more complex subnetworks.
      adanet_lambda=ADANET_LAMBDA,

      # The number of train steps per iteration.
      max_iteration_steps=TRAIN_STEPS // BOOSTING_ITERATIONS,

      # The evaluator will evaluate the model on the full training set to
      # compute the overall AdaNet loss (train loss + complexity
      # regularization) to select the best candidate to include in the
      # final AdaNet model.
      evaluator=adanet.Evaluator(
          input_fn=input_fn("train", training=False, batch_size=BATCH_SIZE)),

      # Configuration for Estimators.
      config=tf.estimator.RunConfig(
          save_checkpoints_steps=50000,
          save_summary_steps=50000,
          tf_random_seed=RANDOM_SEED))

# Train and evaluate using the tf.estimator tooling.
train_spec = tf.estimator.TrainSpec(
      input_fn=input_fn("train", training=True, batch_size=BATCH_SIZE),
      max_steps=TRAIN_STEPS)
eval_spec = tf.estimator.EvalSpec(
      input_fn=input_fn("test", training=False, batch_size=BATCH_SIZE),
      steps=None)
# return tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)

To train you can just call train_and_evaluate like in the tutorial.

Predicting was trickier, at least for me. I'm sure there are cleaner ways of doing it, but I'm still learning, and here is what worked for me. To give the model new test data you have to write a new input function. I adapted the input_fn given in the tutorial like so:

FEATURES_KEYy = "x"

def test_input_fn(x, batch_size):
  """Generate an input function for the Estimator."""

  def _input_fn():

    dataset = tf.data.Dataset.from_tensor_slices({
          FEATURES_KEYy: x.values
      })

    dataset = dataset.batch(batch_size)
    iterator = dataset.make_one_shot_iterator()
    features = iterator.get_next()
    return features

  return _input_fn

It says x.values because I was giving it test data from a Boston house price test CSV from Kaggle, converted into a pandas DataFrame, here called x_test.

predict_results = estimator.predict(input_fn=test_input_fn(x_test, len(x_test)), yield_single_examples=False)

predict_results (or whatever you call your predictions) is a generator. To see what's inside it, call:

next(predict_results) or p = next(predict_results)

This shows the predictions as a dict. If you want the results as an array, a series, or whatever, extract the dict values:

pr=next(iter(p.values()))
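
One thing to watch: next() only pulls one item from the generator. With batch_size=len(x_test) and yield_single_examples=False that single item is the whole test set, but with smaller batches you would need to loop. A rough pure-Python sketch of the pattern follows; fake_predict is a hypothetical stand-in for estimator.predict, and both the "predictions" key and the numbers are assumptions made up for illustration.

```python
def fake_predict(batches):
    """Hypothetical stand-in for estimator.predict(..., yield_single_examples=False)."""
    for batch in batches:
        # One dict per batch; the "predictions" key is an assumption.
        yield {"predictions": batch}


# Two made-up batches of predictions.
predict_results = fake_predict([[[21.5], [30.1]], [[18.7]]])

# next() consumes only the first batch.
p = next(predict_results)
first_values = next(iter(p.values()))

# Looping over the generator collects whatever batches are left.
remaining = [row for batch in predict_results for row in batch["predictions"]]

all_values = first_values + remaining
print(all_values)
```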

To check that the predictions were sane, I submitted them to the Kaggle competition and they scored 0.16. (I did engineer the training data a bit first rather than use the plain keras.datasets data given in the tutorial, i.e. my x_train was (1459, 197).) That's not bad at all for leaving the AdaNet parameters untuned, beyond the very small tweaks to learning rate and lambda that you can see in the code above.

Predicting on classification is up next.

As for exporting the model, so far I could only export the pb file with the classification model, and I'm not sure it's right, so I don't want to mislead anyone. I'll report back when I'm sure I can get it working. GL

JimAva commented 6 years ago

Thank you for sharing. Please keep it coming. This really helps.

ghost commented 5 years ago

Sorry for a dumb question. I just use estimator.RunConfig(model_dir="./model_ada") to get the checkpoint and some event files, but how do I use the final checkpoint to test? And where can I get the final learned overall architecture? Thank you very much!

tomalbrecht commented 5 years ago

I found some hints at: https://adanet.readthedocs.io/en/v0.5.0/adanet.html. Look at: export_saved_model(export_dir_base, serving_input_receiver_fn, assets_extra=None, as_text=False, checkpoint_path=None)

It seems similar to: https://www.tensorflow.org/api_docs/python/tf/estimator/Estimator#export_saved_model

What I did:

First I changed train_and_evaluate so that it also returns the estimator: return tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec), estimator
After this I got the estimator:

results, estimator = train_and_evaluate("uniform_average_ensemble_baseline")
results = results[0]
print(results)
print("Loss:", results["average_loss"])
print("Architecture:", ensemble_architecture(results))
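
In case the results = results[0] line looks odd: tf.estimator.train_and_evaluate itself returns a tuple of (evaluation metrics, export results), so after the change the function returns a tuple nested inside a tuple. A small pure-Python sketch of just the unpacking, where fake_train_and_evaluate is a hypothetical stand-in and the metric values are made up:

```python
def fake_train_and_evaluate():
    """Hypothetical stand-in returning (eval_metrics, export_results)."""
    metrics = {"average_loss": 0.03, "global_step": 100}  # made-up values
    export_results = []
    return metrics, export_results


def train_and_evaluate():
    estimator = object()  # placeholder for the adanet.Estimator instance
    # Returns ((metrics, export_results), estimator).
    return fake_train_and_evaluate(), estimator


results, estimator = train_and_evaluate()
# results is still the inner (metrics, export_results) tuple here.
metrics = results[0]
print("Loss:", metrics["average_loss"])
```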


And then I tried to export/save the estimator:

def serving_input_fn():
  """Input fn for serving export, starting from serialized example."""
  serialized_example = tf.placeholder(
      dtype=tf.string, shape=(None), name="serialized_example")
  return tf.estimator.export.ServingInputReceiver(
      features={"x": tf.constant([[0., 0.,0., 0.,0., 0.,0., 0.,0., 0.,0., 0.,0.]], name="serving_x")},
      receiver_tensors=serialized_example)

estimator.export_saved_model('./adamet_3', 
                             serving_input_fn, 
                             #assets_extra=None, 
                             #as_text=False, 
                             #checkpoint_path=None
                            )

The export succeeds, but the features are still a hard-coded tf.constant, so I still have to replace that with a real input before the model can serve real data. I hope this helps a little bit.