GoogleCloudPlatform / cloudml-samples

Cloud ML Engine repo. Please visit the new Vertex AI samples repo at https://github.com/GoogleCloudPlatform/vertex-ai-samples
https://cloud.google.com/ai-platform/docs/
Apache License 2.0

Question to the keras example for HP Tuning #429

Closed datistiquo closed 5 years ago

datistiquo commented 5 years ago

The Keras example is listed under the HP tuning section, but I cannot see anything related to tuning in it. https://github.com/GoogleCloudPlatform/cloudml-samples/tree/master/census/keras

I have just started with GCP. Don't you need an estimator for HP tuning? Where is the estimator created in the example above?

Working with GCP is really hard at the beginning. I am struggling to use my Keras model for HP tuning on GCP. Is there an example anywhere?

As an intro, I followed https://cloud.google.com/ml-engine/docs/tensorflow/getting-started-keras?hl=de#visualize_training_and_export_the_trained_model

But do you need to save the Keras model as a SavedModel for HP tuning?

nnegrey commented 5 years ago

Hi, thanks for trying out GCP.

@andrewferlitsch, mind helping them get started with Keras on GCP?

andrewferlitsch commented 5 years ago

@datistiquo

  1. The estimator model is not required for HP Tuning.

  2. The SavedModel format is not required for HP Tuning.

  3. To do HP Tuning, you need to include the hptuning_config.yaml file as explained in the tutorial:

gcloud ai-platform jobs submit training $JOB_NAME-hpt \
  --config hptuning_config.yaml \
  --package-path trainer/ \
  --module-name trainer.task \
  --region $REGION \
  --python-version 3.5 \
  --runtime-version 1.13 \
  --job-dir $JOB_DIR \
  --stream-logs
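For a plain Keras model, the link between the config file and the training code is the command line: the tuning service starts each trial by passing every tuned hyperparameter to the training module as a flag named after its parameterName, alongside --job-dir. A minimal sketch of the argument handling in trainer/task.py follows. The flag names --batch-size and --num-epochs and their defaults are assumptions for illustration, not taken from the sample.

```python
import argparse


def parse_args(argv=None):
    """Parse the flags a trial is started with (flag names are illustrative)."""
    parser = argparse.ArgumentParser()
    # Forwarded from --job-dir on the gcloud command line.
    parser.add_argument("--job-dir", default=None,
                        help="Output location for checkpoints and summaries")
    # Hypothetical tuned hyperparameter: the flag name has to match the
    # parameterName declared in hptuning_config.yaml.
    parser.add_argument("--batch-size", type=int, default=128)
    # Hypothetical fixed setting that is not tuned.
    parser.add_argument("--num-epochs", type=int, default=20)
    # parse_known_args tolerates extra flags instead of exiting.
    args, _unknown = parser.parse_known_args(argv)
    return args


# Simulate the flags one trial might receive.
args = parse_args(["--job-dir", "gs://my-bucket/job", "--batch-size=64"])
print(args.batch_size, args.num_epochs, args.job_dir)
```

The parsed values can then be handed to the Keras model as usual, for example model.fit(..., batch_size=args.batch_size, epochs=args.num_epochs). No estimator is involved.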

datistiquo commented 5 years ago

@andrewferlitsch According to this post: https://cloud.google.com/blog/products/gcp/hyperparameter-tuning-on-google-cloud-platform-is-now-faster-and-smarter

are estimators preferred for this task?

If I have a plain Keras model, save it as a SavedModel, and read my hyperparameters with argparse, can I submit the HP tuning job as above too? How would I do that? The doc only covers estimators and TensorFlow core: https://cloud.google.com/ml-engine/docs/tensorflow/using-hyperparameter-tuning

Contrary to the first link in this post, this doc seems to say that you need to write out a summary for every hyperparameter, whereas the first one gives the impression that you need it only for the evaluation metric?

Edit: The last part was due to (accidentally) reading the tutorial in German. The English version is clearer, because "hyperparameter metric" is translated as "Hyperparameter-Messwert", which actually reads as "hyperparameter value":

Fügen Sie Ihren Hyperparameter-Messwert zur Zusammenfassung für Ihre Grafik hinzu

vs

Add your hyperparameter metric to the summary for your graph.

gogasca commented 5 years ago

You can use the tf-keras example for hyperparameter tuning; I added an example recently. I used the callback to write the statistics (it calls tf.summary.FileWriter.add_summary under the hood, which CMLE supports natively).

Please take a look at the documentation for HP tuning:

An example YAML file follows; this one uses GRID_SEARCH:

trainingInput:
  hyperparameters:
    algorithm: GRID_SEARCH
    goal: MAXIMIZE
    maxTrials: 4
    maxParallelTrials: 2
    hyperparameterMetricTag: epoch_acc
    params:
    - parameterName: batch-size
      type: INTEGER
      minValue: 8
      maxValue: 256
      scaleType: UNIT_LINEAR_SCALE

flooreigenhuis commented 5 years ago

@gogasca I followed your example exactly, but I'm getting the error AttributeError: module 'tensorflow._api.v2.train' has no attribute 'SummaryWriter', and I've been stuck for a while. It's odd, because when I start a training job TensorBoard works just fine. I'm using TensorFlow 2.0.0-rc0. Am I missing something?

gogasca commented 5 years ago

@flooreigenhuis It looks like that method does not exist in TF 2.0. Let me run it and provide an update here.

flooreigenhuis commented 5 years ago

@gogasca have you had time to look at it yet or do you have any pointers for me? :)

gogasca commented 5 years ago

You may want to check the SummaryWriter syntax in 2.0 https://github.com/tensorflow/tensorflow/issues/25356
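For reference, a minimal sketch of the TF 2.x summary syntax, assuming TensorFlow 2.x is installed. In 2.x, scalars are written with tf.summary.create_file_writer and tf.summary.scalar; the 1.x-style writer is still reachable as tf.compat.v1.summary.FileWriter. The function name log_epoch_metrics and its arguments are hypothetical, and whether the tuning service reads these event files for a given runtime version should be checked against the HP tuning documentation. Note also that the gcloud command earlier in the thread pins runtime version 1.13, which ships TensorFlow 1.13 rather than 2.0.

```python
def log_epoch_metrics(log_dir, tag, values):
    """Write one scalar per epoch to log_dir using the TF 2.x summary API.

    log_dir, tag and values are illustrative; tag would normally match
    hyperparameterMetricTag in the tuning config.
    """
    # Imported inside the function so this file can be loaded on machines
    # without TensorFlow; calling the function requires TensorFlow 2.x.
    import tensorflow as tf

    writer = tf.summary.create_file_writer(log_dir)
    with writer.as_default():
        for epoch, value in enumerate(values, start=1):
            # step is required in TF 2.x unless a default step is set.
            tf.summary.scalar(tag, value, step=epoch)
    writer.flush()
```

A call such as log_epoch_metrics("/tmp/logs", "epoch_acc", [0.71, 0.78, 0.81]) writes an event file that TensorBoard can read from that directory.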