tensorflow / tensor2tensor

Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
Apache License 2.0
15.5k stars 3.49k forks source link

No objectiveValue in the output json, after running a hyper parameters tuning job on Google's ML engine. . #1527

Open orimosenzonkami opened 5 years ago

orimosenzonkami commented 5 years ago

Description

Hi, I'm new to t2t and I'm trying to follow this poetry generation example.

In the Hyperparameter tuning stage, after running:

%%bash
DATADIR=gs://${BUCKET}/poetry/data
OUTDIR=gs://${BUCKET}/poetry/model_hparam
JOBNAME=poetry_$(date -u +%y%m%d_%H%M%S)
echo $OUTDIR $REGION $JOBNAME
gsutil -m rm -rf $OUTDIR
echo "'Y'" | t2t-trainer \
  --data_dir=gs://${BUCKET}/poetry/subset \
  --t2t_usr_dir=./poetry/trainer \
  --problem=$PROBLEM \
  --model=transformer \
  --hparams_set=transformer_poetry \
  --output_dir=$OUTDIR \
  --hparams_range=transformer_poetry_range \
  --autotune_objective='metrics-poetry_line_problem/accuracy_per_sequence' \
  --autotune_maximize \
  --autotune_max_trials=4 \
  --autotune_parallel_trials=4 \
  --train_steps=7500 --cloud_mlengine --worker_gpu=8

I get the following output json:

{
  "completedTrialCount": "4",
  "trials": [
    {
      "trialId": "1",
      "hyperparameters": {
        "hp_attention_dropout": "0.68926619291305535",
        "hp_hidden_size": "256",
        "hp_learning_rate": "0.094830311097768627",
        "hp_num_hidden_layers": "3"
      }
    },
...
  ],
  "consumedMLUnits": 59.57,
  "isHyperparameterTuningJob": true
}

Whereas, I was expecting to find an objective value of the metric I've specified. In the notebook I'm following, it states that the output json should contain something like:

{
      "trialId": "37",
      "hyperparameters": {
        "hp_num_hidden_layers": "4",
        "hp_learning_rate": "0.026711152525921437",
        "hp_hidden_size": "512",
        "hp_attention_dropout": "0.60589466163419292"
      },
      "finalMetric": {
        "trainingStep": "8000",
        "objectiveValue": 0.0276162791997
      }

What have I missed? thanks...

Environment information

google cloud's data lab (I guess it is running on a docker) 

$ pip freeze | grep tensor
mesh-tensorflow==0.0.5
tensor2tensor==1.10.0
tensorboard==1.10.0
tensorflow==1.10.0

$ python -V
Python 2.7.15 :: Anaconda, Inc.
daikeshi commented 4 years ago

I think the missing objectiveValue in the trial is 0, because objectiveValue is a float in its proto that defaults to 0 and

If a field has the default value in the protocol buffer, it will be omitted in the JSON-encoded data by default to save space.

according to https://developers.google.com/protocol-buffers/docs/proto3#json https://github.com/googleapis/go-genproto/blob/master/googleapis/cloud/ml/v1/job_service.pb.go#L818