google-research / text-to-text-transfer-transformer

Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
https://arxiv.org/abs/1910.10683
Apache License 2.0
6.11k stars 753 forks

model.eval() : 'list' object not callable #243

Closed mmcs-work closed 4 years ago

mmcs-work commented 4 years ago

I am trying to train the T5 model from scratch on the IMDB dataset. Instead of using TensorFlow Datasets, I am using the raw dataset and preprocessing it accordingly. Training on this dataset works correctly (the loss curve is reasonable). But when I run model.eval() on the test set, or even on the train set (the same data the model was trained on), I get this error.

INFO:tensorflow:Using config: {'_model_dir': 'gs://fiery-lcm-000001/models1/small', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
cluster_def {
  job {
    name: "worker"
    tasks {
      key: 0
      value: "10.14.56.218:8470"
    }
  }
}
isolate_session_state: true
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': None, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': ClusterSpec({'worker': ['10.14.56.218:8470']}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': 'grpc://10.14.56.218:8470', '_evaluation_master': 'grpc://10.14.56.218:8470', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1, '_tpu_config': TPUConfig(iterations_per_loop=100, num_shards=None, num_cores_per_replica=1, per_host_input_for_training=4, tpu_job_name=None, initial_infeed_sleep_secs=None, input_partition_dims=None, eval_training_input_configuration=2, experimental_host_call_every_n_steps=1), '_cluster': <tensorflow.python.distribute.cluster_resolver.tpu_cluster_resolver.TPUClusterResolver object at 0x7f7a17bb1240>}
INFO:tensorflow:_TPUContext: eval_on_tpu True
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-13-4b2a7485e77f> in <module>()
      4     mixture_or_task_name="trivia_all",
      5     checkpoint_steps="all",
----> 6     split="test"
      7 )

2 frames
/usr/local/lib/python3.6/dist-packages/t5/models/mtf_model.py in eval(self, mixture_or_task_name, checkpoint_steps, summary_dir, split)
    265     utils.eval_model(self.estimator(vocabulary), vocabulary,
    266                      self._sequence_length, self.batch_size, split,
--> 267                      self._model_dir, dataset_fn, summary_dir, checkpoint_steps)
    268 
    269   def finetune(self, mixture_or_task_name, finetune_steps, pretrained_model_dir,

/usr/local/lib/python3.6/dist-packages/mesh_tensorflow/transformer/utils.py in eval_model(estimator, vocabulary, sequence_length, batch_size, dataset_split, model_dir, eval_dataset_fn, eval_summary_dir, eval_checkpoint_step)
   1261                 tf.compat.as_text(ex["targets_plaintext"]),
   1262                 example=ex, is_target=True)
-> 1263             for ex in examples
   1264         ]
   1265         targets_filename = os.path.join(

/usr/local/lib/python3.6/dist-packages/mesh_tensorflow/transformer/utils.py in <listcomp>(.0)
   1261                 tf.compat.as_text(ex["targets_plaintext"]),
   1262                 example=ex, is_target=True)
-> 1263             for ex in examples
   1264         ]
   1265         targets_filename = os.path.join(

TypeError: 'list' object is not callable
  1. If there is a problem with the pre-processed data, shouldn't it affect the training process as well?
  2. If anybody has any idea regarding this issue, please let me know.

Here is the notebook link.

adarob commented 4 years ago

The issue is that you have the postprocess_fn in a list. It should just be a single function.
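A minimal, self-contained sketch of the failure mode (the names below are illustrative, not the actual t5 API): per the traceback, eval_model calls `postprocess_fn(tf.compat.as_text(ex["targets_plaintext"]), example=ex, is_target=True)` for each example, so wrapping the function in a list makes that call blow up with the exact error above.

```python
def lower_text(text, **unused_kwargs):
    """An example post-processor: normalize target text to lowercase."""
    return text.lower()

# What the notebook did: a one-element list. A list is not callable.
postprocess_fn = [lower_text]
try:
    postprocess_fn("Some Target", example=None, is_target=True)
except TypeError as e:
    print(e)  # 'list' object is not callable

# The fix: pass the function itself.
postprocess_fn = lower_text
print(postprocess_fn("Some Target", example=None, is_target=True))  # some target
```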

mmcs-work commented 4 years ago

> The issue is that you have the postprocess_fn in a list. It should just be a single function.

Thanks for the reply.

  1. I am surprised that it didn't cause any problem in the training phase (since the post-process function is applied there as well). Or am I missing something?

  2. Also, if one wants to apply multiple post-process functions, do they have to create a wrapper function that does all that work under the hood, and pass that single wrapper as postprocess_fn?

adarob commented 4 years ago
  1. Post-process is only applied during evaluation actually.
  2. That's correct, although we'd happily accept a PR to change the behavior to be more like the preprocessors!
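The wrapper approach confirmed above can be sketched in plain Python (compose_postprocessors and the sample post-processors are hypothetical helpers, not part of the t5 library):

```python
def compose_postprocessors(*fns):
    """Return a single post-processor that applies fns left-to-right."""
    def combined(output, **kwargs):
        for fn in fns:
            output = fn(output, **kwargs)
        return output
    return combined

# Illustrative post-processors with the (text, **kwargs) signature
# that eval_model expects of postprocess_fn:
def strip_text(text, **unused_kwargs):
    return text.strip()

def lower_text(text, **unused_kwargs):
    return text.lower()

# Pass the single composed callable, not a list of functions.
postprocess_fn = compose_postprocessors(strip_text, lower_text)
print(postprocess_fn("  Some Target  ", example=None, is_target=True))  # some target
```

Each wrapped function receives the same keyword arguments (example, is_target), so any function in the chain can still inspect them if needed.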