tensorflow / tensor2tensor

Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
Apache License 2.0

Not the same tensorflow graph for EN-DE translation #1879

Closed sarthmit closed 3 years ago

sarthmit commented 3 years ago

Description

I am trying to run the Universal Transformer on the English-to-German translation task, but training fails with an error saying that two tensors must be from the same graph. I searched and found a similar closed issue, but the solution it links to is no longer available.

$ pip freeze | grep tensor

mesh-tensorflow==0.1.18
tensor2tensor==1.15.7
tensorboard==2.4.1
tensorboard-plugin-wit==1.8.0
tensorflow==2.4.1
tensorflow-addons==0.12.1
tensorflow-datasets==4.2.0
tensorflow-estimator==2.4.0
tensorflow-gan==2.0.0
tensorflow-hub==0.11.0
tensorflow-metadata==0.27.0
tensorflow-probability==0.7.0
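
The list above pairs tensorflow 2.4.1 with tensor2tensor 1.15.7. A minimal sketch, using only the Python standard library and a hypothetical helper `parse_pip_freeze`, that makes the major versions explicit from `pip freeze` output. Whether this pairing is related to the error is an assumption, not something the logs confirm.

```python
def parse_pip_freeze(text):
    """Map package name -> version string for 'name==version' lines."""
    versions = {}
    for line in text.splitlines():
        name, sep, version = line.strip().partition("==")
        if sep:  # skip blank lines and anything that is not name==version
            versions[name] = version
    return versions


# A subset of the `pip freeze | grep tensor` output from this report.
freeze_output = """
mesh-tensorflow==0.1.18
tensor2tensor==1.15.7
tensorflow==2.4.1
tensorflow-estimator==2.4.0
"""

versions = parse_pip_freeze(freeze_output)
tf_major = int(versions["tensorflow"].split(".")[0])
t2t_major = int(versions["tensor2tensor"].split(".")[0])
print("tensorflow major:", tf_major)
print("tensor2tensor major:", t2t_major)
```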

For bugs: reproduction and error logs

t2t-trainer \
  --data_dir=$DATA_DIR \
  --problem=translate_ende_wmt32k \
  --model=universal_transformer \
  --hparams_set=universal_transformer_base \
  --output_dir=$TRAIN_DIR

Error logs:

2021-02-17 10:36:35.448784: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
WARNING:tensorflow:From /home/mila/m/mittalsa/.conda/envs/t2t/lib/python3.6/site-packages/tensorflow_gan/python/estimator/tpu_gan_estimator.py:42: The name tf.estimator.tpu.TPUEstimator is deprecated. Please use tf.compat.v1.estimator.tpu.TPUEstimator instead.

WARNING:tensorflow:From /home/mila/m/mittalsa/.conda/envs/t2t/lib/python3.6/site-packages/tensorflow_gan/python/estimator/tpu_gan_estimator.py:42: The name tf.estimator.tpu.TPUEstimator is deprecated. Please use tf.compat.v1.estimator.tpu.TPUEstimator instead.

INFO:tensorflow:Loading hparams from existing json /home/mila/m/mittalsa/tensor2tensor/t2t_train/translate_ende_wmt32k/universal_transformer-universal_transformer_base/hparams.json
I0217 10:36:58.888788 139921682896704 hparams_lib.py:64] Loading hparams from existing json /home/mila/m/mittalsa/tensor2tensor/t2t_train/translate_ende_wmt32k/universal_transformer-universal_transformer_base/hparams.json
INFO:tensorflow:Configuring DataParallelism to replicate the model.
I0217 10:36:58.893124 139921682896704 trainer_lib.py:271] Configuring DataParallelism to replicate the model.
INFO:tensorflow:schedule=continuous_train_and_eval
I0217 10:36:58.893256 139921682896704 devices.py:76] schedule=continuous_train_and_eval
INFO:tensorflow:worker_gpu=1
I0217 10:36:58.893335 139921682896704 devices.py:77] worker_gpu=1
INFO:tensorflow:sync=False
I0217 10:36:58.893400 139921682896704 devices.py:78] sync=False
WARNING:tensorflow:Schedule=continuous_train_and_eval. Assuming that training is running on a single machine.
W0217 10:36:58.893464 139921682896704 devices.py:141] Schedule=continuous_train_and_eval. Assuming that training is running on a single machine.
INFO:tensorflow:datashard_devices: ['gpu:0']
I0217 10:36:58.894149 139921682896704 devices.py:170] datashard_devices: ['gpu:0']
INFO:tensorflow:caching_devices: None
I0217 10:36:58.894598 139921682896704 devices.py:171] caching_devices: None
INFO:tensorflow:ps_devices: ['gpu:0']
I0217 10:36:58.894993 139921682896704 devices.py:172] ps_devices: ['gpu:0']
INFO:tensorflow:Using config: {'_model_dir': '/home/mila/m/mittalsa/tensor2tensor/t2t_train/translate_ende_wmt32k/universal_transformer-universal_transformer_base', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 1000, '_save_checkpoints_secs': None, '_session_config': gpu_options {
  per_process_gpu_memory_fraction: 0.95   
}
allow_soft_placement: true
graph_options {
  optimizer_options {
    global_jit_level: OFF
  }
}
isolate_session_state: true
, '_keep_checkpoint_max': 20, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_checkpoint_save_graph_def': True, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1, 'use_tpu': False, 't2t_device_info': {'num_async_replicas': 1}, 'data_parallelism': <tensor2tensor.utils.expert_utils.Parallelism object at 0x7f41a3b9b978>}
I0217 10:36:59.021266 139921682896704 estimator.py:191] Using config: {'_model_dir': '/home/mila/m/mittalsa/tensor2tensor/t2t_train/translate_ende_wmt32k/universal_transformer-universal_transformer_base', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 1000, '_save_checkpoints_secs': None, '_session_config': gpu_options {
  per_process_gpu_memory_fraction: 0.95   
}
allow_soft_placement: true
graph_options {
  optimizer_options {
    global_jit_level: OFF
  }
}
isolate_session_state: true
, '_keep_checkpoint_max': 20, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_checkpoint_save_graph_def': True, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1, 'use_tpu': False, 't2t_device_info': {'num_async_replicas': 1}, 'data_parallelism': <tensor2tensor.utils.expert_utils.Parallelism object at 0x7f41a3b9b978>}
WARNING:tensorflow:Estimator's model_fn (<function T2TModel.make_estimator_model_fn.<locals>.wrapping_model_fn at 0x7f41a3da70d0>) includes params argument, but params are not passed to Estimator.
W0217 10:36:59.021582 139921682896704 model_fn.py:629] Estimator's model_fn (<function T2TModel.make_estimator_model_fn.<locals>.wrapping_model_fn at 0x7f41a3da70d0>) includes params argument, but params are not passed to Estimator.
WARNING:tensorflow:ValidationMonitor only works with --schedule=train_and_evaluate
W0217 10:36:59.021708 139921682896704 trainer_lib.py:795] ValidationMonitor only works with --schedule=train_and_evaluate
INFO:tensorflow:Not using Distribute Coordinator.
I0217 10:36:59.036123 139921682896704 estimator_training.py:186] Not using Distribute Coordinator.
INFO:tensorflow:Running training and evaluation locally (non-distributed).
I0217 10:36:59.036407 139921682896704 training.py:645] Running training and evaluation locally (non-distributed).
INFO:tensorflow:Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps 1000 or save_checkpoints_secs None.
I0217 10:36:59.036719 139921682896704 training.py:733] Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps 1000 or save_checkpoints_secs None.
WARNING:tensorflow:From /home/mila/m/mittalsa/.conda/envs/t2t/lib/python3.6/site-packages/tensorflow/python/training/training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
W0217 10:36:59.066916 139921682896704 deprecation.py:339] From /home/mila/m/mittalsa/.conda/envs/t2t/lib/python3.6/site-packages/tensorflow/python/training/training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
2021-02-17 10:36:59.084372: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-02-17 10:36:59.089130: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2021-02-17 10:36:59.164164: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties:
pciBusID: 0000:b1:00.0 name: Quadro RTX 8000 computeCapability: 7.5
coreClock: 1.77GHz coreCount: 72 deviceMemorySize: 47.46GiB deviceMemoryBandwidth: 625.94GiB/s
2021-02-17 10:36:59.164344: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2021-02-17 10:36:59.172861: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2021-02-17 10:36:59.173078: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.11
2021-02-17 10:36:59.178167: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2021-02-17 10:36:59.181463: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2021-02-17 10:36:59.188911: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2021-02-17 10:36:59.192324: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2021-02-17 10:36:59.194594: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2021-02-17 10:36:59.203122: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
INFO:tensorflow:Reading data files from /home/mila/m/mittalsa/tensor2tensor/t2t_data/translate_ende_wmt32k-train*
I0217 10:36:59.222244 139921682896704 problem.py:653] Reading data files from /home/mila/m/mittalsa/tensor2tensor/t2t_data/translate_ende_wmt32k-train*
INFO:tensorflow:partition: 0 num_data_files: 100
I0217 10:36:59.262548 139921682896704 problem.py:679] partition: 0 num_data_files: 100
WARNING:tensorflow:From /home/mila/m/mittalsa/tensor2tensor/tensor2tensor/data_generators/problem.py:689: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.AUTOTUNE)` instead. If sloppy execution is desired, use `tf.data.Options.experimental_deterministic`.
W0217 10:36:59.265164 139921682896704 deprecation.py:339] From /home/mila/m/mittalsa/tensor2tensor/tensor2tensor/data_generators/problem.py:689: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.AUTOTUNE)` instead. If sloppy execution is desired, use `tf.data.Options.experimental_deterministic`.
WARNING:tensorflow:From /home/mila/m/mittalsa/tensor2tensor/tensor2tensor/utils/data_reader.py:276: tf_record_iterator (from tensorflow.python.lib.io.tf_record) is deprecated and will be removed in a future version.
Instructions for updating:
Use eager execution and:
`tf.data.TFRecordDataset(path)`
W0217 10:36:59.347080 139921682896704 deprecation.py:339] From /home/mila/m/mittalsa/tensor2tensor/tensor2tensor/utils/data_reader.py:276: tf_record_iterator (from tensorflow.python.lib.io.tf_record) is deprecated and will be removed in a future version.
Instructions for updating:
Use eager execution and:
`tf.data.TFRecordDataset(path)`
WARNING:tensorflow:From /home/mila/m/mittalsa/tensor2tensor/tensor2tensor/utils/data_reader.py:38: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
W0217 10:36:59.483155 139921682896704 deprecation.py:339] From /home/mila/m/mittalsa/tensor2tensor/tensor2tensor/utils/data_reader.py:38: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
WARNING:tensorflow:From /home/mila/m/mittalsa/tensor2tensor/tensor2tensor/utils/data_reader.py:234: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
W0217 10:37:00.108635 139921682896704 deprecation.py:339] From /home/mila/m/mittalsa/tensor2tensor/tensor2tensor/utils/data_reader.py:234: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
INFO:tensorflow:Calling model_fn.
I0217 10:37:00.293850 139921682896704 estimator.py:1162] Calling model_fn.
INFO:tensorflow:Setting T2TModel mode to 'train'
I0217 10:37:00.320769 139921682896704 t2t_model.py:2267] Setting T2TModel mode to 'train'
INFO:tensorflow:Using variable initializer: uniform_unit_scaling
I0217 10:37:01.047983 139921682896704 api.py:479] Using variable initializer: uniform_unit_scaling
INFO:tensorflow:Transforming feature 'inputs' with symbol_modality_33510_1024.bottom
I0217 10:37:07.056139 139921682896704 api.py:479] Transforming feature 'inputs' with symbol_modality_33510_1024.bottom
INFO:tensorflow:Transforming feature 'targets' with symbol_modality_33510_1024.targets_bottom
I0217 10:37:08.512664 139921682896704 api.py:479] Transforming feature 'targets' with symbol_modality_33510_1024.targets_bottom
INFO:tensorflow:Building model body
I0217 10:37:08.603129 139921682896704 api.py:479] Building model body
WARNING:tensorflow:From /home/mila/m/mittalsa/.conda/envs/t2t/lib/python3.6/site-packages/tensorflow/python/util/dispatch.py:201: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
W0217 10:37:09.968957 139921682896704 deprecation.py:537] From /home/mila/m/mittalsa/.conda/envs/t2t/lib/python3.6/site-packages/tensorflow/python/util/dispatch.py:201: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
I0217 10:37:11.088009 139921682896704 common_layers.py:51] Running in V2 mode, using Keras layers.
I0217 10:37:11.111042 139921682896704 common_layers.py:51] Running in V2 mode, using Keras layers.
I0217 10:37:11.127041 139921682896704 common_layers.py:51] Running in V2 mode, using Keras layers.
I0217 10:37:11.190722 139921682896704 common_layers.py:51] Running in V2 mode, using Keras layers.
I0217 10:37:11.235233 139921682896704 common_layers.py:51] Running in V2 mode, using Keras layers.
I0217 10:37:11.259665 139921682896704 common_layers.py:51] Running in V2 mode, using Keras layers.
I0217 10:37:13.880186 139921682896704 common_layers.py:51] Running in V2 mode, using Keras layers.
I0217 10:37:13.896136 139921682896704 common_layers.py:51] Running in V2 mode, using Keras layers.
I0217 10:37:13.911811 139921682896704 common_layers.py:51] Running in V2 mode, using Keras layers.
I0217 10:37:13.972910 139921682896704 common_layers.py:51] Running in V2 mode, using Keras layers.
I0217 10:37:14.009448 139921682896704 common_layers.py:51] Running in V2 mode, using Keras layers.
I0217 10:37:14.025386 139921682896704 common_layers.py:51] Running in V2 mode, using Keras layers.
I0217 10:37:14.041645 139921682896704 common_layers.py:51] Running in V2 mode, using Keras layers.
I0217 10:37:14.103353 139921682896704 common_layers.py:51] Running in V2 mode, using Keras layers.
I0217 10:37:14.140556 139921682896704 common_layers.py:51] Running in V2 mode, using Keras layers.
I0217 10:37:14.165198 139921682896704 common_layers.py:51] Running in V2 mode, using Keras layers.
INFO:tensorflow:Transforming body output with symbol_modality_33510_1024.top
I0217 10:37:16.450344 139921682896704 api.py:479] Transforming body output with symbol_modality_33510_1024.top
INFO:tensorflow:Base learning rate: 2.000000
I0217 10:37:20.661118 139921682896704 learning_rate.py:29] Base learning rate: 2.000000
INFO:tensorflow:Trainable Variables Total size: 63744000
I0217 10:37:20.667574 139921682896704 optimize.py:355] Trainable Variables Total size: 63744000
INFO:tensorflow:Non-trainable variables Total size: 5
I0217 10:37:20.669589 139921682896704 optimize.py:355] Non-trainable variables Total size: 5
INFO:tensorflow:Using optimizer adam
I0217 10:37:20.669764 139921682896704 optimize.py:200] Using optimizer adam
INFO:tensorflow:Done calling model_fn.
I0217 10:37:22.981385 139921682896704 estimator.py:1164] Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
I0217 10:37:22.982550 139921682896704 basic_session_run_hooks.py:546] Create CheckpointSaverHook.
Traceback (most recent call last):
  File "/home/mila/m/mittalsa/.conda/envs/t2t/bin/t2t-trainer", line 33, in <module>
    tf.app.run(main)
  File "/home/mila/m/mittalsa/.conda/envs/t2t/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/home/mila/m/mittalsa/.conda/envs/t2t/lib/python3.6/site-packages/absl/app.py", line 300, in run
    _run_main(main, args)
  File "/home/mila/m/mittalsa/.conda/envs/t2t/lib/python3.6/site-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "/home/mila/m/mittalsa/.conda/envs/t2t/bin/t2t-trainer", line 28, in main
    t2t_trainer.main(argv)
  File "/home/mila/m/mittalsa/tensor2tensor/tensor2tensor/bin/t2t_trainer.py", line 418, in main
    execute_schedule(exp)
  File "/home/mila/m/mittalsa/tensor2tensor/tensor2tensor/bin/t2t_trainer.py", line 371, in execute_schedule
    getattr(exp, FLAGS.schedule)()
  File "/home/mila/m/mittalsa/tensor2tensor/tensor2tensor/utils/trainer_lib.py", line 468, in continuous_train_and_eval
    self._eval_spec)
  File "/home/mila/m/mittalsa/.conda/envs/t2t/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/training.py", line 505, in train_and_evaluate
    return executor.run()
  File "/home/mila/m/mittalsa/.conda/envs/t2t/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/training.py", line 646, in run
    return self.run_local()
  File "/home/mila/m/mittalsa/.conda/envs/t2t/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/training.py", line 747, in run_local
    saving_listeners=saving_listeners)
  File "/home/mila/m/mittalsa/.conda/envs/t2t/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 349, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/home/mila/m/mittalsa/.conda/envs/t2t/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1175, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "/home/mila/m/mittalsa/.conda/envs/t2t/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1208, in _train_model_default
    saving_listeners)
  File "/home/mila/m/mittalsa/.conda/envs/t2t/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1510, in _train_with_estimator_spec
    save_graph_def=self._config.checkpoint_save_graph_def) as mon_sess:
  File "/home/mila/m/mittalsa/.conda/envs/t2t/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 604, in MonitoredTrainingSession
    stop_grace_period_secs=stop_grace_period_secs)
  File "/home/mila/m/mittalsa/.conda/envs/t2t/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1038, in __init__
    stop_grace_period_secs=stop_grace_period_secs)
  File "/home/mila/m/mittalsa/.conda/envs/t2t/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 749, in __init__
    self._sess = _RecoverableSession(self._coordinated_creator)
  File "/home/mila/m/mittalsa/.conda/envs/t2t/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1231, in __init__
    _WrappedSession.__init__(self, self._create_session())
  File "/home/mila/m/mittalsa/.conda/envs/t2t/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1236, in _create_session
    return self._sess_creator.create_session()
  File "/home/mila/m/mittalsa/.conda/envs/t2t/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 902, in create_session
    self.tf_sess = self._session_creator.create_session()
  File "/home/mila/m/mittalsa/.conda/envs/t2t/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 660, in create_session
    self._scaffold.finalize()
  File "/home/mila/m/mittalsa/.conda/envs/t2t/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 232, in finalize
    summary.merge_all)
  File "/home/mila/m/mittalsa/.conda/envs/t2t/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 297, in get_or_default
    op = default_constructor()
  File "/home/mila/m/mittalsa/.conda/envs/t2t/lib/python3.6/site-packages/tensorflow/python/summary/summary.py", line 406, in merge_all
    return merge(summary_ops, name=name)  
  File "/home/mila/m/mittalsa/.conda/envs/t2t/lib/python3.6/site-packages/tensorflow/python/summary/summary.py", line 370, in merge
    with _ops.name_scope(name, 'Merge', inputs):
  File "/home/mila/m/mittalsa/.conda/envs/t2t/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 6495, in __enter__
    g_from_inputs = _get_graph_from_inputs(self._values)
  File "/home/mila/m/mittalsa/.conda/envs/t2t/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 6130, in _get_graph_from_inputs
    _assert_same_graph(original_graph_element, graph_element)
  File "/home/mila/m/mittalsa/.conda/envs/t2t/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 6065, in _assert_same_graph
    (item, original_item, graph, original_graph))
ValueError: Tensor("rec_layer_0/self_attention/multihead_attention/dot_product_attention/attention:0", shape=(), dtype=string, device=/device:GPU:0) must be from the same graph as Tensor("universal_transformer_hparams:0", shape=(), dtype=string) (graphs are FuncGraph(name=universal_transformer_parallel_0_5_universal_transformer_universal_transformer_body_encoder_universal_transformer_basic_foldl_while_body_988_rewritten, id=139918132471568) and <tensorflow.python.framework.ops.Graph object at 0x7f41a3d894e0>).
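
Reading the ValueError above: it names two tensors and the two graphs that own them, a FuncGraph for a `foldl` while-loop body and the outer Graph. A minimal sketch, using only the Python standard library and a hypothetical helper `parse_same_graph_error` (not part of TensorFlow or tensor2tensor), that splits such a message into those parts:

```python
import re

# Hypothetical helper pattern: matches the "must be from the same graph"
# message format seen in the traceback above.
_PATTERN = re.compile(
    r'Tensor\("(?P<first>[^"]+)".*?\) must be from the same graph as '
    r'Tensor\("(?P<second>[^"]+)".*?\) '
    r'\(graphs are (?P<graph_a>.+) and (?P<graph_b>.+)\)\.?$'
)


def parse_same_graph_error(message):
    """Return tensor names and graph descriptions, or None if no match."""
    match = _PATTERN.search(message.strip())
    return match.groupdict() if match else None


# Message copied from the traceback above.
msg = (
    'Tensor("rec_layer_0/self_attention/multihead_attention/'
    'dot_product_attention/attention:0", shape=(), dtype=string, '
    'device=/device:GPU:0) must be from the same graph as '
    'Tensor("universal_transformer_hparams:0", shape=(), dtype=string) '
    '(graphs are FuncGraph(name=universal_transformer_parallel_0_5_'
    'universal_transformer_universal_transformer_body_encoder_'
    'universal_transformer_basic_foldl_while_body_988_rewritten, '
    'id=139918132471568) and '
    '<tensorflow.python.framework.ops.Graph object at 0x7f41a3d894e0>).'
)

info = parse_same_graph_error(msg)
print("first tensor: ", info["first"])
print("  owned by:   ", info["graph_a"])
print("second tensor:", info["second"])
print("  owned by:   ", info["graph_b"])
```

The first tensor belongs to the loop-body FuncGraph, while `universal_transformer_hparams:0` belongs to the outer graph. Both have dtype string and the failing frame is `summary.merge_all`, so they appear to be summary ops collected across the two graphs.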

sarthmit commented 3 years ago

Solved in #1849