tensorflow / tensor2tensor

Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
Apache License 2.0

AssertionError: The rollout of ppo_hparams.epoch_length will be distributed amongsteffective_num_agents of agents #1891

Open zhuliwen opened 3 years ago

zhuliwen commented 3 years ago

Description

When I set the number of agents (real_ppo_effective_num_agents) to 256, training fails with the AssertionError shown at the end of the logs below. Smaller values work. How can I solve this, and what is the maximum number of agents?

My run command:

python -m tensor2tensor.rl.trainer_model_based \
  --loop_hparams_set=rlmb_base_stochastic_discrete \
  --loop_hparams=game=pong \
  --loop_hparams=real_ppo_effective_num_agents=256 \
  --output_dir ~/t2t_train/mb_sd_pong_256agents

Error logs:

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensor2tensor/data_generators/gym_env.py:541: tf_record_iterator (from tensorflow.python.lib.io.tf_record) is deprecated and will be removed in a future version.
Instructions for updating:
Use eager execution and: 
`tf.data.TFRecordDataset(path)`
INFO:tensorflow:Initial training of the policy in real environment.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
INFO:tensorflow:Applying wrapper <class 'tensor2tensor.rl.envs.tf_atari_wrappers.StackWrapper'>({'history': 4}) to env PyFuncEnv(T2TGymEnv(<TimeLimitMaxAndSkip<<AtariEnv<PongNoFrameskip-v4>>>>)).
INFO:tensorflow:Applying wrapper <class 'tensor2tensor.rl.ppo_learner._MemoryWrapper'>({}) to env StackWrapper(PyFuncEnv(T2TGymEnv(<TimeLimitMaxAndSkip<<AtariEnv<PongNoFrameskip-v4>>>>))).
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensor2tensor/rl/envs/py_func_batch_env.py:122: py_func (from tensorflow.python.ops.script_ops) is deprecated and will be removed in a future version.
Instructions for updating:
tf.py_func is deprecated in TF V2. Instead, use
    tf.py_function, which takes a python function which manipulates tf eager
    tensors instead of numpy arrays. It's easy to convert a tf eager tensor to
    an ndarray (just call tensor.numpy()) but having access to eager tensors
    means `tf.py_function`s can use accelerators such as GPUs as well as
    being differentiable using a gradient tape.

INFO:tensorflow:Using DummyPolicyProblem for the policy.
INFO:tensorflow:Setting T2TModel mode to 'train'
INFO:tensorflow:Using variable initializer: orthogonal
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensor2tensor/utils/t2t_model.py:1358: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
INFO:tensorflow:Transforming feature 'input_action' with symbol_modality_6_64.bottom
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/function.py:1007: calling Graph.create_op (from tensorflow.python.framework.ops) with compute_shapes is deprecated and will be removed in a future version.
Instructions for updating:
Shapes are always computed; don't use the compute_shapes as it has no effect.
INFO:tensorflow:Transforming feature 'input_reward' with symbol_modality_3_64.bottom
INFO:tensorflow:Transforming feature 'inputs' with video_modality.bottom
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensor2tensor/layers/common_layers.py:277: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
INFO:tensorflow:Transforming feature 'target_action' with symbol_modality_6_64.targets_bottom
INFO:tensorflow:Transforming feature 'target_policy' with identity_modality.targets_bottom
INFO:tensorflow:Transforming feature 'target_reward' with symbol_modality_3_64.targets_bottom
INFO:tensorflow:Transforming feature 'target_value' with identity_modality.targets_bottom
INFO:tensorflow:Transforming feature 'targets' with video_modality.targets_bottom
INFO:tensorflow:Building model body
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensor2tensor/models/research/rl.py:598: conv2d (from tensorflow.python.layers.convolutional) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.conv2d instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensor2tensor/models/research/rl.py:602: flatten (from tensorflow.python.layers.core) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.flatten instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensor2tensor/models/research/rl.py:603: dropout (from tensorflow.python.layers.core) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.dropout instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensor2tensor/models/research/rl.py:604: dense (from tensorflow.python.layers.core) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.dense instead.
INFO:tensorflow:Transforming body output with identity_modality.top
INFO:tensorflow:Transforming body output with identity_modality.top
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensor2tensor/layers/common_layers.py:2887: multinomial (from tensorflow.python.ops.random_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.random.categorical instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensor2tensor/rl/ppo_learner.py:479: Print (from tensorflow.python.ops.logging_ops) is deprecated and will be removed after 2018-08-20.
Instructions for updating:
Use tf.print instead of tf.Print. Note that tf.print returns a no-output operator that directly prints the output. Outside of defuns or eager mode, this operator will not be executed unless it is directly specified in session.run or used as a control dependency for other operators. This is only a concern in graph mode. Below is an example of how to ensure tf.print executes in graph mode:
```python
    sess = tf.Session()
    with sess.as_default():
        tensor = tf.range(10)
        print_op = tf.print(tensor)
        with tf.control_dependencies([print_op]):
          out = tf.add(tensor, tensor)
        sess.run(out)
```

Additionally, to use tf.print in python 2.7, users must make sure to import the following:

from __future__ import print_function

Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.6/dist-packages/tensor2tensor/rl/trainer_model_based.py", line 387, in <module>
    tf.app.run()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "/usr/local/lib/python3.6/dist-packages/tensor2tensor/rl/trainer_model_based.py", line 382, in main
    training_loop(hp, FLAGS.output_dir)
  File "/usr/local/lib/python3.6/dist-packages/tensor2tensor/rl/trainer_model_based.py", line 291, in training_loop
    train_agent_real_env(env, learner, hparams, epoch)
  File "/usr/local/lib/python3.6/dist-packages/tensor2tensor/rl/trainer_model_based.py", line 192, in train_agent_real_env
    num_env_steps=num_env_steps,
  File "/usr/local/lib/python3.6/dist-packages/tensor2tensor/rl/ppo_learner.py", line 85, in train
    force_beginning_resets=simulated))
  File "/usr/local/lib/python3.6/dist-packages/tensor2tensor/rl/ppo_learner.py", line 166, in _define_train
    **collect_kwargs))
  File "/usr/local/lib/python3.6/dist-packages/tensor2tensor/rl/ppo_learner.py", line 489, in _define_collect
    "The rollout of ppo_hparams.epoch_length will be distributed amongst"
AssertionError: The rollout of ppo_hparams.epoch_length will be distributed amongsteffective_num_agents of agents
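
A note on reading the assertion: the traceback shows the message starting with a string literal that ends in "amongst", and Python joins adjacent string literals without a space, which would explain "amongsteffective_num_agents". The wording suggests the rollout length has to split evenly across the agents. The sketch below assumes (this is not confirmed by the log above) that the check is `epoch_length % effective_num_agents == 0`. The helper `valid_agent_counts` and the value 3200 are hypothetical and only for illustration, so read the actual epoch length from your own hparams set.

```python
# Hypothetical helper, NOT part of tensor2tensor. It assumes the failing
# assertion requires epoch_length to be divisible by effective_num_agents.
def valid_agent_counts(epoch_length, max_agents):
    """Return the agent counts up to max_agents that divide epoch_length evenly."""
    return [n for n in range(1, max_agents + 1) if epoch_length % n == 0]


# Adjacent string literals are concatenated with no separator, which
# reproduces the missing space seen in the error message.
message = (
    "The rollout of ppo_hparams.epoch_length will be distributed amongst"
    "effective_num_agents of agents"
)

# Illustrative epoch length only (an assumption, not taken from the log).
example_epoch_length = 3200
candidates = valid_agent_counts(example_epoch_length, 256)

print(message)
print("largest valid counts:", candidates[-3:])
print("256 is valid:", 256 in candidates)
```

If that assumption holds, 256 would be rejected for an epoch length of 3200 while 128 would pass, which is consistent with smaller agent counts working.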