HorizonRobotics / alf

Agent Learning Framework https://alf.readthedocs.io
Apache License 2.0

async train sometimes fails #316

Open witwolf opened 4 years ago

witwolf commented 4 years ago

When testing with ppo_async_icm_super_mario_intrinsic_only:

rm -rf tmp && python3 -m alf.bin.train \
 --root_dir=tmp \
 --gin_file=ppo_async_icm_super_mario_intrinsic_only.gin \
 --gin_param=TrainerConfig.random_seed=0 \
 --gin_param=create_environment.num_parallel_environments=1 \
 --gin_param=TrainerConfig.num_iterations=2 \
 --gin_param=TrainerConfig.num_steps_per_iter=1 \
 --gin_param=TrainerConfig.num_updates_per_train_step=1 \
 --gin_param=TrainerConfig.mini_batch_length=2 \
 --gin_param=TrainerConfig.mini_batch_size=4 \
 --gin_param=TrainerConfig.num_envs=2 \
 --gin_param=ReplayBuffer.max_length=64 \
 --gin_param=TrainerConfig.unroll_length=2 \
 --gin_param=TrainerConfig.num_updates_per_train_step=2 \
 --gin_param=TrainerConfig.use_tf_functions=False

I get this error message:

  ...
  File "/home/hongyingxiang/FLA/alf/drivers/threads.py", line 410, in _step
    self._env.step(action), action, first_env_id=self._first_env_id)
  File "/usr/local/lib/python3.6/dist-packages/tf_agents/environments/tf_environment.py", line 232, in step
    return self._step(action)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/autograph/impl/api.py", line 292, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tf_agents/environments/tf_py_environment.py", line 319, in _step
    name='step_py_func')
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/script_ops.py", line 591, in numpy_function
    return py_func_common(func, inp, Tout, stateful=True, name=name)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/script_ops.py", line 488, in py_func_common
    result = func(*[x.numpy() for x in inp])
  File "/usr/local/lib/python3.6/dist-packages/tf_agents/environments/tf_py_environment.py", line 302, in _isolated_step_py
    return self._execute(_step_py, *flattened_actions)
  File "/usr/local/lib/python3.6/dist-packages/tf_agents/environments/tf_py_environment.py", line 195, in _execute
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tf_agents/environments/tf_py_environment.py", line 298, in _step_py
    self._time_step = self._env.step(packed)
  File "/usr/local/lib/python3.6/dist-packages/tf_agents/environments/py_environment.py", line 174, in step
    self._current_time_step = self._step(action)
  File "/usr/local/lib/python3.6/dist-packages/tf_agents/environments/parallel_py_environment.py", line 136, in _step
    time_steps = [promise() for promise in time_steps]
  File "/usr/local/lib/python3.6/dist-packages/tf_agents/environments/parallel_py_environment.py", line 136, in <listcomp>
    time_steps = [promise() for promise in time_steps]
  File "/usr/local/lib/python3.6/dist-packages/tf_agents/environments/parallel_py_environment.py", line 338, in _receive
    raise Exception(stacktrace)
Exception: Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tf_agents/environments/parallel_py_environment.py", line 377, in _worker
    result = getattr(env, name)(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tf_agents/environments/py_environment.py", line 174, in step
    self._current_time_step = self._step(action)
  File "/usr/local/lib/python3.6/dist-packages/tf_agents/environments/wrappers.py", line 105, in _step
    time_step = self._env.step(action)
  File "/usr/local/lib/python3.6/dist-packages/tf_agents/environments/py_environment.py", line 174, in step
    self._current_time_step = self._step(action)
  File "/usr/local/lib/python3.6/dist-packages/tf_agents/environments/gym_wrapper.py", line 197, in _step
    observation, reward, self._done, self._info = self._gym_env.step(action)
  File "/usr/local/lib/python3.6/dist-packages/gym/core.py", line 282, in step
    return self.env.step(self.action(action))
  File "/home/hongyingxiang/FLA/alf/environments/mario_wrappers.py", line 121, in action
    for i in self._actions[a]:
IndexError: list index out of range

And with this diff:

diff --git a/alf/algorithms/actor_critic_algorithm.py b/alf/algorithms/actor_critic_algorithm.py
index 20216fa..836e9dc 100644
--- a/alf/algorithms/actor_critic_algorithm.py
+++ b/alf/algorithms/actor_critic_algorithm.py
@@ -110,6 +110,8 @@ class ActorCriticAlgorithm(OnPolicyAlgorithm):
             step_type=time_step.step_type,
             network_state=state.actor)

+        import threading
+        print(action_distribution.logits[0][:4], threading.current_thread().ident)
         action = common.sample_action_distribution(action_distribution)
         return PolicyStep(
             action=action,

I get:

  File "/usr/local/lib/python3.6/dist-packages/tf_agents/environments/py_environment.py", line 174, in step
    self._current_time_step = self._step(action)
  File "/usr/local/lib/python3.6/dist-packages/tf_agents/environments/gym_wrapper.py", line 197, in _step
    observation, reward, self._done, self._info = self._gym_env.step(action)
  File "/usr/local/lib/python3.6/dist-packages/gym/core.py", line 282, in step
    return self.env.step(self.action(action))
  File "/home/hongyingxiang/FLA/alf/environments/mario_wrappers.py", line 121, in action
    for i in self._actions[a]:
IndexError: list index out of range

tf.Tensor([nan nan nan nan], shape=(4,), dtype=float32) 140210483619648
tf.Tensor([nan nan nan nan], shape=(4,), dtype=float32) 140210483619648
tf.Tensor([nan nan nan nan], shape=(4,), dtype=float32) 140210483619648

The logits passed to distributions.Categorical can be NaN (it's very easy to reproduce this issue).
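
For reference, one way to fail fast at the source instead of crashing later in mario_wrappers (just a sketch, not part of the diff above; the assert_finite_logits helper is a hypothetical name) is to assert on the logits right after the actor network runs:

import tensorflow as tf

def assert_finite_logits(action_distribution):
    # Raises InvalidArgumentError the moment any logit is NaN or Inf,
    # instead of letting the bad sampled action reach the Mario wrapper.
    tf.debugging.check_numerics(
        action_distribution.logits, message='actor logits')
    return action_distribution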

Can you help take a look at this issue? @emailweixu @hnyu

witwolf commented 4 years ago

With this diff:

diff --git a/alf/algorithms/actor_critic_algorithm.py b/alf/algorithms/actor_critic_algorithm.py
index 20216fa..836e9dc 100644
--- a/alf/algorithms/actor_critic_algorithm.py
+++ b/alf/algorithms/actor_critic_algorithm.py
@@ -110,6 +110,8 @@ class ActorCriticAlgorithm(OnPolicyAlgorithm):
             step_type=time_step.step_type,
             network_state=state.actor)

+        import threading
+        print(action_distribution.logits[0][:4], threading.current_thread().ident)
         action = common.sample_action_distribution(action_distribution)
         return PolicyStep(
             action=action,
diff --git a/alf/algorithms/algorithm.py b/alf/algorithms/algorithm.py
index 0a09c34..3d38841 100644
--- a/alf/algorithms/algorithm.py
+++ b/alf/algorithms/algorithm.py
@@ -444,7 +444,7 @@ class Algorithm(tf.Module):
                     grads_and_vars = eager_utils.clip_gradient_norms(
                         grads_and_vars, self._gradient_clipping)

-            optimizer.apply_gradients(grads_and_vars)
+            #optimizer.apply_gradients(grads_and_vars)

         self.after_train(training_info)

it can run successfully, but the network sometimes outputs all-zero values (the 4th line below):

tf.Tensor([ 4.0710214e-07 -4.2188938e-07  1.1379381e-06  1.0288916e-07], shape=(4,), dtype=float32) 140648371398400
tf.Tensor([ 3.7338214e-07 -4.2484137e-07  1.2331875e-06  5.9219481e-08], shape=(4,), dtype=float32) 140648379791104
tf.Tensor([ 3.5209925e-07 -4.5102286e-07  1.2447317e-06  4.7751627e-08], shape=(4,), dtype=float32) 140648371398400
tf.Tensor([0. 0. 0. 0.], shape=(4,), dtype=float32) 140651241396032
tf.Tensor([ 4.0500336e-07 -4.9662350e-07  1.1527733e-06  1.9858007e-07], shape=(4,), dtype=float32) 140651241396032

So this problem seems related to the interaction between the async rollout and the optimizer update.
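
One way to test this hypothesis (a rough sketch only; the lock and the two wrapper functions below are hypothetical, not ALF code) would be to serialize the rollout forward pass against the optimizer update and check whether the zero/NaN outputs disappear:

import threading

_update_lock = threading.Lock()  # hypothetical diagnostic guard

def guarded_predict(net, observation, step_type, state):
    # Rollout threads take the lock while running the policy network.
    with _update_lock:
        return net(observation, step_type, state)

def guarded_apply_gradients(optimizer, grads_and_vars):
    # The training thread takes the same lock while applying gradients,
    # so a forward pass never observes half-updated variables.
    with _update_lock:
        optimizer.apply_gradients(grads_and_vars)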

witwolf commented 4 years ago

I also added a test that mimics the parallel env rollout:

import tensorflow as tf
from tf_agents.networks import actor_distribution_network
from tf_agents.specs import tensor_spec
from tf_agents.trajectories import time_step as ts
import threading

observation_spec = tensor_spec.BoundedTensorSpec((84, 84, 3), tf.float32, 0, 1)
time_step_spec = ts.time_step_spec(observation_spec)
time_step = tensor_spec.sample_spec_nest(time_step_spec, outer_dims=(1,))
action_spec = tensor_spec.BoundedTensorSpec((), tf.int32, 0, 13)

net = actor_distribution_network.ActorDistributionNetwork(
    observation_spec,
    action_spec,
    conv_layer_params=[(32, 8, 4), (64, 4, 2), (64, 3, 1)],
    fc_layer_params=[256,] * 100,  # big network with many layers
    activation_fn=tf.nn.elu)

def test():
    for i in range(100):
        dist, _ = net(time_step.observation, time_step.step_type, ())
        print(dist.sample())
        print(dist.logits)

if __name__ == '__main__':
    # Two threads call the shared, not-yet-built network concurrently.
    ths = [threading.Thread(target=test) for _ in range(2)]
    for t in ths:
        t.start()
    for t in ths:
        # Wait for both threads so the process doesn't exit mid-test.
        t.join()

Sometimes it raises an exception:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/witwolf/Develop/FLA/alf/shell/test.py", line 22, in test
    dist, _ = net(time_step.observation, time_step.step_type, ())
  File "/Users/witwolf/Develop/tf-env-2.0/lib/python3.6/site-packages/tf_agents/networks/network.py", line 171, in __call__
    return super(Network, self).__call__(inputs, *args, **kwargs)
  File "/Users/witwolf/Develop/tf-env-2.0/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/base_layer.py", line 822, in __call__
    outputs = self.call(cast_inputs, *args, **kwargs)
  File "/Users/witwolf/Develop/tf-env-2.0/lib/python3.6/site-packages/tf_agents/networks/actor_distribution_network.py", line 168, in call
    training=training)
  File "/Users/witwolf/Develop/tf-env-2.0/lib/python3.6/site-packages/tf_agents/networks/network.py", line 171, in __call__
    return super(Network, self).__call__(inputs, *args, **kwargs)
  File "/Users/witwolf/Develop/tf-env-2.0/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/base_layer.py", line 822, in __call__
    outputs = self.call(cast_inputs, *args, **kwargs)
  File "/Users/witwolf/Develop/tf-env-2.0/lib/python3.6/site-packages/tf_agents/networks/encoding_network.py", line 327, in call
    states = layer(states, training=training)
  File "/Users/witwolf/Develop/tf-env-2.0/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/base_layer.py", line 822, in __call__
    outputs = self.call(cast_inputs, *args, **kwargs)
  File "/Users/witwolf/Develop/tf-env-2.0/lib/python3.6/site-packages/tensorflow_core/python/keras/layers/core.py", line 1142, in call
    outputs = gen_math_ops.mat_mul(inputs, self.kernel)
AttributeError: 'Dense' object has no attribute 'kernel'

I guess the network build is not thread-safe; we should build the network first before the parallel rollout.
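
For example (a sketch of the workaround applied to the test script above), doing one dummy forward pass in the main thread forces every layer to create its variables before the worker threads touch the shared network:

# Build the network once in the main thread so every Dense/Conv layer
# already has its kernel before the rollout threads call it.
net(time_step.observation, time_step.step_type, ())

ths = [threading.Thread(target=test) for _ in range(2)]
for t in ths:
    t.start()
for t in ths:
    t.join()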

hnyu commented 4 years ago

> I also added a test that mimics the parallel env rollout ... sometimes it raises AttributeError: 'Dense' object has no attribute 'kernel' ... I guess the network build is not thread-safe; we should build the network first before the parallel rollout.

This is interesting. One possible reason for this new issue (I believe I didn't have it before) is that all specs are now lazily prepared when they are first used during training, by which point the threads have already been started. Before, we always prepared the specs in the __init__ function before actually launching the threads, and that preparation requires building and forwarding the networks. @emailweixu What do you think?
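
To illustrate the old pattern I mean (a hypothetical class and names, not the actual ALF code): forward the network once on spec-shaped dummy inputs inside __init__, so everything is built in the main thread before the rollout threads launch:

from tf_agents.specs import tensor_spec
from tf_agents.trajectories import time_step as ts

class EagerSpecAlgorithm:
    """Hypothetical sketch of eager spec preparation, not ALF's Algorithm."""

    def __init__(self, net, observation_spec):
        time_step_spec = ts.time_step_spec(observation_spec)
        dummy = tensor_spec.sample_spec_nest(time_step_spec, outer_dims=(1,))
        # One forward pass here builds all layer variables and lets us
        # derive the action distribution spec in the main thread.
        self._sample_dist, _ = net(dummy.observation, dummy.step_type, ())
        self._net = net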

hnyu commented 4 years ago

> When testing with ppo_async_icm_super_mario_intrinsic_only ... IndexError: list index out of range ... The logits passed to distributions.Categorical can be NaN (it's very easy to reproduce this issue).

Let's first fix the second one (the thread-safety issue) and then come back to this one. It might be related to the threading issue.

emailweixu commented 4 years ago

> I also added a test that mimics the parallel env rollout ... sometimes it raises AttributeError: 'Dense' object has no attribute 'kernel' ... I guess the network build is not thread-safe; we should build the network first before the parallel rollout.

Now Network has a create_variables member. Perhaps calling it at the beginning can solve the problem?
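
In witwolf's test script that would mean something like the following before the threads are started (a sketch; I assume create_variables can be called without arguments here, since the network was constructed with its input spec):

# Create all variables eagerly in the main thread; the rollout threads
# then only read already-built layers.
net.create_variables()

ths = [threading.Thread(target=test) for _ in range(2)]
for t in ths:
    t.start()
for t in ths:
    t.join()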

emailweixu commented 4 years ago

> One possible reason for this new issue ... is that all specs are now lazily prepared when they are first used during training, by which point the threads have already been started. Before, we always prepared the specs in the __init__ function before actually launching the threads ...

But it seems to me that even for async training, all the specs are prepared in the main thread. Do you see any specs prepared inside a thread?

hnyu commented 4 years ago

> But it seems to me that even for async training, all the specs are prepared in the main thread. Do you see any specs prepared inside a thread?

Yeah, it seems that the current async code won't have this threading issue. In @witwolf's small example there are two parallel actor threads, which might compete to build the network.