Open windowshopr opened 3 years ago
make_predict_function() should work instead.
That worked, but more issues now.
I'm trying to run it in Colab. Changed the import of TF to this at the top:
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
...so that it would run version 1.X in accordance with the notes for this repo. Running it as is gives this error:
TypeError Traceback (most recent call last)
<ipython-input-8-00b007ee4380> in <module>()
226 env.close()
227
--> 228 global_agent = A3CAgent(state_size, action_size, env_name)
229 global_agent.train()
1 frames
<ipython-input-8-00b007ee4380> in __init__(self, state_size, action_size, env_name)
41
42 # method for training actor and critic network
---> 43 self.optimizer = [self.actor_optimizer(), self.critic_optimizer()]
44
45 self.sess = tf.InteractiveSession()
<ipython-input-8-00b007ee4380> in actor_optimizer(self)
93
94 optimizer = Adam(lr=self.actor_lr)
---> 95 updates = optimizer.get_updates(self.actor.trainable_weights, [], actor_loss)
96 train = K.function([self.actor.input, action, advantages], [], updates=updates)
TypeError: get_updates() takes 3 positional arguments but 4 were given
...so tried to change this line:
updates = optimizer.get_updates(self.actor.trainable_weights, [], actor_loss)
to
updates = optimizer.get_updates(params=self.actor.trainable_weights, loss=actor_loss)
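The first TypeError is consistent with `get_updates` accepting only `loss` and `params` (plus `self`), so the old three-argument positional call no longer fits. The sketch below reproduces that mismatch in plain Python. `OldStyleOptimizer` and `NewStyleOptimizer` are hypothetical stand-ins written for illustration, not Keras classes, and the weights and loss are placeholder values:

```python
# Hypothetical stand-ins that only mimic the two get_updates signatures.
# They are NOT the real Keras optimizers.

class OldStyleOptimizer:
    def get_updates(self, params, constraints, loss):
        return ("updates", params, loss)


class NewStyleOptimizer:
    def get_updates(self, loss, params):
        return ("updates", params, loss)


weights = ["w1", "w2"]   # placeholder for self.actor.trainable_weights
actor_loss = 0.5         # placeholder for the loss tensor

# The old positional call works only against the old signature.
old_result = OldStyleOptimizer().get_updates(weights, [], actor_loss)

# The same call against the new signature raises the TypeError from the issue.
try:
    NewStyleOptimizer().get_updates(weights, [], actor_loss)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Keyword arguments do not depend on argument order.
new_result = NewStyleOptimizer().get_updates(params=weights, loss=actor_loss)
```

Calling with keywords, as in the replacement line above, keeps working even if the positional order differs between versions.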
...then we get another error at the very next line, which is:
IndexError Traceback (most recent call last)
<ipython-input-9-d4c8c185a397> in <module>()
226 env.close()
227
--> 228 global_agent = A3CAgent(state_size, action_size, env_name)
229 global_agent.train()
3 frames
<ipython-input-9-d4c8c185a397> in __init__(self, state_size, action_size, env_name)
41
42 # method for training actor and critic network
---> 43 self.optimizer = [self.actor_optimizer(), self.critic_optimizer()]
44
45 self.sess = tf.InteractiveSession()
<ipython-input-9-d4c8c185a397> in actor_optimizer(self)
95 # updates = optimizer.get_updates(self.actor.trainable_weights, [], actor_loss)
96 updates = optimizer.get_updates(params=self.actor.trainable_weights, loss=actor_loss)
---> 97 train = K.function([self.actor.input, action, advantages], [], updates=updates)
98 return train
99
/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/backend.py in function(inputs, outputs, updates, name, **kwargs)
4085 raise ValueError(msg)
4086 return GraphExecutionFunction(
-> 4087 inputs, outputs, updates=updates, name=name, **kwargs)
4088
4089
/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/backend.py in __init__(self, inputs, outputs, updates, name, **session_kwargs)
3808 # dependencies in call.
3809 # Index 0 = total loss or model output for `predict`.
-> 3810 with ops.control_dependencies([self.outputs[0]]):
3811 updates_ops = []
3812 for update in updates:
IndexError: list index out of range
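The traceback ends at `self.outputs[0]`, and the call passed `[]` as the outputs argument, so this looks like an ordinary empty-list lookup. A minimal sketch of that mechanism, using a hypothetical `first_output` helper in place of the real backend code:

```python
def first_output(outputs):
    """Mimic the backend line `self.outputs[0]` (illustrative helper only)."""
    return outputs[0]


# An empty outputs list fails the same way the traceback does.
try:
    first_output([])
    message = None
except IndexError as exc:
    message = str(exc)

# Any non-empty list gets past the lookup.
value = first_output(["policy_tensor"])  # placeholder for a real output tensor
```

Under that reading, giving `K.function` at least one output avoids this particular error.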
So I'm going to throw in the towel for now. Would be nice to see this repo updated! Any ideas? Thanks!
@windowshopr there is not much to this. You just need to replace
train = K.function([self.actor.input, action, advantages], [], updates=updates)
with
train = K.function([self.actor.input, action, advantages], self.model.output, updates=updates)
Nope, there's no "self.model", only "self.actor" and "self.critic".
So continuing on, I've attempted the following changes:
# make loss function for Policy Gradient
# [log(action probability) * advantages] will be input for the back prop
# we add entropy of action probability to loss
def actor_optimizer(self):
    action = K.placeholder(shape=(None, self.action_size))
    advantages = K.placeholder(shape=(None, ))

    policy = self.actor.output

    good_prob = K.sum(action * policy, axis=1)
    eligibility = K.log(good_prob + 1e-10) * K.stop_gradient(advantages)
    loss = -K.sum(eligibility)

    entropy = K.sum(policy * K.log(policy + 1e-10), axis=1)

    actor_loss = loss + 0.01*entropy

    optimizer = Adam(lr=self.actor_lr)
    # updates = optimizer.get_updates(self.actor.trainable_weights, [], actor_loss)
    updates = optimizer.get_updates(params=self.actor.trainable_weights, loss=actor_loss)
    # train = K.function([self.actor.input, action, advantages], [], updates=updates)
    train = K.function([self.actor.input, action, advantages], self.actor.output, updates=updates)
    return train

# make loss function for Value approximation
def critic_optimizer(self):
    discounted_reward = K.placeholder(shape=(None, ))

    value = self.critic.output

    loss = K.mean(K.square(discounted_reward - value))

    optimizer = Adam(lr=self.critic_lr)
    # updates = optimizer.get_updates(self.critic.trainable_weights, [], loss)
    updates = optimizer.get_updates(params=self.critic.trainable_weights, loss=loss)
    # train = K.function([self.critic.input, discounted_reward], [], updates=updates)
    train = K.function([self.critic.input, discounted_reward], self.critic.output, updates=updates)
    return train
...which now gives the error:
AttributeError: module 'keras.backend' has no attribute 'set_session'
Assuming this is a Keras version issue, I replaced the Keras imports at the top with:
from tensorflow.compat.v1.keras.layers import Dense, Input
from tensorflow.compat.v1.keras.models import Model
from tensorflow.compat.v1.keras.optimizers import Adam
from tensorflow.compat.v1.keras import backend as K
...which now gives these errors in the output:
Exception in thread Thread-42:
Traceback (most recent call last):
File "/usr/lib/python3.7/threading.py", line 926, in _bootstrap_inner
self.run()
File "<ipython-input-27-2bcae64a7583>", line 173, in run
action = self.get_action(state)
File "<ipython-input-27-2bcae64a7583>", line 223, in get_action
policy = self.actor.predict(np.reshape(state, [1, self.state_size]))[0]
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/training_v1.py", line 991, in predict
use_multiprocessing=use_multiprocessing)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/training_arrays_v1.py", line 712, in predict
callbacks=callbacks)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/training_arrays_v1.py", line 384, in model_iteration
batch_outs = f(ins_batch)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/def_function.py", line 828, in __call__
result = self._call(*args, **kwds)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/def_function.py", line 862, in _call
results = self._stateful_fn(*args, **kwds)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/function.py", line 2941, in __call__
filtered_flat_args) = self._maybe_define_function(args, kwargs)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/function.py", line 3361, in _maybe_define_function
graph_function = self._create_graph_function(args, kwargs)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/function.py", line 3206, in _create_graph_function
capture_by_value=self._capture_by_value),
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/func_graph.py", line 990, in func_graph_from_py_func
func_outputs = python_func(*func_args, **func_kwargs)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/def_function.py", line 634, in wrapped_fn
out = weak_wrapped_fn().__wrapped__(*args, **kwds)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/func_graph.py", line 977, in wrapper
raise e.ag_error_metadata.to_exception(e)
TypeError: in user code:
/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/training.py:1478 predict_function *
return step_function(self, iterator)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/training.py:1467 step_function **
data = next(iterator)
TypeError: 'list' object is not an iterator
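The last line of the thread traceback is plain Python behaviour: `next()` needs an iterator, and a list is only an iterable. The traceback shows `step_function` calling `next(iterator)`, so it was apparently handed a list. The sketch below shows just that mechanism with a made-up `batch` value. It does not reproduce the Keras code path or say why a list reached it:

```python
# A list of one observation, standing in for the batch passed to predict.
batch = [[0.1, 0.2, 0.3, 0.4]]

# Calling next() directly on a list raises the TypeError seen above.
try:
    next(batch)
    message = None
except TypeError as exc:
    message = str(exc)

# Wrapping the list with iter() yields a real iterator.
first = next(iter(batch))
```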
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
<ipython-input-27-2bcae64a7583> in <module>()
235
236 global_agent = A3CAgent(state_size, action_size, env_name)
--> 237 global_agent.train()
7 frames
/usr/local/lib/python3.7/dist-packages/matplotlib/cbook/__init__.py in to_filehandle(fname, flag, return_opened, encoding)
401 fh = bz2.BZ2File(fname, flag)
402 else:
--> 403 fh = open(fname, flag, encoding=encoding)
404 opened = True
405 elif hasattr(fname, 'seek'):
FileNotFoundError: [Errno 2] No such file or directory: './save_graph/cartpole_a3c.png'
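The FileNotFoundError is separate from the TensorFlow problems: `open()` does not create missing parent directories, and a fresh Colab session has no `./save_graph` folder. A minimal sketch of one way around it, assuming the directory can simply be created before saving. `ensure_parent_dir` is a hypothetical helper, and a temporary directory stands in for the notebook's working directory:

```python
import os
import tempfile


def ensure_parent_dir(path):
    """Create the parent directory of `path` if it is missing, then return `path`."""
    parent = os.path.dirname(path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    return path


# Temporary directory used here in place of the notebook's working directory.
base = tempfile.mkdtemp()
target = ensure_parent_dir(os.path.join(base, "save_graph", "cartpole_a3c.png"))

# Opening the file now succeeds because the folder exists.
with open(target, "wb") as fh:
    fh.write(b"")

saved = os.path.exists(target)
```

In the notebook, calling `os.makedirs("./save_graph", exist_ok=True)` once before training starts should have the same effect.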
As you can see, debugging this takes more effort than I'm willing to put in. I would urge you, or anyone more knowledgeable than me, to throw this into a Colab notebook, get it working, and document the necessary changes here. That would keep the code up to date for me and for anyone else who stumbles across this repo and would like to see it working. Thanks!
References a function that doesn't exist.