5_A3C Cartpole Script - AttributeError: 'Functional' object has no attribute '_make_predict_function'

windowshopr commented 3 years ago

The script calls `_make_predict_function()`, which doesn't exist on the model object.

ShakthiYasas commented 3 years ago

make_predict_function() should work instead.
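A minimal sketch of a version-tolerant way to make that call, in case it helps anyone landing here. The helper name `build_predict_function` is made up for illustration and is not part of the repo or of Keras. It assumes both method names can be called with no arguments, and it tries the older private name first on the assumption that a model which still exposes it expects that code path.

```python
def build_predict_function(model):
    """Call whichever predict-function builder this Keras model exposes.

    Hypothetical helper: tries the older private name first, then the
    public name suggested above. Returns the name that was called.
    """
    for name in ("_make_predict_function", "make_predict_function"):
        builder = getattr(model, name, None)
        if callable(builder):
            builder()  # assumption: callable without arguments
            return name
    raise AttributeError("model exposes no predict-function builder")
```

In the script this would replace the direct `self.actor._make_predict_function()` style calls with `build_predict_function(self.actor)`, and likewise for the critic.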

windowshopr commented 3 years ago

That worked, but more issues now.

I'm trying to run it in Colab. Changed the import of TF to this at the top:

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

...so that it would run with version 1.X behavior, in accordance with the notes for this repo. Running it as is gives this error:

TypeError                                 Traceback (most recent call last)
<ipython-input-8-00b007ee4380> in <module>()
    226     env.close()
    227 
--> 228     global_agent = A3CAgent(state_size, action_size, env_name)
    229     global_agent.train()

1 frames
<ipython-input-8-00b007ee4380> in __init__(self, state_size, action_size, env_name)
     41 
     42         # method for training actor and critic network
---> 43         self.optimizer = [self.actor_optimizer(), self.critic_optimizer()]
     44 
     45         self.sess = tf.InteractiveSession()

<ipython-input-8-00b007ee4380> in actor_optimizer(self)
     93 
     94         optimizer = Adam(lr=self.actor_lr)
---> 95         updates = optimizer.get_updates(self.actor.trainable_weights, [], actor_loss)
     96         train = K.function([self.actor.input, action, advantages], [], updates=updates)

TypeError: get_updates() takes 3 positional arguments but 4 were given

...so tried to change this line:

updates = optimizer.get_updates(self.actor.trainable_weights, [], actor_loss)

to

updates = optimizer.get_updates(params=self.actor.trainable_weights, loss=actor_loss)

...then we get another error at the very next line, which is:

IndexError                                Traceback (most recent call last)
<ipython-input-9-d4c8c185a397> in <module>()
    226     env.close()
    227 
--> 228     global_agent = A3CAgent(state_size, action_size, env_name)
    229     global_agent.train()

3 frames
<ipython-input-9-d4c8c185a397> in __init__(self, state_size, action_size, env_name)
     41 
     42         # method for training actor and critic network
---> 43         self.optimizer = [self.actor_optimizer(), self.critic_optimizer()]
     44 
     45         self.sess = tf.InteractiveSession()

<ipython-input-9-d4c8c185a397> in actor_optimizer(self)
     95         # updates = optimizer.get_updates(self.actor.trainable_weights, [], actor_loss)
     96         updates = optimizer.get_updates(params=self.actor.trainable_weights, loss=actor_loss)
---> 97         train = K.function([self.actor.input, action, advantages], [], updates=updates)
     98         return train
     99 

/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/backend.py in function(inputs, outputs, updates, name, **kwargs)
   4085         raise ValueError(msg)
   4086   return GraphExecutionFunction(
-> 4087       inputs, outputs, updates=updates, name=name, **kwargs)
   4088 
   4089 

/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/backend.py in __init__(self, inputs, outputs, updates, name, **session_kwargs)
   3808     # dependencies in call.
   3809     # Index 0 = total loss or model output for `predict`.
-> 3810     with ops.control_dependencies([self.outputs[0]]):
   3811       updates_ops = []
   3812       for update in updates:

IndexError: list index out of range

So I'm going to throw in the towel for now. Would be nice to see this repo updated! Any ideas? Thanks!
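On the `get_updates()` TypeError above: older Keras releases used `get_updates(params, constraints, loss)`, while newer Keras and tf.keras use `get_updates(loss, params)`, which is why the three-argument call fails. A rough sketch of a wrapper that checks the signature at runtime is below. The name `get_updates_compat` is invented for this example, and it only covers these two signatures.

```python
import inspect


def get_updates_compat(optimizer, params, loss):
    """Call optimizer.get_updates with whichever signature it accepts.

    Hypothetical helper covering only the two signatures discussed here.
    """
    accepted = inspect.signature(optimizer.get_updates).parameters
    if "constraints" in accepted:
        # Older Keras: get_updates(params, constraints, loss)
        return optimizer.get_updates(params, [], loss)
    # Newer Keras / tf.keras: get_updates(loss, params)
    return optimizer.get_updates(loss=loss, params=params)
```

With this, the line in `actor_optimizer` would become `updates = get_updates_compat(optimizer, self.actor.trainable_weights, actor_loss)`. It does nothing for the `K.function` IndexError, which is a separate problem caused by the empty outputs list.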

ShakthiYasas commented 3 years ago

@windowshopr there is not much to this. You just need to replace

train = K.function([self.actor.input, action, advantages], [], updates=updates)

with

train = K.function([self.actor.input, action, advantages], self.model.output, updates=updates)

windowshopr commented 3 years ago

Nope, there's no "self.model", only "self.actor" and "self.critic".

So continuing on, I've attempted the following changes:

    # make loss function for Policy Gradient
    # [log(action probability) * advantages] will be input for the back prop
    # we add entropy of action probability to loss
    def actor_optimizer(self):
        action = K.placeholder(shape=(None, self.action_size))
        advantages = K.placeholder(shape=(None, ))

        policy = self.actor.output

        good_prob = K.sum(action * policy, axis=1)
        eligibility = K.log(good_prob + 1e-10) * K.stop_gradient(advantages)
        loss = -K.sum(eligibility)

        entropy = K.sum(policy * K.log(policy + 1e-10), axis=1)

        actor_loss = loss + 0.01*entropy

        optimizer = Adam(lr=self.actor_lr)
        # updates = optimizer.get_updates(self.actor.trainable_weights, [], actor_loss)
        updates = optimizer.get_updates(params=self.actor.trainable_weights, loss=actor_loss)
        # train = K.function([self.actor.input, action, advantages], [], updates=updates)
        train = K.function([self.actor.input, action, advantages], self.actor.output, updates=updates)
        return train

    # make loss function for Value approximation
    def critic_optimizer(self):
        discounted_reward = K.placeholder(shape=(None, ))

        value = self.critic.output

        loss = K.mean(K.square(discounted_reward - value))

        optimizer = Adam(lr=self.critic_lr)
        # updates = optimizer.get_updates(self.critic.trainable_weights, [], loss)
        updates = optimizer.get_updates(params=self.critic.trainable_weights, loss=loss)
        # train = K.function([self.critic.input, discounted_reward], [], updates=updates)
        train = K.function([self.critic.input, discounted_reward], self.critic.output, updates=updates)
        return train

...which now gives the error:

AttributeError: module 'keras.backend' has no attribute 'set_session'

Assuming this is a Keras version issue, I replaced the Keras imports at the top with:

from tensorflow.compat.v1.keras.layers import Dense, Input
from tensorflow.compat.v1.keras.models import Model
from tensorflow.compat.v1.keras.optimizers import Adam
from tensorflow.compat.v1.keras import backend as K

...which now gives these errors in the output:

Exception in thread Thread-42:
Traceback (most recent call last):
  File "/usr/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "<ipython-input-27-2bcae64a7583>", line 173, in run
    action = self.get_action(state)
  File "<ipython-input-27-2bcae64a7583>", line 223, in get_action
    policy = self.actor.predict(np.reshape(state, [1, self.state_size]))[0]
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/training_v1.py", line 991, in predict
    use_multiprocessing=use_multiprocessing)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/training_arrays_v1.py", line 712, in predict
    callbacks=callbacks)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/training_arrays_v1.py", line 384, in model_iteration
    batch_outs = f(ins_batch)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/def_function.py", line 828, in __call__
    result = self._call(*args, **kwds)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/def_function.py", line 862, in _call
    results = self._stateful_fn(*args, **kwds)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/function.py", line 2941, in __call__
    filtered_flat_args) = self._maybe_define_function(args, kwargs)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/function.py", line 3361, in _maybe_define_function
    graph_function = self._create_graph_function(args, kwargs)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/function.py", line 3206, in _create_graph_function
    capture_by_value=self._capture_by_value),
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/func_graph.py", line 990, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/def_function.py", line 634, in wrapped_fn
    out = weak_wrapped_fn().__wrapped__(*args, **kwds)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/func_graph.py", line 977, in wrapper
    raise e.ag_error_metadata.to_exception(e)
TypeError: in user code:

    /usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/training.py:1478 predict_function  *
        return step_function(self, iterator)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/training.py:1467 step_function  **
        data = next(iterator)

    TypeError: 'list' object is not an iterator

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
<ipython-input-27-2bcae64a7583> in <module>()
    235 
    236     global_agent = A3CAgent(state_size, action_size, env_name)
--> 237     global_agent.train()

7 frames
/usr/local/lib/python3.7/dist-packages/matplotlib/cbook/__init__.py in to_filehandle(fname, flag, return_opened, encoding)
    401             fh = bz2.BZ2File(fname, flag)
    402         else:
--> 403             fh = open(fname, flag, encoding=encoding)
    404         opened = True
    405     elif hasattr(fname, 'seek'):

FileNotFoundError: [Errno 2] No such file or directory: './save_graph/cartpole_a3c.png'

As you can see, debugging this takes more effort than I'm willing to put in. I would urge you, or anyone more knowledgeable than me, to throw this into a Colab notebook, try to get it working, and document the necessary changes here. That would keep the code up to date, both for me and for anyone else who stumbles across this repo and would like to see it working. Thanks!
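The last error in that output, the FileNotFoundError for './save_graph/cartpole_a3c.png', is unrelated to Keras: the `save_graph` directory simply doesn't exist in a fresh Colab session. A small sketch of a guard follows. The helper name `ensure_parent_dir` is made up here, and the commented usage line assumes the plot is saved through matplotlib's `savefig`, as the traceback suggests.

```python
import os


def ensure_parent_dir(path):
    """Create the directory that will hold `path` if it is missing.

    Hypothetical helper; returns `path` unchanged so it can wrap a filename.
    """
    parent = os.path.dirname(path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    return path


# Assumed usage, wherever the script saves its score plot:
# pylab.savefig(ensure_parent_dir("./save_graph/cartpole_a3c.png"))
```

This only addresses the missing directory. The `'list' object is not an iterator` error raised in the worker threads is a separate issue.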

rlcode / reinforcement-learning
