rlcode / reinforcement-learning

Minimal and Clean Reinforcement Learning Examples
MIT License

Variable Tensor("Neg:0", shape=(), dtype=float32) has `None` for gradient. #104

Closed ShakthiYasas closed 3 years ago

ShakthiYasas commented 3 years ago

Hey all!

I get the following issue when running reinforce_agent.py and everything under 3-atari. Prior to this, these same lines caused a Tensor-to-NumPy-array issue, which was resolved by adding:

from tensorflow.python.framework.ops import disable_eager_execution
disable_eager_execution()
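
For reference, a sketch of where those two lines would sit at the top of reinforce_agent.py, assuming a typical import layout for this agent (the exact import list here is an assumption, not the repository's verbatim code):

    import numpy as np
    from tensorflow.keras.layers import Dense
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.optimizers import Adam
    from tensorflow.keras import backend as K
    from tensorflow.python.framework.ops import disable_eager_execution

    # The eager-execution switch has to run before any Keras model is built,
    # since optimizer() relies on graph-mode K.placeholder / K.function.
    disable_eager_execution()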

With those lines added, I'm now faced with this issue. I've tried various solutions, including using K.eval(loss) beforehand, but that caused other issues. My TensorFlow version is 2.4.1, Keras version is 2.4.3, and NumPy version is 1.19.5.

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense (Dense)                (None, 24)                384       
_________________________________________________________________
dense_1 (Dense)              (None, 24)                600       
_________________________________________________________________
dense_2 (Dense)              (None, 5)                 125       
=================================================================
Total params: 1,109
Trainable params: 1,109
Non-trainable params: 0
_________________________________________________________________
Traceback (most recent call last):
  File "1-grid-world/7-reinforce/reinforce_agent.py", line 95, in <module>
    agent = ReinforceAgent()
  File "1-grid-world/7-reinforce/reinforce_agent.py", line 28, in __init__
    self.optimizer = self.optimizer()
  File "1-grid-world/7-reinforce/reinforce_agent.py", line 55, in optimizer
    updates = optimizer.get_updates(self.model.trainable_weights, loss)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py", line 727, in get_updates
    grads = self.get_gradients(loss, params)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py", line 719, in get_gradients
    raise ValueError("Variable {} has `None` for gradient. "
ValueError: Variable Tensor("Neg:0", shape=(), dtype=float32) has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.

Any solution to this? @Hyeokreal @keon

ShakthiYasas commented 3 years ago

This was resolved by making the following change to optimizer(): pass loss and params to get_updates as keyword arguments. In TF2, OptimizerV2.get_updates takes (loss, params), so the original positional call handed the trainable weights in as the loss, which is why the loss tensor Neg:0 appears in the error message.

    def optimizer(self):
        action = K.placeholder(dtype=float, shape=(None, 5))
        discounted_rewards = K.placeholder(shape=(None,))

        # Calculate the cross-entropy (policy-gradient) loss
        action_prob = K.sum(action * self.model.output, axis=1)
        cross_entropy = K.log(action_prob) * discounted_rewards
        loss = -K.sum(cross_entropy)

        # Create the training function; note that loss and params are now
        # passed to get_updates as keyword arguments
        optimizer = Adam(lr=self.learning_rate)
        updates = optimizer.get_updates(params=self.model.trainable_weights, loss=loss)
        train = K.function(inputs=[self.model.input, action, discounted_rewards],
                           outputs=self.model.output,
                           updates=updates)

        return train
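
For completeness, the returned train function is then called once per episode with the collected states, one-hot actions, and normalized discounted returns. A rough sketch of such a call (the toy episode data and the discount helper below are illustrative assumptions, not the repository's exact training loop; the 15-dimensional state and 5 actions match the model summary above):

    import numpy as np

    def discount_rewards(rewards, gamma=0.99):
        # Backward accumulation of discounted returns for REINFORCE
        discounted = np.zeros(len(rewards), dtype=np.float32)
        running = 0.0
        for t in reversed(range(len(rewards))):
            running = running * gamma + rewards[t]
            discounted[t] = running
        return discounted

    # Toy episode: three steps, 15-dim states, 5 possible actions
    states = [np.random.rand(1, 15).astype(np.float32) for _ in range(3)]
    actions = [np.eye(5, dtype=np.float32)[a].reshape(1, 5) for a in (0, 2, 4)]
    rewards = [0.0, 0.0, 1.0]

    returns = discount_rewards(rewards)
    returns = (returns - returns.mean()) / (returns.std() + 1e-8)  # normalize for stability

    # With agent = ReinforceAgent() as in the script, agent.optimizer is the
    # K.function returned above; one policy-gradient update per episode:
    agent.optimizer([np.vstack(states), np.vstack(actions), returns])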