Closed mpnunez closed 2 months ago
Q predictions with MSE cost function often explode when the policy gets stuck
Gradient clipping did not help when I tried it. It is used more for RNNs and very deep networks. Some other issue is causing the loss to explode in our DQN algorithm.
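For anyone reading later, this is roughly what gradient clipping by global norm does. It is a minimal NumPy sketch, not the code from this repo: the function name `clip_by_global_norm`, the example gradients, and the `max_norm` value are all made up for illustration.

```python
import numpy as np


def clip_by_global_norm(grads, max_norm):
    """Rescale a list of gradient arrays so their joint L2 norm is at most max_norm.

    Returns the (possibly rescaled) gradients and the norm before clipping.
    """
    # Joint L2 norm over every entry of every gradient array.
    global_norm = float(np.sqrt(sum(np.sum(g ** 2) for g in grads)))
    if global_norm <= max_norm:
        # Already within the bound: return copies unchanged.
        return [g.copy() for g in grads], global_norm
    scale = max_norm / global_norm
    return [g * scale for g in grads], global_norm


# Hypothetical gradients for two parameter tensors.
grads = [np.array([3.0, 4.0]), np.array([12.0])]
clipped, norm_before = clip_by_global_norm(grads, max_norm=1.0)
norm_after = float(np.sqrt(sum(np.sum(g ** 2) for g in clipped)))
```

Clipping only bounds the size of each parameter update. It does not change the regression targets the MSE loss is fit to, so it may not help if the targets themselves are what is growing.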