Closed fenjiro closed 5 years ago
Hi,
Unlike supervised learning, in RL you can't always expect the loss to go down, while training an agent. Take DQN for example, the loss type is MSE, where we try to fit a network's prediction to the actual state-action value of the learned policy. But, during the training, the policy gets better, and thus its Q-values are expected to get better over time, i.e. always changing - a moving target. So the MSE will not go down, but instead will be some noisy changing signal, while the agent is actually improving at the task. You can track the loss with Dashboard, and see it changing over time to get a better understanding of its expected behavior, with one of the simpler (toy problems) presets.
For reinforcement learning with CARLA using the CARLA_Dueling_DDQN and CARLA_DDPG presets, the loss keeps increasing. Could you please help?