michaelnny / deep_rl_zoo

A collection of Deep Reinforcement Learning algorithms implemented with PyTorch to solve Atari games and classic control tasks like CartPole, LunarLander, and MountainCar.
Apache License 2.0

C51 agent not working on Pong #3

Closed michaelnny closed 2 years ago

michaelnny commented 2 years ago

It seems like the C51 agent is not working on Pong, and it can be unstable on CartPole sometimes.

Normally, for DQN-like agents using an ε-greedy policy, we'd expect the agent to make some progress once 'exploration_epsilon' drops below 0.7, but that's not the case for the C51 agent. What makes this issue more interesting is that the Rainbow agent works very well on Pong, and the code for Rainbow is almost identical to the C51 agent, except that in C51 we don't use noisy layers.

Things we've tried so far that did not solve the issue:

michaelnny commented 2 years ago

After some experiments, we found that the loss for the C51 agent does not come down as fast as it does for other DQN-like agents. As a result, with '--exploration_epsilon_decay_step=500000' the agent failed to learn.

With '--exploration_epsilon_decay_step=1000000', it actually starts to make some progress after 400k-500k frames.
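To illustrate why the decay length matters, here is a minimal sketch of a linear epsilon schedule. It is not the repository's actual implementation: the function name `linear_epsilon`, the start value 1.0, and the end value 0.01 are assumptions made for illustration, and the step counter is assumed to be in the same units as the decay setting.

```python
def linear_epsilon(step, decay_steps, begin=1.0, end=0.01):
    """Linearly anneal epsilon from `begin` to `end` over `decay_steps`, then hold.

    `begin` and `end` are illustrative defaults, not values taken from the repo.
    """
    # Fraction of the decay period completed, clipped to [0, 1].
    fraction = min(max(step / decay_steps, 0.0), 1.0)
    return begin + fraction * (end - begin)


if __name__ == "__main__":
    # Compare the two decay lengths discussed above at the same point in training.
    for decay_steps in (500_000, 1_000_000):
        eps = linear_epsilon(450_000, decay_steps)
        print(f"decay_steps={decay_steps}: epsilon at step 450000 is {eps:.2f}")
```

Under these assumptions, the shorter schedule has almost finished decaying by step 450k, while the longer one still keeps epsilon above 0.5. If the C51 loss really does come down more slowly, the shorter schedule may cut off exploration before the value distribution is accurate enough to act on greedily.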
