I tried running V-MPO on Atari Breakout, and it didn't seem to gain any momentum. Is there any reason why this might be? I tried changing some of the parameters in the config file and still didn't get any improvement. Is this how it is supposed to be at the beginning of training?