google-deepmind / deepmind-research

This repository contains implementations and illustrative code to accompany DeepMind publications

[RL Unplugged] Gravitar loss starts at ~0.000 #156

Closed · arjung128 closed this issue 3 years ago

arjung128 commented 3 years ago

I am using the code in atari_dqn.ipynb to train a policy for Gravitar from scratch (on a single run, i.e. 100 shards of data), and this is what my loss log looks like so far:

[Learner] Loss = 0.002 | Steps = 3795 | Walltime = 284.329
[Learner] Loss = 0.000 | Steps = 8004 | Walltime = 584.381
[Learner] Loss = 0.002 | Steps = 12446 | Walltime = 884.459
[Learner] Loss = 0.002 | Steps = 17415 | Walltime = 1184.462
[Learner] Loss = 0.004 | Steps = 21657 | Walltime = 1484.513
[Learner] Loss = 0.000 | Steps = 25803 | Walltime = 1784.591
[Learner] Loss = 0.000 | Steps = 30162 | Walltime = 2084.610

The reported loss reaches 0.000 in under 10 minutes and then just seems to oscillate very close to 0. Is such a low loss expected, or does this suggest an issue with the code I'm running?
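
A quick way to tell whether the loss is genuinely zero or only looks that way is to print it with more precision. A minimal sketch (the value below is made up for illustration, not taken from this run):

loss = 4.7e-4  # illustrative value only, not from the actual run

print(f'[Learner] Loss = {loss:.3f}')  # -> Loss = 0.000 (same 3-decimal format as the log above)
print(f'[Learner] Loss = {loss:.6f}')  # -> Loss = 0.000470
print(f'[Learner] Loss = {loss:.3e}')  # -> Loss = 4.700e-04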

arjung128 commented 3 years ago

Logging the loss to TensorBoard showed that the relevant significant figures were truncated in the output above, and that the loss does in fact decrease nicely with the provided code.
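
For anyone who hits the same thing, here is a minimal sketch of writing the learner loss to TensorBoard with tf.summary so that small values are kept at full precision. The log directory, tag name, and step values below are assumptions for illustration, not how the notebook wires up its logging:

import tensorflow as tf

# Hypothetical log directory; point this wherever you keep TensorBoard runs.
writer = tf.summary.create_file_writer('/tmp/gravitar_dqn/tb')

def log_loss(loss, step):
  # Scalars written via tf.summary keep full float precision,
  # unlike the 3-decimal terminal printout above.
  with writer.as_default():
    tf.summary.scalar('learner/loss', loss, step=step)

# Illustrative call, e.g. from inside the training loop:
log_loss(loss=4.7e-4, step=3795)

Then run `tensorboard --logdir /tmp/gravitar_dqn/tb` to inspect the curve.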