Thank you for the great feedback on my last blocker (#1). I moved the RND network into the agent so that it can be trained. To do so, I created an `RndDqnAgent` class that inherits from `DqnAgent`. The `_loss()` function has been modified to also compute the RND loss, which `_train()` uses both as the intrinsic reward and to train the RND network.
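For reference, the structure looks roughly like this (a minimal sketch, not the exact diff; the `rnd_target_network`/`rnd_predictor_network` constructor arguments and folding the RND loss into the returned `LossInfo` are simplified for illustration):

```python
import tensorflow as tf
from tf_agents.agents.dqn import dqn_agent


class RndDqnAgent(dqn_agent.DqnAgent):
  """DqnAgent that additionally trains an RND predictor network."""

  def __init__(self, *args, rnd_target_network=None,
               rnd_predictor_network=None, **kwargs):
    super(RndDqnAgent, self).__init__(*args, **kwargs)
    self._rnd_target_network = rnd_target_network        # frozen, random
    self._rnd_predictor_network = rnd_predictor_network  # trained

  def _loss(self, experience, *args, **kwargs):
    # Prediction error against the frozen target network; this drives
    # both the intrinsic reward and the RND predictor's training.
    observations = experience.observation
    target_features, _ = self._rnd_target_network(observations)
    predicted_features, _ = self._rnd_predictor_network(observations)
    rnd_loss = tf.reduce_mean(
        tf.square(tf.stop_gradient(target_features) - predicted_features))
    tf.compat.v2.summary.scalar(
        name='rnd_loss', data=rnd_loss, step=self.train_step_counter)
    loss_info = super(RndDqnAgent, self)._loss(experience, *args, **kwargs)
    # Simplified here: fold the RND loss into the total so that the
    # optimizer step in _train() also updates the predictor network.
    return loss_info._replace(loss=loss_info.loss + rnd_loss)
```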
However, I am noticing strange behavior in TensorBoard: the values are only updated every 1k steps, and the RND loss is not being plotted at all, even though `summary.scalar()` is invoked for it.
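Could the 1k-step granularity be related to the `record_if` wrapper that the standard TF-Agents `train_eval` examples put around the training loop? A sketch of that pattern, with its default `summary_interval` of 1000 (assuming my loop follows it):

```python
import tensorflow as tf

global_step = tf.compat.v1.train.get_or_create_global_step()
summary_interval = 1000  # default in the TF-Agents train_eval examples

# Any summary.scalar() call made inside this context (including ones made
# from within agent.train()) is only recorded when the condition is true,
# i.e. once every summary_interval steps.
with tf.compat.v2.summary.record_if(
    lambda: tf.math.equal(global_step % summary_interval, 0)):
  pass  # training loop goes here
```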
Do you have any ideas about what could be the cause of the problem?
(Please ignore the changes in `rnd_wrapper.py` and `nest_utils.py`, since they will be reverted.)