YuriCat / MuesliJupyterExample

No target network? #2

Open qianfangjj opened 2 years ago

qianfangjj commented 2 years ago

This implementation does not seem to use the target network described in the paper. Is that intentional?

YuriCat commented 2 years ago

I implemented this algorithm based on MuZero, so I assumed that the training targets are computed with the same network that generates the episodes. However, as you point out, the paper uses a target network. I will consider whether that is better!
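For reference, one common way to maintain a target network is as an exponential moving average (EMA) of the online network's parameters. The sketch below is only an illustration under that assumption and is not code from this repository. The helper `ema_update`, the toy parameter dictionaries, and the rate `alpha=0.1` are all hypothetical.

```python
def ema_update(target_params, online_params, alpha=0.1):
    """Move each target parameter a fraction `alpha` toward the online value.

    Hypothetical helper: parameters are plain floats keyed by name here;
    a real network would apply the same rule to every weight tensor.
    """
    return {
        name: (1.0 - alpha) * target_params[name] + alpha * online_params[name]
        for name in online_params
    }


# Toy parameters standing in for network weights (illustrative values).
online = {"w": 1.0, "b": -2.0}
target = {"w": 0.0, "b": 0.0}

# After each training step the target slowly tracks the online network.
target = ema_update(target, online, alpha=0.1)
```

Training targets would then be computed from `target` rather than from the network used for episode generation, so they change more slowly than the online parameters.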

qianfangjj commented 2 years ago

Thank you for your reply. The model loss described in Equation 13 of the paper also seems to be missing. Why is it not used? Have you tried adding this loss term? By the way, have you considered reproducing the paper's results on Atari games?