YuriCat / MuZeroJupyterExample

Recent progress #4

Open ZHANGRUI666 opened 4 years ago

ZHANGRUI666 commented 4 years ago

Hi Yuri, how are you? I'd like to know about your recent progress on the MuZero project. Has your model converged? I built my own MuZero to play renju, but after several hundred epochs of training it has also shown little improvement, which is rather frustrating. Could you share some intermediate outputs of the model, such as the hidden state, the predicted probabilities, or the estimated value for one particular board configuration? We could compare notes and analyse the problem together.

YuriCat commented 4 years ago

Hi,

These days I'm not touching the MuZero code. I would appreciate it if you find new key points for achieving good results.

After other RL experiments, I found that ReLU sometimes worked worse than other activations for small neural nets.

ZHANGRUI666 commented 4 years ago

Thanks for your reply, Yuri! I'm glad to hear from you. I have been inspecting the hidden states these days and am trying to uncover as many details as possible. If I discover anything, I will tell you. I think ReLU-based neural nets do have some flaws, and I look forward to your further progress in this field.

Figure_1

YuriCat commented 4 years ago

Hi, @ZHANGRUI666

I found a careless bug in the tree search method: the encoded abstract state was not being updated when descending the search tree. Unbelievable! After I fixed it, training seems to be going well.
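
For anyone hitting the same problem, here is a minimal sketch of the kind of fix described above. It is not the code from this repository: `Node`, `select_action`, `descend`, and `toy_dynamics` are hypothetical names, the selection rule is a placeholder rather than PUCT, and the "dynamics" is a toy function standing in for the learned network. The only point is that the abstract state has to be advanced on every step down the tree, so the state reached at the leaf reflects the whole action path rather than the root.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Stand-in for the encoded (abstract) state; a real model would use a tensor.
State = Tuple[int, ...]


@dataclass
class Node:
    visit_count: int = 0
    children: Dict[int, "Node"] = field(default_factory=dict)


def select_action(node: Node) -> int:
    # Placeholder rule (assumption): pick the least-visited child,
    # breaking ties by the lowest action id. A real search would use PUCT.
    return min(node.children, key=lambda a: (node.children[a].visit_count, a))


def descend(
    root: Node,
    root_state: State,
    dynamics: Callable[[State, int], State],
) -> Tuple[Node, State, List[int]]:
    """Walk from the root to a leaf, updating the abstract state at each step."""
    node, state, path = root, root_state, []
    while node.children:
        action = select_action(node)
        # The crucial line: without this update, the leaf would be
        # evaluated with the root's state instead of its own.
        state = dynamics(state, action)
        node = node.children[action]
        path.append(action)
    return node, state, path


def toy_dynamics(state: State, action: int) -> State:
    # Toy dynamics (assumption): the state simply records the actions taken.
    return state + (action,)


# Small tree: root -> {0 (visited 3 times), 1 -> {2}}
root = Node(children={0: Node(visit_count=3), 1: Node(children={2: Node()})})
leaf, leaf_state, path = descend(root, (), toy_dynamics)
```

With the toy dynamics, `leaf_state` ends up equal to the tuple of actions in `path`, which makes it easy to check that the state really follows the descent. If the update line is dropped, `leaf_state` stays at the root state no matter how deep the search goes.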