muzero Search Results - Githubissues

409 results
for muzero

Best match

werner-duvaud/muzero-general #62

Loss going up - is it ok?

Training the model for 10h (RTX6000) on Connect4. Is it ok that only the policy loss goes down over time, while others go up? If I understand correctly, lowering the learning rate might help? What…

Elijas updated 4 years ago
2
werner-duvaud/muzero-general #75

Unnecessary self-play after checkpoint load

I was doing training on a custom game, with ``replay_buffer`` of 1000 and a ``ratio`` of 1.5. In my training session, about 1100 games were played, so some of the games had to be removed from the repl…

astronasko updated 4 years ago
3
werner-duvaud/muzero-general #80

Influence of Network Layers on Model Ability

Mr. Duvaud, I have been working on implementations of mobile games -- mainly Clash Royale and Crossy Road -- using this MuZero repository in order to test its potential in modern, easy to learn, ha…

jhawgs updated 4 years ago
2
werner-duvaud/muzero-general #58

Retrying to connect to socket for pathname tcp://127.0.0.1:5…

Hi, I cloned your repo and tried to use it with a fresh environment. Unfortunately, it seems that it does not work properly, as I get the following error when trying to train or play against muzer…

seboz123 updated 4 years ago
4
werner-duvaud/muzero-general #69

lunarlander with low reward

When I load the pretrained model [here](https://github.com/werner-duvaud/muzero-general/blob/master/results/lunarlander/experiment1/model.weights), [the error](https://github.com/werner-duvaud/muzero-…

huajingyun updated 4 years ago
2
PaddlePaddle/PARL #348

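Reproducing the BicNet algorithm: problem with the bidirectional LSTM network

Are there any plans to reproduce the BicNet algorithm? While trying to write BicNet myself, I found that PaddlePaddle's bidirectional LSTM interface has a problem. Someone has already reported this issue: https://github.com/PaddlePaddle/Paddle/issues/22979#issue-579681421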

zienn updated 4 years ago
2
werner-duvaud/muzero-general #64

Resume after error

Hi after running for a few days, the training suddenly failed. ``` 2020-08-05 07:56:18,288 ERROR worker.py:1049 -- listen_error_messages_raylet: Connection closed by server. E0805 07:56:18.288650…

MuMaxAI updated 4 years ago
2
junxiaosong/AlphaZero_Gomoku #104

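After reading the code, I think this implementation has a problem; corrections welcome

I read through the author's code and think one part is wrong. As I understand it, the author simply assigns the same z value to every position in each game's replay: if I play one branch through to the end and White wins that game, every state in that game is labeled as a White win. However, none of the AlphaGo papers do it this way, including the AlphaGo Lee paper; from the very start they aggregate statistics over a single search (possibly thousands or tens of thousands of end_game playouts) to obtain a value for the current position, or a…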

ylf11235 updated 4 years ago
1
werner-duvaud/muzero-general #40

Loss: nan

I often get this error after training for a few hours. It has happened in all the games I've tried (but I've only tried two-player games). The error message below is from tictactoe. If this only happe…

guskal01 updated 4 years ago
7
werner-duvaud/muzero-general #55

Alternating reward signs in backpropagate

In https://github.com/werner-duvaud/muzero-general/blob/98cb784a06a8c25fe4a99a3c71d5358b357c5ef0/self_play.py#L384 the value to be backpropagated along the search path incorporates rewards from…

fidel-schaposnik updated 4 years ago
8
