-
I'm new to Deep Reinforcement Learning, and my supervisor has shown great interest in this work. I have successfully run the code here, but I don't know how to connect this work to the games. I want to since…
-
Right now it seems that all proof checks happen at the end. This violates the Markovian assumption of Lean gym, since a bad tactic upstream can lead to a failure of the final tactic step. Most of the cu…
-
Hi, my understanding is that AlphaZero is an on-policy algorithm, and on-policy algorithms shouldn't use experience replay, yet I see experience replay being used in the code. I'd like to know whether my understanding is correct.
https://github.com/initial-h/AlphaZero_Gomoku_MPI/blob/95867cb7e524ebe9c77a926c82091…
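To make the question concrete, here is a minimal sketch (my own, not the repo's actual code) of the pattern I'm referring to: a bounded buffer filled with self-play data and sampled uniformly for updates, which to me looks exactly like experience replay:

```python
import random
from collections import deque

# My own minimal sketch of the pattern, not the repo's actual code:
# a bounded buffer of recent self-play data, sampled uniformly for updates.
buffer = deque(maxlen=10000)  # older (more off-policy) data is evicted

def store_self_play_game(play_data):
    """play_data: list of (state, mcts_probs, winner) tuples from one game."""
    buffer.extend(play_data)

def sample_batch(batch_size=512):
    """Uniform sampling over the recent window -- this is what looks like replay to me."""
    return random.sample(buffer, min(batch_size, len(buffer)))
```

Since `maxlen` bounds how stale the data can get, maybe the answer is that this is only "slightly off-policy"? That's the part I'd like to have confirmed.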
-
I've been getting crashes with the error below, on and off and unpredictably, while training my model with the Python API. Sometimes training runs fine for days; other times I can't get a si…
-
It looks like AlphaZero's implementation samples uniformly from a replay buffer at each training step. I wonder why they do this instead of iterating through it in batches. I am not sure why you would c…
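To illustrate the difference I mean, here is a toy sketch of my own (not code from the implementation): positions from the same game are strongly correlated, so slicing the buffer in order puts one game's near-duplicate positions into a single batch, while uniform sampling mixes positions from many games.

```python
import random

# Toy sketch (mine, not from the implementation): a buffer of 100 games
# with 40 positions each, stored in the order the games were played.
buffer = [(game_id, move_no) for game_id in range(100) for move_no in range(40)]

def sequential_batch(step, batch_size=40):
    """Iterate through the buffer in order, as I was suggesting."""
    start = (step * batch_size) % len(buffer)
    return buffer[start:start + batch_size]

def uniform_batch(batch_size=40):
    """Sample uniformly with replacement, as AlphaZero appears to do."""
    return [random.choice(buffer) for _ in range(batch_size)]

games_seq = {g for g, _ in sequential_batch(0)}  # covers exactly one game
games_uni = {g for g, _ in uniform_batch()}      # mixes many games
```

My guess is that the uniform sampling is there to decorrelate batches (and to let each position be reused more than once), but I'd like to hear the actual reason.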
-
I recently noticed that the official 星阵 (Golaxy) site has been running targeted training against KataGo. I think we could do a targeted training run against 星阵 (web version) until it can beat 星阵 at the same hardware configuration. With two 2080 Ti GPUs, it could beat 3X
-
[bz#275]
I think it would be better if we allowed the game creators themselves to change the color of the damage numbers in battle, so that they can indicate whether the damage is critical (red-colored), weak (yellow…
-
Considering that the AG paper is mainly themed around "MCTS as a policy improvement operator", is it possible to do training without full games?
That is, just take any board position, and train the …
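Roughly, what I have in mind (my own sketch of the AlphaGo Zero policy target applied to a single sampled position rather than a full game; not code from the paper):

```python
import numpy as np

# Sketch of training from an arbitrary position: run MCTS at state s, then
# push the policy head toward the visit-count distribution
# pi(a|s) proportional to N(s,a)^(1/T). (The open question is the value
# target, which normally comes from the final game outcome.)
def policy_target(visit_counts, temperature=1.0):
    counts = np.asarray(visit_counts, dtype=np.float64) ** (1.0 / temperature)
    return counts / counts.sum()

def policy_loss(net_log_probs, visit_counts, temperature=1.0):
    """Cross-entropy between the MCTS target and the network's log-probs."""
    pi = policy_target(visit_counts, temperature)
    return -float(np.dot(pi, np.asarray(net_log_probs, dtype=np.float64)))
```

The policy part seems well-defined for an isolated position; it's the value head that I'm unsure how to train without playing the game out.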
-
When I try to run `python tic_tac_toe_alpha_zero.py` I get this error:
```
Exception caught in actor-0: Failed call to cuDeviceGet: CUDA_ERROR_NOT_INITIALIZED: initialization error
actor-0 exiting
…
```
-
Currently, the strongest AI on CGoS is one that exclusively uses the black hole opening, beating KataGo something like 70-90% of the time. But is it a blind spot in KataGo that makes it unable to han…