yhyu13 / AlphaGOZero-python-tensorflow

Congratulations to DeepMind! This is a re-engineered implementation (drawing on many other Git repositories in /support/) of DeepMind's October 19th publication: [Mastering the Game of Go without Human Knowledge]. The supervised learning approach is more practical for individuals. (This repository is for educational purposes only.)
MIT License
341 stars, 115 forks

main.py: error: unrecognized arguments: --policy=randompolicy #12

Closed: arisliang closed this issue 6 years ago

arisliang commented 6 years ago

Copying and pasting the command from the README produces an "unrecognized arguments" error for the policy argument.

What does this argument do?

yhyu13 commented 6 years ago

I apologize, that was an outdated argument. It should be --gtp_policy. You can find it here
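For context, "unrecognized arguments" is the message Python's argparse prints when a flag is not registered on the parser. Below is a minimal sketch of that behaviour, assuming main.py builds its command line with argparse (the error format suggests so). The parser here is a stand-in, not the repository's actual one; only the flag name --gtp_policy comes from this thread.

```python
import argparse


def build_parser():
    # Stand-in parser (assumption): only the --gtp_policy flag name is
    # taken from this thread; the real main.py defines more options.
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("--gtp_policy", default=None)
    return parser


def try_parse(argv):
    """Parse argv and return (namespace, leftover).

    parse_known_args collects unknown flags in `leftover` instead of
    exiting, which makes the mismatch easy to inspect.
    """
    return build_parser().parse_known_args(argv)


# The outdated flag is not registered, so it ends up in `leftover`.
ns, leftover = try_parse(["--policy=randompolicy"])
print("old flag ->", ns.gtp_policy, leftover)

# The current flag is parsed normally and nothing is left over.
ns, leftover = try_parse(["--gtp_policy=randompolicy"])
print("new flag ->", ns.gtp_policy, leftover)
```

With the stricter parse_args, the first call would instead exit with status 2 and print the error from the issue title.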

There are several strategies you can pick for the underlying Go-playing agent. The default is DNN+MCTS, while the random policy lets you check that the program runs end to end without invoking the expensive neural networks.
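The idea of swapping strategies can be sketched roughly as follows. This is an illustration only, not the code in utils/strategies.py: the function names, the "mctspolicy" key, and the representation of legal moves as a list of (row, column) tuples are all assumptions. Only the name "randompolicy" appears in this thread.

```python
import random


def random_policy(legal_moves, rng=random):
    """Pick a uniformly random legal move; return None (pass) if there are none."""
    if not legal_moves:
        return None
    return rng.choice(legal_moves)


def dnn_mcts_policy(legal_moves, rng=random):
    # Placeholder (hypothetical): the real agent would run tree search
    # guided by a trained network, which is the expensive part.
    raise NotImplementedError("requires a trained network")


# Hypothetical lookup table from policy name to move-selection function.
POLICIES = {
    "randompolicy": random_policy,
    "mctspolicy": dnn_mcts_policy,
}


def select_move(policy_name, legal_moves, rng=random):
    """Dispatch to the chosen policy, rejecting unknown names explicitly."""
    if policy_name not in POLICIES:
        raise ValueError(f"unknown policy: {policy_name!r}")
    return POLICIES[policy_name](legal_moves, rng)


# Smoke test on a 3x3 grid of empty points: no network is ever loaded.
moves = [(row, col) for row in range(3) for col in range(3)]
print(select_move("randompolicy", moves, random.Random(0)))
```

Because the random policy needs nothing but the list of legal moves, a run with it exercises argument parsing, board handling, and the move loop while skipping model loading entirely.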

For the implementation of the strategies, see utils/strategies.py. You might also be interested in the training process of this Go agent; the central training strategy can be found here