A project-focused experimental optimizer which optimizes over state spaces with limited prior knowledge through self-play. See the Issues for planned features and optimizations.
The goal is to provide an API which makes it easier to set up data-parallel learning loops and to optimize searches by exploiting symmetries, with help from the trait system.
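As a rough illustration of how a trait can expose a symmetry to a search, the sketch below deduplicates states by a canonical representative. The names `Canonical`, `Necklace`, `count_classes`, and `all_necklaces` are hypothetical and are not taken from this crate's API. The toy symmetry is rotation of a 4-bit cyclic word.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Hypothetical trait (not this crate's API): a state that can name one
/// representative of its symmetry class.
trait Canonical: Sized {
    fn canonical(&self) -> Self;
}

/// Toy state: a cyclic word of 4 bits, considered up to rotation.
#[derive(Clone, Copy, PartialEq, Eq, Hash, PartialOrd, Ord, Debug)]
struct Necklace([u8; 4]);

impl Canonical for Necklace {
    /// The representative is the lexicographically smallest rotation.
    fn canonical(&self) -> Self {
        (0..4)
            .map(|r| {
                let mut w = self.0;
                w.rotate_left(r);
                Necklace(w)
            })
            .min()
            .unwrap()
    }
}

/// Count symmetry classes among `states` by deduplicating on canonical form.
/// A search could key its visited set the same way to avoid re-exploring
/// states that differ only by a symmetry.
fn count_classes<S: Canonical + Eq + Hash>(states: impl IntoIterator<Item = S>) -> usize {
    states
        .into_iter()
        .map(|s| s.canonical())
        .collect::<HashSet<_>>()
        .len()
}

/// Enumerate all 16 raw 4-bit words.
fn all_necklaces() -> Vec<Necklace> {
    (0u8..16)
        .map(|m| Necklace([m & 1, (m >> 1) & 1, (m >> 2) & 1, (m >> 3) & 1]))
        .collect()
}
```

Under these assumptions, the 16 raw words collapse to 6 rotation classes, so a search keyed on `canonical()` would store fewer than half as many entries.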
It combines ideas from the papers:
It mainly uses:
- dfdx for type-checked neural networks,
- rayon for data-parallel MCTS exploration,
- python to Tensorboard.

Feedback on a high-level API is welcome. Please make Github Issues before making PRs.
Our first implementation can be summarized:
We restrict our attention to state spaces with a finite number of dimensions whose actions are discrete and commonly enumerated. Our most thorough example uses the INTMinTree, as demonstrated here. Examples 01 through 04 have prototypes that will be re-implemented similarly.
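To make the restriction concrete, here is a minimal sketch of a space with a fixed dimension and actions enumerated by index. The trait `EnumeratedSpace` and the toy `BitSpace` are hypothetical names chosen for illustration. They are not the crate's `StateActionSpace` and do not reflect its actual signature.

```rust
/// Hypothetical trait for illustration only; not the crate's `StateActionSpace`.
trait EnumeratedSpace {
    type State: Clone;

    /// Number of discrete actions; actions are indexed `0..NUM_ACTIONS`.
    const NUM_ACTIONS: usize;

    /// Whether action `a` may be taken from `state`.
    fn is_legal(state: &Self::State, a: usize) -> bool;

    /// Apply a legal action in place.
    fn act(state: &mut Self::State, a: usize);

    /// Indices of all legal actions, in increasing order.
    fn legal_actions(state: &Self::State) -> Vec<usize> {
        (0..Self::NUM_ACTIONS)
            .filter(|&a| Self::is_legal(state, a))
            .collect()
    }
}

/// Toy space: a state is a bit vector of fixed length `N`,
/// and action `i` sets bit `i` if it is not already set.
struct BitSpace<const N: usize>;

impl<const N: usize> EnumeratedSpace for BitSpace<N> {
    type State = [bool; N];
    const NUM_ACTIONS: usize = N;

    fn is_legal(state: &[bool; N], a: usize) -> bool {
        a < N && !state[a]
    }

    fn act(state: &mut [bool; N], a: usize) {
        debug_assert!(Self::is_legal(state, a));
        state[a] = true;
    }
}
```

Because every action has a stable index, a fixed-size network output or a per-action statistics array can be aligned with `0..NUM_ACTIONS`, with illegal actions simply masked out.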
The design of the MCTS and the formulation of StateActionSpace and the INTMinTree. All variants of the MCTS will be refactored with a similar design. It may be useful to create a CLI to alter hyperparameters during training.
License
Dual-licensed.
Licensed under the Apache License, Version 2.0 or the MIT license, at your option. This file may not be copied, modified, or distributed except according to those terms.