Closed shindavid closed 2 months ago
Some notes on the first couple of changes:
On arena-allocated policy-vecs: do you envision these to be variable-sized, or of a fixed-size big-enough to handle the max-branching-factor states of the game?
I currently store them local to the node, using a fixed size. This is undesirable because of the wasted space and because it requires a game-specific upper bound on the branching factor. But it does allow for lazy child-node allocation, which is itself a form of space saving.
Variable-sized can be easily achieved by just using `new float[]` and `delete[]` and relying on `malloc` to do the memory management work. With more work we can make sure all those floats come from one big memory pool that we allocate up-front, but it's not obvious that this will be much of a win and worth the effort.
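For comparison, here is a rough sketch of what the "one big memory pool" variant could look like. `FloatArena` and its methods are hypothetical names made up for illustration, not code from either project, and the sketch assumes a single thread and a capacity chosen up front.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical bump allocator: one big float buffer reserved up front,
// from which variable-sized policy arrays are carved out.
class FloatArena {
 public:
  explicit FloatArena(std::size_t capacity) : buffer_(capacity), used_(0) {}

  // Returns a pointer to n floats, or nullptr if the arena is exhausted.
  float* allocate(std::size_t n) {
    assert(used_ <= buffer_.size());
    if (n > buffer_.size() - used_) return nullptr;
    float* out = buffer_.data() + used_;
    used_ += n;
    return out;
  }

  // Releases every array at once; previously returned pointers become invalid.
  void clear() { used_ = 0; }

  std::size_t used() const { return used_; }

 private:
  std::vector<float> buffer_;
  std::size_t used_;
};
```

Individual arrays cannot be freed in this sketch, only the whole arena, which is the trade-off against plain `new float[]` / `delete[]`.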
I was definitely thinking of making them variable-sized. One of the major reasons for doing this in the first place is to save memory.
To clarify, there are two major ways to store policy:
The first way sadly doesn't work well with lazy child initialization, so I don't think this is the best long-term solution.
For the second approach we can separately decide where to store this array:
- `malloc` or `std::vector` allocations: simple, but will be slightly slower, especially when deallocating the tree (since you need to call `free` once per node)

I think the right solution is an array of floats in the parent and either of the last two options for where to store it. We can start with `malloc` and see if it's good enough.
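A minimal sketch of the "array of floats in the parent, allocated with `malloc`" idea, to show why it plays well with lazy child initialization. `ParentNode`, `child()`, and the uniform placeholder priors are assumptions for illustration only; a real node would also carry visit counts, values, and so on.

```cpp
#include <cassert>
#include <cstdlib>
#include <memory>
#include <new>
#include <vector>

// Hypothetical parent node: owns a malloc'd prior array sized to its own
// number of legal moves, so no game-wide branching-factor bound is needed.
struct ParentNode {
  int num_moves;
  float* priors;                                      // one prior per move
  std::vector<std::unique_ptr<ParentNode>> children;  // null until expanded

  explicit ParentNode(int n)
      : num_moves(n),
        priors(static_cast<float*>(std::malloc(sizeof(float) * n))),
        children(n) {
    if (n > 0 && priors == nullptr) throw std::bad_alloc();
    for (int i = 0; i < n; ++i) priors[i] = 1.0f / n;  // placeholder priors
  }
  ~ParentNode() { std::free(priors); }  // the one free() call per node
  ParentNode(const ParentNode&) = delete;
  ParentNode& operator=(const ParentNode&) = delete;

  // Creates child i on first visit. Its prior is readable via priors[i]
  // before the child exists, which is what makes lazy allocation possible.
  ParentNode& child(int i, int child_num_moves) {
    assert(i >= 0 && i < num_moves);
    if (!children[i]) children[i] = std::make_unique<ParentNode>(child_num_moves);
    return *children[i];
  }
};
```

The per-slot child pointer is only there to keep the sketch short; it costs more per move than the prior itself.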
If using an object pool, we will similarly need to free the memory.
Unless...you are thinking of going with the approach of just doing a big clear of the entire pool in between searches, like kZero currently does for nodes. I feel like this approach won't work with tree-reuse/MCGS.
Yeah, that's a good point, all of the non-malloc solutions become quite a bit more complicated with MCGS. Tree reuse by itself isn't a big deal: a single pass over the tree is enough to compact everything again.
I just remember being frustrated by running a long search (~1M nodes), extracting the result, and then having to wait a couple of seconds at the end just to free everything. That's not a big deal yet though, and certainly not the first priority to optimize!
I experienced the same frustration, and so I used to have a separate cleanup thread responsible for freeing nodes. When deleting a node, the main thread would just put it on a cleanup queue, which the cleanup thread would process asynchronously. But I removed this at some point, and I don't remember exactly why. Maybe there was a bad interaction with MCGS? In any case, I think this should be doable in principle.
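For reference, a cleanup thread along these lines could be sketched as follows. This is not the removed implementation: `TreeNode`, `CleanupThread`, `discard()`, and the `g_freed_nodes` counter are hypothetical, and the sketch assumes a plain tree with unique ownership, so it does not address the possible MCGS interaction mentioned above.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

std::atomic<int> g_freed_nodes{0};  // instrumentation for the sketch only

// Hypothetical subtree type; destroying the root recursively frees children.
struct TreeNode {
  std::vector<std::unique_ptr<TreeNode>> children;
  ~TreeNode() { ++g_freed_nodes; }
};

class CleanupThread {
 public:
  CleanupThread() : worker_([this] { run(); }) {}

  ~CleanupThread() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      done_ = true;
    }
    cv_.notify_one();
    worker_.join();  // the worker drains the queue before exiting
  }

  // Called by the search thread: only enqueues, no freeing happens here.
  void discard(std::unique_ptr<TreeNode> root) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      queue_.push(std::move(root));
    }
    cv_.notify_one();
  }

 private:
  void run() {
    for (;;) {
      std::unique_ptr<TreeNode> victim;
      {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return done_ || !queue_.empty(); });
        if (queue_.empty()) return;  // done_ is set and nothing is left
        victim = std::move(queue_.front());
        queue_.pop();
      }
      victim.reset();  // the expensive recursive free runs outside the lock
    }
  }

  std::mutex mutex_;
  std::condition_variable cv_;
  std::queue<std::unique_ptr<TreeNode>> queue_;
  bool done_ = false;
  std::thread worker_;  // declared last so the other members exist first
};
```

The search thread would call `discard(std::move(old_root))` and return immediately; the wait at the end of a long search moves to whenever the `CleanupThread` is destroyed.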
All items listed here are finally completed!
Some notes:
- `Node` and `edge_t` types. Each `edge_t` has a policy probability scalar member.
- `int` pool-index. This value is used to look up the `Node` or `edge_t` object from game-thread-specific object-pools.
- `edge_t` has a lazily-instantiated child-node-index. Two or more edges (belonging to different parents) can share the same child-node-index (due to MCGS).
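A rough sketch of how the pieces in these notes could fit together. Only the `Node` / `edge_t` names, the per-edge policy scalar, the `int` pool index, and the lazily-set child index come from the notes; the `Pools` struct, the `-1` sentinel, the contiguous edge layout, and the `expand()` helper are assumptions made for illustration.

```cpp
#include <cassert>
#include <vector>

using pool_index_t = int;
constexpr pool_index_t kNull = -1;  // assumption: -1 means "not instantiated"

struct edge_t {
  float policy_prior = 0.0f;         // policy probability for this move
  pool_index_t child_index = kNull;  // set lazily on first expansion
};

struct Node {
  pool_index_t first_edge = kNull;  // assumption: a node's edges are contiguous
  int num_edges = 0;
};

// Hypothetical per-game-thread object pools, indexed by pool_index_t.
struct Pools {
  std::vector<Node> nodes;
  std::vector<edge_t> edges;

  pool_index_t add_node(int num_edges) {
    Node n;
    n.first_edge = static_cast<pool_index_t>(edges.size());
    n.num_edges = num_edges;
    edges.resize(edges.size() + num_edges);
    nodes.push_back(n);
    return static_cast<pool_index_t>(nodes.size() - 1);
  }

  edge_t& edge(pool_index_t node, int i) {
    assert(i >= 0 && i < nodes[node].num_edges);
    return edges[nodes[node].first_edge + i];
  }

  // Instantiates the child on first use. Passing `existing` links the edge
  // to an already-known node instead, which is how two edges from different
  // parents end up sharing one child index in a graph search.
  pool_index_t expand(pool_index_t node, int i, int child_num_edges,
                      pool_index_t existing = kNull) {
    if (edge(node, i).child_index == kNull) {
      // add_node may reallocate the edge pool, so re-look-up the edge after.
      pool_index_t child = (existing != kNull) ? existing : add_node(child_num_edges);
      edge(node, i).child_index = child;
    }
    return edge(node, i).child_index;
  }
};
```

Storing indices rather than pointers keeps references valid when a pool's backing storage grows, at the cost of one extra lookup per access.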
I've learned many things from discussions on various Discord channels, and from discussions with the author of the very similar kZero project. Here is a list of some of those things: