Open arr28 opened 9 years ago
We're trying to fetch a node with NULL_REF. Comments on the call stack...
1. CappedPool.get
   - We're looking at array index -1 in the pool. Oops.
2. TreeNode.get(CappedPool, long)
   - We would have asserted xiNodeRef != NULL_REF if assertions were enabled. (NULL_REF is -1.)
3. TreeNode.get(long) just passes through to (2), using the same pool as the pool used for this node. I think we only have multiple pools when we have multiple factors and I don't think Reversi is considered to have multiple factors.
4. TreeNode.checkChildCompletion. Okay, so here's where we must be going wrong. This is the code...

    // No need to process hyper-edges to determine completion
    if ( edge.isHyperEdge() )
    {
      break;
    }
    if (edge.getChildRef() == NULL_REF)
    {
      allImmediateChildrenComplete = false;
    }
    else if (get(edge.getChildRef()) != null)
    {
      TreeNode lNode = get(edge.getChildRef());
      <...>
We're failing in the else if.
Have we hit a race condition where the edge has been freed between the check edge.getChildRef() == NULL_REF and the subsequent attempt to get the edge? Is that even something that can happen?
If so, the fix is simply to pull out the result of getChildRef into a variable and then use that. Then check for the same pattern elsewhere in the code to find similar bugs.
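For illustration, here is a minimal sketch of the "read it once" pattern suggested above. ChildRefSketch, its nested Edge class, the String[] pool and childIsPresent are all hypothetical stand-ins, not the real TreeNode/CappedPool code. Note that this only stops NULL_REF from reaching get; if another thread really is freeing edges, the copied reference could still point at a recycled node.

```java
// Hypothetical stand-ins for the real classes; only the read-once pattern matters here.
final class ChildRefSketch
{
  static final long NULL_REF = -1L;

  /** Minimal edge whose child reference can be cleared, e.g. by a free on another thread. */
  static final class Edge
  {
    private volatile long mChildRef;

    Edge(long xiChildRef)
    {
      mChildRef = xiChildRef;
    }

    long getChildRef()
    {
      return mChildRef;
    }

    void free()
    {
      mChildRef = NULL_REF;
    }
  }

  /** Pool lookup that, like the one in the stack above, cannot cope with NULL_REF. */
  static String get(String[] xiPool, long xiNodeRef)
  {
    if (xiNodeRef == NULL_REF)
    {
      throw new IllegalArgumentException("NULL_REF passed to get");
    }
    return xiPool[(int)xiNodeRef];
  }

  /** Reads the child reference exactly once and uses the local copy for every later step. */
  static boolean childIsPresent(String[] xiPool, Edge xiEdge)
  {
    long lChildRef = xiEdge.getChildRef();
    if (lChildRef == NULL_REF)
    {
      return false;
    }
    return get(xiPool, lChildRef) != null;
  }
}
```

Because the check and the lookup share one local value, a concurrent free between them can no longer turn the lookup into an index of -1.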
Only one thread should be manipulating the tree, so such a race condition should not be possible! If it is, then we have a bug in our threading and likely need to change things around (or add locking, which we'd rather avoid).
In particular all completion, trimming etc. is done on the main search thread for exactly this reason. Perhaps we can add some debug code that stores the thread id of the search thread in a tree member and asserts it matches the current thread in node/edge alloc/frees as a way to debug this...
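Something along these lines might do for the debug check. This is a rough sketch only: TreeOwnerSketch, allocNode, freeNode and getAllocCount are made-up names, and it assumes the tree is constructed on the search thread. It throws explicitly rather than using assert because assertions weren't enabled in the failing run; with -ea, a plain assert would serve the same purpose.

```java
// Hypothetical debug aid: remembers which thread owns the tree and rejects
// node alloc/free calls made from any other thread.
final class TreeOwnerSketch
{
  private final long mOwnerThreadId;
  private int mAllocCount = 0;

  TreeOwnerSketch()
  {
    // Assumption: the tree is created on the main search thread.
    mOwnerThreadId = Thread.currentThread().getId();
  }

  private void checkOwner(String xiOperation)
  {
    long lCurrentId = Thread.currentThread().getId();
    if (lCurrentId != mOwnerThreadId)
    {
      throw new IllegalStateException(xiOperation + " called from thread " + lCurrentId +
                                      " but the tree is owned by thread " + mOwnerThreadId);
    }
  }

  void allocNode()
  {
    checkOwner("allocNode");
    mAllocCount++;
  }

  void freeNode()
  {
    checkOwner("freeNode");
    mAllocCount--;
  }

  int getAllocCount()
  {
    return mAllocCount;
  }
}
```

If the check ever fires, the exception's stack trace shows which thread touched the tree and from where, which is the information we're missing at the moment.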
Yeah - I realised that whilst I was out at lunch.
However, unless the code has changed and the code I'm looking at doesn't match the stack, it must be what happened. The other functions are so simple that nothing else should have happened.
In the absence of a repro, not realistic to think that I'm going to get to the bottom of this before the competition. Also, since it has only been seen once, not especially important.
When playing this Reversi match...