When our algorithm takes a step, it may update more than a single cell in the costmap. If we want to compare the cost of n branches m steps ahead, we must do so in parallel. The brute-force solution would be to keep a separate copy of the costmap for every possible sequence of actions, taking O(n^m) memory. However, if we can backtrack our sequence of actions up to the current state vertex, we only need m costmaps (one per lookahead level), allowing for a depth-first score calculation.
Furthermore, if the inverse of our exploration algorithm were easy to implement, we could avoid even the O(m) memory cost by "playing back" previous actions from our exploration graph in reverse. Unfortunately, for any nontrivial algorithm that updates more than a single cell, we might as well implement a symbolic manipulation engine to do the job for us. If memory becomes an issue, we can deal with it later by simply using fewer lookahead steps.
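A practical middle ground is to log the old value of each cell an action touches, so a step can be undone without inverting the algorithm itself. Memory then scales with the number of changed cells per step times the lookahead depth, rather than with full costmap copies. A minimal sketch, assuming a dense grid of floats (`CostMap`, `apply_step`, and `undo` are illustrative names, not an existing API):

```python
class CostMap:
    """A grid of costs with an undo log, one log entry per step."""

    def __init__(self, num_rows, num_cols):
        self.cells = [[0.0] * num_cols for _ in range(num_rows)]
        self._log = []  # each entry: list of (row, col, old_value)

    def apply_step(self, updates):
        """Apply one step's cell updates, remembering the old values."""
        changed = []
        for (r, c, new_value) in updates:
            changed.append((r, c, self.cells[r][c]))
            self.cells[r][c] = new_value
        self._log.append(changed)

    def undo(self, n=1):
        """Roll back the last n steps, most recent first."""
        for _ in range(n):
            for (r, c, old_value) in reversed(self._log.pop()):
                self.cells[r][c] = old_value
```

Because each undo only touches the cells that actually changed, backing out of a branch costs the same as entering it, which is exactly what a depth-first traversal needs.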
Our code may then look something like this:
DecisionGraph g
CostMap cm(sourceFile, numRows, numCols, numLookahead)
bestNode = (None, -infinity)  # (node, score)

# node: decision node to expand
# totalScore: running total score along the current path
# cm: pointer to the shared CostMap
# depth: recursion depth
define calcScore(node, totalScore, cm, depth):
    node.score = totalScore + someAlg(node, cm)  # This will update cm.
    numChildren = 0
    # Update best node
    if node.score > bestNode[1]:
        bestNode = (node, node.score)
    # Expand all children until the lookahead horizon
    if depth < maxDepth:
        for child in node.children:
            numChildren += 1 + calcScore(child, node.score, cm, depth + 1)
            cm.undo(1)  # Undo the child's costmap update before the next sibling.
    return numChildren

define findBestPath():
    g.add(startNode)
    currentNode = startNode
    score = 0
    while calcScore(currentNode, score, cm, 0) > 0:
        pruneNode(currentNode, bestNode[0])  # Prune all branches except the one through bestNode.
        currentNode = bestNode[0]
        score = bestNode[1]  # Continue the running total from the committed node.
    return score

print findBestPath()
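To make the depth-first scoring concrete, here is a runnable sketch of the recursive part with stub pieces: the costmap is a dict with an undo log, and someAlg is replaced by a toy scoring function. All names (`Node`, `UndoableCostMap`, `some_alg`) are illustrative stand-ins, not the real implementation:

```python
import math

class Node:
    def __init__(self, name, gain, children=None):
        self.name = name
        self.gain = gain              # stand-in for whatever someAlg derives
        self.children = children or []
        self.score = 0.0

class UndoableCostMap:
    def __init__(self):
        self.cells = {}
        self._log = []                # one (key, old_value) entry per step

    def apply(self, key, value):
        self._log.append((key, self.cells.get(key)))
        self.cells[key] = value

    def undo(self, n=1):
        for _ in range(n):
            key, old = self._log.pop()
            if old is None:
                del self.cells[key]
            else:
                self.cells[key] = old

def some_alg(node, cm):
    """Toy stand-in: record the visit in the costmap, return a gain."""
    cm.apply(node.name, node.gain)
    return node.gain

def calc_score(node, total_score, cm, depth, max_depth, best):
    node.score = total_score + some_alg(node, cm)  # updates cm by one step
    if node.score > best[1]:
        best[0], best[1] = node, node.score
    num_children = 0
    if depth < max_depth:
        for child in node.children:
            num_children += 1 + calc_score(child, node.score, cm,
                                           depth + 1, max_depth, best)
            cm.undo(1)  # back out the child's update before trying a sibling
    return num_children

# A 2-step lookahead over a small tree: the best path is a -> c (score 1 + 5).
root = Node("a", 1, [Node("b", 2, [Node("d", 1)]), Node("c", 5)])
best = [None, -math.inf]
cm = UndoableCostMap()
calc_score(root, 0, cm, 0, max_depth=2, best=best)
print(best[0].name, best[1])  # prints: c 6
```

Note that after the call returns, the costmap holds only the root's own update; every child branch has been backed out, which is what lets the next iteration of the outer search reuse the same costmap.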