Improve BFS trace selection efficiency

moralismercatus commented 9 years ago

Every time a trace is requested, the BFS algorithm starts from node 0 and traverses downward. It's inefficient to start from node 0 every time.

One improvement is to start from the last node that was searched. As new traces are inserted into the graph that may be closer to the root, that case must be accounted for.

Other improvements include only doing a BFS once every so many iterations and making a queue of traces. Or, continuously having a thread do BFS and only refresh the queue when the search is done, so as to not make the selection algorithm wait.

moralismercatus commented 9 years ago

Be careful with vanilla BFS. Whatever you make the starting vertex, it will treat that vertex as the root, so it won't "continue" BFS as if you were starting where you left off.

It's likely you'd have to implement your own BFS algorithm that does "continue" from a particular vertex after each iteration and insertion. I believe this can be done via making the "color" a member of VertexProperty. Thus, the color would be perserved over consecutive iterations.

moralismercatus commented 9 years ago

I'm also concerned about the algorithm itself (perhaps this belongs in another ticket).

It's not "true" BFS algorithm because its reliance on depth information is incorrect. Normal BFS relies on two attributes. One is the depth from the root (as normal BFS) and the other is when the node has been visited.

Our BFS algorithm relies on three attributes. Depth from root, visitation, and execution. It's the fact that there are nodes in between branches that throws the algorithm off, I think. If every node in the graph represented a branch, then it would be fine, I think. But as the depth from the root is offset by these intermediary nodes, upon observation, I believe we get stuck in the very kind of situations this was suppose to avoid.

For example, say very early in execution, the is a loop. Any other branches are much further down. Thus, the branches from this loop will continue to be selected until their depth finally competes with those other branches not in the loop. Isn't this the definition of a "local minimum" - the very thing I was trying to avoid?

It seems that there should be, conceptually, two graphs. One containing all nodes to uniquely identify a trace, and another with only the branching nodes. That is, construct the graph as currently, then go through and remove all intermediary nodes between branching nodes. This graph can then be used to do the BFS on.

Of course, the above algorithm may not be the most efficient. With a custom BFS algorithm, a single graph could be used, but the algorithm could be trained to somehow ignore intermediary nodes.

moralismercatus commented 9 years ago

Note about previous comment. This issue has been resolved. Nodes between branches are ignored during BFS.

moralismercatus / crete

Improve BFS trace selection efficiency #93