phate / jlm

GNU Lesser General Public License v2.1
44 stars 14 forks source link

Perf: Avoid visiting nodes for a "second" time multiple times in TarjanSCC #484

Closed haved closed 1 month ago

haved commented 1 month ago

This does not affect correctness, but should boost performance by avoiding unnecessary visits of nodes that are already parts of finished SCCs.

When implementing DFS using a stack, I had forgotten that the same node can be added multiple times before it actually gets visited. In a graph like

A -------> B
 \         ^
  --> C --/

Visiting node A will push B and C. Since we then visit C first, it will also have to push B to process B before C can be popped. The edge A->B should then be ignored when popping B the second time.

haved commented 1 month ago

The change you suggest is what I did first, but I still need to pop the element. One alternative would be to break it into if/elseif/else, which might be nicer.

Regarding unit testing it, it can be done with a custom callback, that checks that each node is only queried about its successor twice. I can absolutely do that

haved commented 1 month ago

I added a test and changed the code to avoid continue