Restructure flow algorithms

Chillee commented 5 years ago

There's currently 3 flow implementations that (more or less) do similar things. There's HopcroftKarp, EdmondsKarp, PushRelabel, DFSMatching. There's also a 5th commonly used implementation (that KACTL doesn't include): Dinic's, although there is an issue for it #19 .

I've done a lot of benchmarking of flow algorithms, so let's break down these implementations.

DFSMatching: Runs in O(EV) time. Very short. Honestly, I think this is pretty useless nowadays. Maybe it used to be useful in the past, but nowadays Hopcroft-Karp is essentially expected for most matching problems.

HopcroftKarp: Only works on Bipartite Graphs, runs in O(\sqrt{E}V) time. Typically runs very fast (my guess is about 3x faster than Dinic's) on those bipartite graphs. Decently long.

EdmondsKarp: Runs in O(VE^2) time. Strictly inferior to Dinic's in runtime. Fairly short.

Dinic's: Runs in O(V^2E) time, O(VE log U) with scaling. Has some special guarantees: O(sqrt(E)*V) on Bipartite graphs, and O(min(sqrt(V),E^(2/3))*E) on unit graphs.

HLPP: Runs in O(V^2sqrt(E)) time. Runs the fastest in practice on most graphs. Current KACTL version is missing global relabeling optimization, which makes a huge difference (~2x on Chinese flow problem, allows for 0.15 on SPOJ MATCHING).

If we look only at complexities, we see that a combination of Dinic's + HLPP dominates everything else. Provable guarantees are nice, but flow problems are also notoriously difficult to come up with worst cases for. I have a HLPP implementation that beats out my Dinic's on all the test cases I've tried it on (and also has/tied for the #1 spot on SPOJ FASTFLOW, VNSPOJ FASTFLOW, and the chinese flow problem). It also performs about equal to KACTL's Hopcroft-Karp on MATCHING. So, it seems that in practice, that HLPP implementation would probably perform the best.

Thus, I propose that we do a couple things:

Add global relabeling to the current HLPP implementation. This will give us the fastest possible flow implementation in general.
Add Dinic's. This gives us a bunch of nice provable guarantees on certain kinds of graphs, a much more understandable/traditional augmenting-paths flow algorithm, with even stronger guarantees if scaling is added.
With these two, we have both practical efficiency and theoretical guarantees covered, making HopcroftKarp and EdmondsKarp redundant. We can then remove them.
Also remove DFSMatching since I think it's pretty useless.

This leaves us with 2 implementations: HLPP to be insanely fast in practice, and Dinic's to have good guarantees on certain kinds of graphs.

See https://codeforces.com/blog/entry/66006?#comment-500365 for some discussion about HLPP optimizations.

simonlindholm commented 5 years ago

EdmondsKarp: Runs in O(VE^2) time. Strictly inferior to Dinic's in runtime. Fairly short.

Yeah, the only reason this exists is code size. I've seen people at KTH use quite a lot. Not sure we should remove it if everything else is longer, though its slowness does scare me a bit every time I use it.

Add Dinic's. This gives us a bunch of nice provable guarantees on certain kinds of graphs, a much more understandable/traditional augmenting-paths flow algorithm, with even stronger guarantees if scaling is added.

I think we should probably do this, though it's annoying that it doesn't actually help in practice... (until it does, I guess.) Does "(my guess is about 3x faster?)" refer to comparison with Dinic's? Can we speed the Dinic's up? How does code size compare?

Also remove DFSMatching since I think it's pretty useless.

We shouldn't:

it's short
it's easier to use than general flow algorithms
it's sufficed for most matching problems that I've encountered
the particular way in which it creates a matching can be exploited, either for incremental matching (#33, https://code.google.com/codejam/contest/6314486/dashboard#s=p0 ), or for max-weight matching where w[i, j] = W[i] for all edges (i, j) (sort by decreasing W, do matching in that order).

Add global relabeling to the current HLPP implementation. This will give us the fastest possible flow implementation in general.

I skipped this before because it didn't make a big difference in my tests (maybe I got the heuristic wrong, or I my test cases were bad, I don't remember any details), and it added quite a bit of code size (while the gap heuristic only added 4 lines). But we could consider it...

The linked discussion is interesting, I certainly need to check out dacin21's mentioned bug with isolated source node...

Chillee commented 5 years ago

I skipped this before because it didn't make a big difference in my tests (maybe I got the heuristic wrong, or I my test cases were bad, I don't remember any details), and it added quite a bit of code size (while the gap heuristic only added 4 lines). But we could consider it...

Perhaps... I'll try and benchmark it.

Yeah, the only reason this exists is code size. I've seen people at KTH use quite a lot. Not sure we should remove it if everything else is longer, though its slowness does scare me a bit every time I use it.

I think Dinic's can be written in not significantly more code. I suspect people gravitate towards this one because the next step up is 15 lines and double the number of characters.

Also remove DFSMatching since I think it's pretty useless.

I suppose that's fair... perhaps we could write about how it can be extended to those problems you mentioned?

simonlindholm commented 5 years ago

I suppose that's fair... perhaps we could write about how it can be extended to those problems you mentioned?

We should probably do something about incremental matching (see linked issue), but for the sort-then-match-in-that-order technique I don't think that warrants mention in the description, since it doesn't require modification of the code, just the caller. (And there isn't space to KACTL teach you everything there is to know about matching/flow.)

kth-competitive-programming / kactl

Restructure flow algorithms #75