Why is fast-greedy-dag worse in terms of dag size?

oflatt commented 10 months ago

Does anyone know why faster-greedy-dag is less good than greedy-dag?

cumulative dag cost for faster-greedy-dag: 77030
cumulative dag cost for greedy-dag: 76907

oflatt commented 10 months ago

@TrevorHansen I've been reading it over and it's really neat- reminds me of how congruence closure keeps track of parent pointers in a traditional implementation

oflatt commented 10 months ago

One possibility is that the order of traversing the egraph matters because it is greedy, and so the algorithm happens to work less well

Edit: I am guessing this is what is happening! faster-greedy-dag considers strictly fewer programs than greedy-dag does, since it traverses bottom up. In this sense, it is slightly more greedy.

This is the first evidence I've ever seen that greedy dag leaves performance on the table.

TrevorHansen commented 9 months ago

Hi @oflatt

I don't have a good intuition for the extractors yet. My understanding is that we'll get optimal DAG extraction when we're extracting from trees with the greedy-dag algorithms. But for DAGs and cyclic graphs the greedy approaches won't necessarily be optimal. The integer-linear-programming extractor (in #16) is the only one I've seen that will produce optimal DAG extractions given enough time. I wrote some background about how I understand it in: https://github.com/egraphs-good/extraction-gym/pull/16#issuecomment-1800348247

I need to make some small examples and experiment with the extractors before I can say any of this with confidence though.

Regards the faster-greedy-dag and greedy-dag returning different results. I guessed that they sometimes get to different non-optimal fixed-points, so give different results. When I look at the output, sometimes one is better, sometimes the other is better:

###################################################
faster-greedy-dag vs greedy-dag

extractors: ['faster-greedy-dag', 'greedy-dag']
data/babble/list_list_hard_test_ellisk_2019-02-15T11.26.41--bench007_it10.json  differs in dag cost:  418 419
data/babble/list_list_hard_test_ellisk_2019-02-15T11.26.41--bench010_it15.json  differs in dag cost:  426 425
data/babble/physics_scientific_unsolved_4h_ellisk_2019-07-20T18.05.46--bench003_it3.json  differs in dag cost:  172 171
data/babble/list_list_hard_test_ellisk_2019-02-15T11.43.28--bench011_it19.json  differs in dag cost:  363 364
data/babble/physics_scientific_unsolved_4h_ellisk_2019-07-20T18.13.12--bench001_it1.json  differs in dag cost:  145 144
data/babble/physics_scientific_unsolved_4h_ellisk_2019-07-20T18.20.13--bench001_it1.json  differs in dag cost:  156 155
data/babble/list_list_hard_test_ellisk_2019-02-15T11.43.28--bench007_it7.json  differs in dag cost:  299 298
data/babble/physics_scientific_unsolved_4h_ellisk_2019-07-20T18.09.34--bench000_it0.json  differs in dag cost:  91 89
data/babble/list_list_hard_test_ellisk_2019-02-15T11.43.28--bench005_it5.json  differs in dag cost:  232 231
data/babble/list_list_hard_test_ellisk_2019-02-15T11.43.28--bench010_it17.json  differs in dag cost:  380 381
data/babble/list_list_hard_test_ellisk_2019-02-15T11.31.39--bench010_it16.json  differs in dag cost:  346 345
data/babble/list_list_hard_test_ellisk_2019-02-15T11.43.28--bench008_it8.json  differs in dag cost:  329 330
data/babble/list_list_hard_test_ellisk_2019-02-15T11.39.19--bench004_it5.json  differs in dag cost:  234 235
data/babble/list_list_hard_test_ellisk_2019-02-15T11.31.39--bench006_it6.json  differs in dag cost:  247 246
data/babble/list_list_hard_test_ellisk_2019-02-15T11.43.28--bench009_it13.json  differs in dag cost:  392 393
data/tensat/nasneta_acyclic.json  differs in dag cost:  16.301627096203447 16.365816096737035
data/tensat/nasneta.json  differs in dag cost:  16.322293040515433 16.374893040569077
data/tensat/nasrnn.json  differs in tree cost:  1044.7145473615237 930.8220652824966
data/tensat/nasrnn.json  differs in dag cost:  0.9633390190010687 0.9633100190003461
data/egg/math_simplify_add.json  differs in tree cost:  3 7
data/egg/math_diff_ln.json  differs in tree cost:  3 4
data/rover/box_filter_3iteration_egraph.json  differs in dag cost:  1819 1701
data/rover/fir_8_tap_5iteration_egraph.json  differs in tree cost:  5936 5965
data/rover/fir_8_tap_8iteration_egraph.json  differs in tree cost:  5936 5965
data/rover/fir_8_tap_7iteration_egraph.json  differs in tree cost:  5936 5965
cumulative time for faster-greedy-dag: 3229ms
cumulative time for greedy-dag: 37459ms
cumulative tree cost for faster-greedy-dag: 32017750409008
cumulative tree cost for greedy-dag: 32017750408986
cumulative dag cost for faster-greedy-dag: 77029
cumulative dag cost for greedy-dag: 76907
Cumulative time for faster-greedy-dag: 3229ms
Cumulative time for greedy-dag: 37459ms
faster-greedy-dag / greedy-dag
geo mean
tree: 0.9960
dag: 1.0004
micros: 0.7220
quantiles
tree:   0.4286, 1.0000, 1.0000, 1.0000, 1.1224
dag:    0.9957, 1.0000, 1.0000, 1.0000, 1.0694
micros: 0.0210, 0.5425, 0.9891, 1.0744, 2.7667

###################################################

oflatt commented 9 months ago

Thanks for your answer! I think you are right. It would be cool if we could write a heavier-weight algorithm that tries more combinations! Something that isn't as slow as ILP but finds better solutions than greedy-dag

Bastacyclop commented 9 months ago

I played around with some kind of Dijkstra-inspired implementation a while back: https://github.com/Bastacyclop/egg/blob/dag-extract/src/dag_extract.rs

On small examples it gets better results than the ILP extraction, which is not optimal because it greedily removes cycles (https://github.com/egraphs-good/egg/issues/207, https://github.com/egraphs-good/extraction-gym/pull/7), and I think also gives up by returning pretty bad approximations sometimes? However on medium-big examples the complexity of my code blew up.

Although Dijkstra is O(n²) / O(nlogn), with my problem mapping the graph explored has a node for every member of the powerset of eclasses, so everything turns into a nasty exponential complexity. I think it would be really cool to push this idea further by adding optimizations like branch and bound, using better data structures in the implementation, digging more into other graph algorithms, and maybe even borrowing some optimizations from ILP solvers.

TrevorHansen commented 9 months ago

I think it's a good idea to explore heuristic/non-optimal extractors. I suspect that for problems that we care about, even with lots of work we won't be able to get the optimal dag-extractors fast enough to be practical. For me though, that work will come after the optimal dag-cost extraction is improved.

Going back to what I wrote before - I've realised I was confused. I now think that the bottom-up extractors give an optimal extraction for tree-cost, and the ILP-based extractors (in #30 , #16) - when they're set with infinite timeout, give an optimal extraction for dag-cost.

In the short term, I'm focused on getting #16 working properly. It's much faster than #30 already, I'd guess >10x, with a few more options still to speed it up.

@Bastacyclop are the egraphs you'd like dag-cost extractions for in #18? If so, the hold up on merging looks like it's just providing a readme.md. I can do it if you'd like?

Bastacyclop commented 9 months ago

@Bastacyclop are the egraphs you'd like dag-cost extractions for in #18? If so, the hold up on merging looks like it's just providing a readme.md. I can do it if you'd like?

Yes, at least a sample. I added the README.

TrevorHansen commented 9 months ago

Yes, at least a sample...

Thanks. I took a quick look. In case you're still working on this, the sample flexc egraphs now seem easy for the updated extractors:

extractors: ['faster-greedy-dag', 'faster-ilp-cbc']
cumulative time for faster-greedy-dag: 258ms
cumulative time for faster-ilp-cbc: 1452ms
cumulative tree cost for faster-greedy-dag: 4508
cumulative tree cost for faster-ilp-cbc: 4565
cumulative dag cost for faster-greedy-dag: 1181
cumulative dag cost for faster-ilp-cbc: 1181
faster-greedy-dag / faster-ilp-cbc
geo mean
tree: 0.9789
dag: 1.0000
micros: 0.2219
quantiles
tree:   0.9485, 0.9485, 0.9856, 1.0000, 1.0000
dag:    1.0000, 1.0000, 1.0000, 1.0000, 1.0000
micros: 0.0509, 0.1380, 0.2159, 0.3596, 0.6663

So we get optimal extractions from the faster-ilp-cbc-timeout extractor in about 110ms per egraph.

egraphs-good / extraction-gym

Why is fast-greedy-dag worse in terms of dag size? #19