Tarmean commented 6 months ago

I have been playing with alternative e-matching implementations and tried comparing my versions with the existing matching. Largely the differences are fairly minor, but dumping all rule triggers for the test suite showed some interesting differences:

Currently the E-Matching builds a database from the hashcons structure. In test/Sym.hs the analysis removes other E-Nodes if their E-Class has a constant representation, but the hashcons still contains those E-Nodes so the rules still trigger on them
Neither the hashcons structure nor the class node-lists are canonicalized/deduplicated. Afaict the only time the E-Graph deduplicates is when nodes are reinserted because a child class merged, and only if the reinserts happen during the same worklist step.
E-Matching can multiply these duplicates out. This doesn't affect performance, but mostly because the scheduler counts these duplicates as extra matches and blocks the affected rules earlier.

Not sure if those are critical issues, but the scheduler blocking extra rules makes performance comparisons tricky. And disabling the scheduler seems to cause occasional exponential slowdowns for the existing genericJoin, I haven't figured out if its something about the join algorithm or if the duplicates actually cause exponentially more results.

alt-romes commented 6 months ago

Thanks for reporting this careful analysis @Tarmean.

I am wary of these exponential slowdowns, but it would be very good to have a minimal reproducer example to investigate further. @folivetti may be running into this, but will hopefully let me know if in his investigation it turns out to be a problem with the system (and not with hegg).

folivetti commented 6 months ago

edit: never mind again, this works without my height calculation from the other issue. Sorry about the noise.

Tarmean commented 5 months ago

I've been trying two things, so I'm gonna split this into two parts in the hopes it remains somewhat comprehensible:

Testing performance

It's not properly self-contained but I have been using the test cases as benchmarks. Here is a hacked-together version. https://github.com/alt-romes/hegg/compare/master...Tarmean:hegg:exponential_example

This limits the iterations to 8, and disables scheduling.

When I run test.exe -p i3:

When calling normClasses before rebuild the test finishes after ~53s
When not calling normClasses, it OOMs after ~13 minutes

But normClasses is 30% slower for smaller rewrites! A workaround would to count merges, and only clean up if there have been >n merges since the last time. But maybe it is simpler to leave it be since it only becomes relevant with big examples like 50s of non-scheduled rewriting?

General slowdown

Maybe there is a second thing going on. Even with normClasses the cleanup the test is 1s for 6 iterations, 4s for 7 iterations, and 53s for 8 iterations.

My understanding of the current algorithm is roughly:

for var_1 in gather_var("a", {}):
    for var_2 in gather_var("b", {"a": var_1}):
           for var_3 in gather_var("c", {"a": var_1, "b": var_2}):
                ....

Where gather_var loops over the database and finds variable assignments which are possible for all pattern constraints with the existing substitutions. If the DB size increases, this slows down gather_var. If gather_var returns more items, the nested gather_var is called more frequently. So if they scale together this maybe causes quadratic behavior?

Tbh I wasn't smart enough to fully understand the explanation in the relational e-matching paper. I may be totally off with my understanding.

I found the implementation strategy of this paper interesting. It can freely move between generic join and hash join. But it builds an index for each atom in the query! Even when sharing indexes with compatible variable orderings across all rewrite rules the index building could dominate the runtime for small databases. I have an experimental implementation of this here but the code is in serious need of a cleanup https://github.com/Tarmean/hegg/blob/master/src/Data/Equality/Compiler/QueryExecution.hs#L73 .

alt-romes / hegg

E-Matching Non-Canonical Nodes #34

Testing performance

General slowdown