alt-romes / hegg

Fast equality saturation in Haskell
https://hackage.haskell.org/package/hegg
BSD 3-Clause "New" or "Revised" License
75 stars 8 forks source link

Simple expression tree with just a few rules enters an infinite loop #17

Closed folivetti closed 2 years ago

folivetti commented 2 years ago

So I have implemented this minimal code that, when run using the latest hegg commit (and also the hackage version), hangs. It seems to create an infinite loop in the function compileToQuery (https://github.com/alt-romes/hegg/blob/master/src/Data/Equality/Matching.hs#L104) but I couldn't debug much after that. Here's the code:

https://gist.github.com/folivetti/d1c01dd5c34727b0250914406559fd23

What I could actually print from this is a sequence of rules that are triggered, oddly enough I couldn't see how some of them are matched:

https://gist.github.com/folivetti/4847faa72e48e50e0b004949c2790356

this list was generated by adding a trace at https://github.com/alt-romes/hegg/blob/master/src/Data/Equality/Saturation.hs#L161

trace (if (not.null) matches' then show rw_id else "") (map (lhs := rhs,) matches', newStats)

I'm really clueless on why that happens.

folivetti commented 2 years ago

quick question: are the rewrite rules bidirectional?

So if I have the rule "x" + 0 := "x" would it match both "x" + 0 and "x" and replace them with "x" and "x" + 0, respectivelly?

alt-romes commented 2 years ago

quick question: are the rewrite rules bidirectional?

So if I have the rule "x" + 0 := "x"

would it match both "x" + 0 and "x" and replace them with "x" and "x" + 0, respectivelly?

Quick reply: they aren't.

Though a wrapper for bidirectional rules should be simple enough (they desugar to two unidirectional rules)

I'll take a look at the rest later when possible :-)

alt-romes commented 2 years ago

Okay, @folivetti

I believe the infinite loop is due to something called explosive rules that leads to "exponential rules".

This is better described in https://docs.rs/egg/0.6.0/egg/struct.BackoffScheduler.html, a scheduler in the rust implementation that prevents explosive rules. We also have got one with the same name -- BackoffScheduler.

A RewriteScheduler that implements exponentional rule backoff.

For each rewrite, there exists a configurable initial match limit. If a rewrite search yield more than this limit, then we ban this rule for number of iterations, double its limit, and double the time it will be banned next time.

This seems effective at preventing explosive rules like associativity from taking an unfair amount of resources.

I found that by slightly tweaking the parameters to the scheduler I got your example to terminate.

Previously these parameters were hardcoded, but I've pushed a change to master https://github.com/alt-romes/hegg/commit/b76052679528a7015bfc7cb0c13b8ec5d5f6499a that now allows backoff scheduler parameters to be tweaked.

I've added your example as a unit test in which I configure the parameters in order for the test to terminate (fast). You can check it out here: https://github.com/alt-romes/hegg/blob/b76052679528a7015bfc7cb0c13b8ec5d5f6499a/test/T1.hs#L141


I think you have a combination of rules that provokes this exponential rule usage, in particular the way the associative division rule you added interacts with the others (you might notice that in test/Sym.hs the set of rules doesn't include that one however it's still derivable from the ones I do include).

As a sign off note on this issue I'd love to hear what @mwillsey has to add to what I've said. He's definitely run into more of these exponential rule issues and investigated them more deeply, he might also be able to identify it more easily here. This bit is quite complicated and I'm sure I haven't got it down 100%.

We can then close this issue!

mwillsey commented 2 years ago

Both actually infinite loops and "just" till the heat death of the universe loops can occur in equality saturation. Associativity and commutativity are prone to the latter, and that is indeed the motivation for BackoffScheduler (although it can sometimes be more trouble than it's worth; more work needs to be done here!).

See here for a previous insightful discussion: https://github.com/egraphs-good/egg/discussions/60

folivetti commented 2 years ago

Thank you very much for the prompt solution and detailed explanation :)

Back to study the code :)