egraphs-good / egglog

egraphs + datalog!
https://egraphs-good.github.io/egglog/
MIT License
400 stars 45 forks source link

Inefficient desugaring for `rewrite` #350

Open ezrosent opened 6 months ago

ezrosent commented 6 months ago

We desugar code like (rewrite (Add x y) (Add y x)) into something like:

(rule ((= e (Add x y))) ((union (Add y x) e)))

But using union is only needed if the right-hand side of a rewrite is an identifier. When it's an expression, we could do:

(rule ((= e (Add x y))) ((set (Add y x) e)))

This should be safe, but it gives egglog less work to do: if (Add y x) is in the database already, setting it will call union anyway as part of the merge process. If it isn't we skip creating a new noncanonical id that rebuilding has to clean up.

On the following program, this is a significant speedup:

(datatype Num (N i64) (Add Num Num))
; instead of (rewrite (Add a b) (Add b a))
(rule ((= e (Add a b))) ((set (Add b a) e)))
; instead of (rewrite (Add a (Add b c)) (Add (Add a b) c))
(rule ((= e (Add a (Add b c)))) ((set (Add (Add a b) c) e)))

(let _lit0 (N 0))
(let _lit1 (N 1))
(let _lit2 (N 2))
(let _lit3 (N 3))
(let _lit4 (N 4))
(let _lit5 (N 5))
(let _lit6 (N 6))
(let _lit7 (N 7))
(let _lit8 (N 8))
(let _lit9 (N 9))
(let _lit10 (N 10))
(let _lit11 (N 11))
(let _v0 (Add _lit0 _lit1))
(let _v1 (Add _lit2 _v0))
(let _v2 (Add _lit3 _v1))
(let _v3 (Add _lit4 _v2))
(let _v4 (Add _lit5 _v3))
(let _v5 (Add _lit6 _v4))
(let _v6 (Add _lit7 _v5))
(let _v7 (Add _lit8 _v6))
(let _v8 (Add _lit9 _v7))
(let _v9 (Add _lit10 _v8))
(let _v10 (Add _lit11 _v9))
(let _r0 (Add _lit11 _lit10))
(let _r1 (Add _r0 _lit9))
(let _r2 (Add _r1 _lit8))
(let _r3 (Add _r2 _lit7))
(let _r4 (Add _r3 _lit6))
(let _r5 (Add _r4 _lit5))
(let _r6 (Add _r5 _lit4))
(let _r7 (Add _r6 _lit3))
(let _r8 (Add _r7 _lit2))
(let _r9 (Add _r8 _lit1))
(let _r10 (Add _r9 _lit0))
(run 10000)
(check (= _r10 _v10))

This program takes 22.5 seconds to saturate on my machine. The usual rules for associativity and commutativity take 29s.

oflatt commented 6 months ago

Unfortunately using set on datatypes is going to be disallowed with the semantics we decided on recently. Perhaps query compilation can recognize this optimization later in the pipeline?

ezrosent commented 6 months ago

I thought set would still be allowed, we just don't get lookups after a set?

oflatt commented 6 months ago

set should be allowed for functions with custom merge functions, but constructors like Add should not be set, only unioned