noprompt / meander

Tools for transparent data transformation
MIT License
918 stars 55 forks source link

Performance issue with compiling a large Maybe pattern #234

Closed xiongtx closed 2 years ago

xiongtx commented 2 years ago

Problem

When matching maps with keys whose value could be nil or maps themselves, we use the Maybe pattern. E.g.

(defn extract-optional-simple
  [m]
  (m/match m
    {:a (m/or (m/and nil ?a1) {:a1 ?a1})} ?a1))

When dealing w/ a large number of optional values, however, performance of compiling the function degrades rapidly. The following takes nearly 10s to evaluate on my MacBook Pro:

(defn extract-optional-bad
  [m]
  (m/match m
    {:a (m/or (m/and nil ?a1) {:a1 ?a1})
     :b (m/or (m/and nil ?b1) {:b1 ?b1})
     :c (m/or (m/and nil ?c1) {:c1 ?c1})
     :d (m/or (m/and nil ?d1) {:d1 ?d1})
     :e (m/or (m/and nil ?e1) {:e1 ?e1})
     :f (m/or (m/and nil ?f1) {:f1 ?f1})
     :g (m/or (m/and nil ?g1) {:g1 ?g1})
     :h (m/or (m/and nil ?h1) {:h1 ?h1})
     :i (m/or (m/and nil ?i1) {:i1 ?i1})
     :j (m/or (m/and nil ?j1) {:j1 ?j1})}
    [?a1 ?b1 ?c1 ?d1 ?e1 ?f1 ?g1 ?h1 ?i1 ?j1]))

The performance of the function itself is fine, it just take a long time to compile.

References

noprompt commented 2 years ago

@xiongtx Thanks for the report. I am sorry this takes a while to compile. This is totally due to how m/or is compiled. This is likely generating a lot of code. Even when I try to macro expand the code, Cider times out.

Alternatively, you could try the following.

(defn extract-optional-bad
  [m]
  (vec
   (m/search m
     {:a (m/or (m/and nil ?a1) {:a1 ?a1})} ?a1
     {:b (m/or (m/and nil ?b1) {:b1 ?b1})} ?b1
     {:c (m/or (m/and nil ?c1) {:c1 ?c1})} ?c1
     {:d (m/or (m/and nil ?d1) {:d1 ?d1})} ?d1
     {:e (m/or (m/and nil ?e1) {:e1 ?e1})} ?e1
     {:f (m/or (m/and nil ?f1) {:f1 ?f1})} ?f1
     {:g (m/or (m/and nil ?g1) {:g1 ?g1})} ?g1
     {:h (m/or (m/and nil ?h1) {:h1 ?h1})} ?h1
     {:i (m/or (m/and nil ?i1) {:i1 ?i1})} ?i1
     {:j (m/or (m/and nil ?j1) {:j1 ?j1})} ?j1)))

This has the same semantics, compiles more quickly, and generates far less code.

xiongtx commented 2 years ago

Cool, per discussion on Clojurians I think the main effort should be on meander.zeta, so I'll close this.