noprompt / meander

Tools for transparent data transformation
MIT License
921 stars 55 forks source link

Improve performance of scan patterns #136

Closed noprompt closed 3 years ago

noprompt commented 3 years ago

This patch improves match performance for patterns of the form

_ ... p_1 ,,, p_n . _ ...

by introducing an AST rewrite rule to detect and replace them with a new type of node which can be compiled to more efficient code. This appears in meander.match.syntax.epsilon/expand-prt in Clojure and is semantically equivalent to the following Meander rewrite expression

    (m/rewrite
      {:tag :prt
       :left {:tag :drp}
       :right {:tag :prt
               :left {:tag :cat, :as ?cat}
               :right {:tag :drp}}}
      {:tag :meander.match.syntax.epsilon/subsequence
       :cat ?cat})

The performance increase for patterns of the form

    _ ... p . _ ...

see dramatic performance improvement as they no longer use meander.util.epsilon/partitions as the search space and, instead, simply use the input sequence itself. Thus, the resulting code is much more like a typical filter.