Closed noprompt closed 5 years ago
@jimmyhmiller @xificurC I just applied a patch to make the performance of scan
, vscan
, and patterns which look similar e.g. [_ ... thing . _ ...]
perform better.
(time
(dotimes [_ 100000]
(r.match/find [1 2 3 4 5]
[_ ... 4 . _ ...]
:four)))
;; => :four
;; Old
"Elapsed time: 389.033882 msecs"
;; New
"Elapsed time: 61.076688 msecs"
This is still going to be slower than filter
, some
, etc. by an order of magnitude for these kinds of searches and it is still possible to improve this further by doing a bit more analysis. More to come! 😅
The
scan
andvscan
operators currently have no custom compilation. They are parsed as if one had written them asand
respectively. Thus they have no AST node of their own and are compiled through usual mechanisms for compiling seq and vector patterns. In this case, it results in less than optimal code for something which gives us enough information to do better.
Because we know
scan
andvscan
are looking for subsequences of a given length we can simply use lean on the Clojure standard library.For subsequences of length 1 we can use the target collection itself as the search space using
reduce
.For subsequences of length greater than one we could use
clojure.core/partition
for the search space.This will require the following:
scan
andvscan
like nodes with ascan-cat
function (below).op-search
.scan-cat
should look for something which resemblesand return
and
nil
otherwise. If we get a result then we can compile it usingop-search
/op-find
with the search spaces suggested. This should probably happen in the[true true]
clause of:prt
compilation.What's nice about this approach is that we can improve the "handwritten" versions as well.