Closed lucywang000 closed 3 years ago
Hi @lucywang000, thanks for your detailed report. I was able to figure out the bottleneck and come up with a solution, see the pull request GH-136. Here is the time
difference between what is currently on epsilon
and what is on the gh-135
branch.
;; Old
(time
(dotimes [n 10000]
(m/find data
{:people (m/scan {:name ?name :id ?id :inactive true})
:addresses (m/scan {:person-id ?id :as ?address})}
{:name ?name
:found 1
:address ?address})))
; ;=> "\"Elapsed time: 121.856909 msecs\"\n"
;; New
(time
(dotimes [n 10000]
(m/find data
{:people (m/scan {:name ?name :id ?id :inactive true})
:addresses (m/scan {:person-id ?id :as ?address})}
{:name ?name
:found 1
:address ?address})))
;; => "Elapsed time: 21.000418 msecs"
Both solutions find
{:name "Alice",
:found 1,
:address {:type :vacation, :person-id 2, :info ""}}
and, as you can see, the new solution does so far more quickly. At the moment I don't have the opportunity to benchmark this properly because I need to get to bed but I believe the results should be better. That being said, Meander is still slightly slower than the handwritten Clojure you submitted. Using time
again as a rough measuring tool:
(time
(dotimes [n 10000]
(let [{:keys [people addresses]} data]
(->> people
(filter :inactive)
(some (fn [{:keys [name id]}]
(when-let [address (some (fn [{:keys [person-id] :as address}]
(when (= person-id id)
address))
addresses)]
{:name name
:found 1
:address address})))))))
;; => "Elapsed time: 13.877456 msecs"
For this particular example, the numbers tend to favor Clojure by 5 to 10ms. The reason for this slight performance penalty has to do with map?
type checks on each element of the search space produced by m/scan
e.g. every value in :people
and :addresses
. When I macroexpand the code and delete those type checks I get equivalent and occasionally better metrics than the handwritten example, about 1 to 2ms.
Any idea why such a huge difference?
The runtime search space produced and traversed for m/scan
patterns was doing more work than it needed to. Fixing this problem has been on my personal todo list but I haven't felt any pressure to make it a priority until today. So, I'm actually glad you submitted this ticket! 😄
If everything looks in order, and no one has any complaints or objections, I will merge the patch tomorrow and make a new release of the library.
Thanks @noprompt ! It's really awesome to see this performance boost.
When I macroexpand the code and delete those type checks I get equivalent and occasionally better metrics than the handwritten example, about 1 to 2ms.
Bravo!
@lucywang000 Not a problem. I'm going to close this ticket for now.
I think a good follow up to this would be to allow for a sort of "unsafe" notation which tells the compiler to skip a type check at runtime.
Hey, thanks for creating meander! I am planning to use meander in a recent project to replace hand crafted map/filter code, which I think would be make the code much better to read and reason about.
However when I'm doing some simple perf testing, it seems hand-crafted code could beat meander by 5-10x.
Any idea why such a huge difference? My understanding is that meander need to compile the LHS so it adds some overhead. But when I increase the data to dwarf the compilation overhead, the hand crafted code grows from 450ns to 500ns (10%), but the meander solution grows from 3us to 6us (2x).
My best guess is
m/scan
actually collects all the results, whilem/find
is just a syntax suger for(first (m/search (..))
. But this still could not explain the 5x difference in the original example. E.g. If it's a query on a hot path, is there any way to pre-compile the LHS part somehow?