Open joinr opened 4 years ago
Thinking about this and #9, what do you think the effects of missed branch predictions will be on this code? Can we engineer a test case where branch prediction isn't possible, to test its effects?
I'm no expert on that, let alone on how to engineer good degenerate cases. I guess you would have random input over the cases... no idea beyond that.
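One rough way to probe this, as a sketch only: feed the same multiset of keys in two orders, one shuffled (hard to predict) and one sorted into long runs (easy to predict), and benchmark the dispatch over each. The names sample-keys, random-input and predictable-input are made up for illustration, and whether the JIT and CPU actually behave differently on the two inputs is exactly what the benchmark would have to show.

```clojure
;; Hypothetical names; the key set mirrors the kind of keyword cases discussed here.
(def sample-keys [:a :b :c :d :e :f :g :h :none])

;; Random order: the branch taken changes unpredictably from one element to the next.
(def random-input
  (vec (repeatedly 10000 #(rand-nth sample-keys))))

;; Same elements, sorted: long runs of one key, so the taken branch rarely changes.
(def predictable-input
  (vec (sort random-input)))

(comment
  ;; Assumes criterium is on the classpath and `foo` is the function under test.
  (require '[criterium.core :as c])
  (c/bench (doseq [k random-input] (foo k)))
  (c/bench (doseq [k predictable-input] (foo k))))
```

Because both vectors hold exactly the same keys, any timing difference between the two runs comes from ordering rather than from a different mix of cases.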
I ran the following interesting test:
(require '[criterium.core :as c]) ; assumes criterium is on the classpath

;; fast-case is the macro proposed in this issue
(defn foo [k]
  (fast-case k
    :a 0 :b 1 :c 2 :d 3 :e 4 :f 5 :g 6 :h 7 :i 9 :j 10
    :k 11 :l 12 :m 13 :n 14 :o 15 :p 16 :q 17 :r 18 :s 19
    :none))

(defn bar [k]
  (case k
    :a 0 :b 1 :c 2 :d 3 :e 4 :f 5 :g 6 :h 7 :i 9 :j 10
    :k 11 :l 12 :m 13 :n 14 :o 15 :p 16 :q 17 :r 18 :s 19
    :none))

(def ks [:a :b :c :d :e :f :g :h :i :j :k :l :m :n :o :p :q :r :s :none])

;; 10000 keys drawn uniformly at random from ks
(def lots (repeatedly 10000 #(rand-nth ks)))

(do
  (c/bench (doseq [k lots] 0))        ; loop overhead only
  (c/bench (doseq [k lots] (foo k)))  ; fast-case
  (c/bench (doseq [k lots] (bar k)))) ; clojure.core/case
Overhead: 571.822473 µs
Fast case: 654.098578 µs, minus overhead 85
Core case: 740.666878 µs, minus overhead 170
=> fast case is 2x faster, amortized.
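For reference, the subtraction behind those figures can be spelled out directly. This is just arithmetic on the three means quoted above (the var names are made up); it gives roughly 82 µs and 169 µs net, in line with the rounded 85 and 170, and a ratio of about 2.05.

```clojure
;; Mean times reported by the benchmark above, in microseconds.
(def overhead-us 571.822473)
(def fast-case-us 654.098578)
(def core-case-us 740.666878)

;; Time attributable to the dispatch itself, with loop overhead removed.
(def fast-net-us (- fast-case-us overhead-us)) ; about 82.28
(def core-net-us (- core-case-us overhead-us)) ; about 168.84

;; How many times slower clojure.core/case is than fast-case in this run.
(def speedup (/ core-net-us fast-net-us))      ; about 2.05
```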
Nice. I tend to use case quite a bit, as do some Clojure internals like defrecord, so this seems generally useful.
I wonder how critical it actually is. While identity case checks are 2x faster, we're still talking about 4 vs 8 ns in the worst-case scenario, not to mention that I did not measure these influences in the wider context of a running application. Perhaps cache misses will degrade other parts' characteristics, so I'm still wary. It would be interesting to get a profile of a typical application and how many times it cases on keywords.
If you're using keyword access for records, it will impact your performance. It ended up mattering enough in the ICFPC optimization case (where I first ran into this, and into defrecord's weaker performance compared to array maps) that I found out about this. If you're not using idiomatic paths that leverage case on hot paths, it may not matter. That can probably be said of most of these optimizations, though (i.e. context and profiling matter).
Is there a need to introduce the binding to ge and type hint it as an Object? I fail to see the rationale behind it. Am I missing anything?
That came from the original source in clojure.core/case. Unsure.
I ran across this a while back, when optimizing for some identity-based comparisons. It reared its head again when I was optimizing clojure's defrecord during the icfpc 2019 optimization exercise, specifically based on how clojure.core/defrecord handles its implementations of valAt, assoc, and a (lack of) IFn lookup style implementation. I'll post the alternative I'm working with (fastrecord) in another issue. Here though are some optimizations for clojure.core/case that leverage efficient identical? checks in the case where you have all keywords.
clojure.core/case already internally optimizes its lookup scheme based on the test constants. I haven't dug into the int test case (I assume it's optimal, why not). The other cases are if you have keyword test cases, or otherwise structural hasheq constants that can be tested by hashing.
In the case of records, clojure uses clojure.core/case to dispatch based on the keyword associated with the field for many operations, which will go down the identical? lookup path. This is good in general, with a substantial caveat. For test case cardinality <=20, it is empirically faster to do a linear scan through the test cases and check identical? rather than the scheme clojure.core/case uses, which is to look up in a map (I "think" a persistent one, not sure). This means, for small sets of test constants, you always pay the price of lookup into the hash. You also eschew any possible short circuiting opportunities which may arise by design or naturally from the data.
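To make the linear-scan idea concrete, here is a hand-written sketch (lookup-linear and lookup-case are hypothetical names, not code from this issue). The scan is just a cond of identical? checks, which works for keywords because they are interned, and it returns as soon as a test matches, so putting the most frequent keys first lets common inputs short-circuit early.

```clojure
;; Linear scan: check each keyword by reference identity, in order.
(defn lookup-linear [k]
  (cond
    (identical? k :a) 0
    (identical? k :b) 1
    (identical? k :c) 2
    :else             :none))

;; The same dispatch written with clojure.core/case, for comparison.
(defn lookup-case [k]
  (case k
    :a 0
    :b 1
    :c 2
    :none))
```

Both functions return the same result for any input; only the dispatch strategy differs.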
My alternative is twofold: a macro that establishes a case-like form specific to identical? cases (e.g. keywords, or anything else the caller is confident identical? is appropriate for), and a drop-in replacement for clojure.core/case, fast-case, which detects conditions where it would be better to use linear scans rather than hashing and lookups.
Up to 20 test values, looking up the 20th value is still faster than O(1) hashing via clojure.core/case. Fewer values have far more substantial gains (around 2-3x for small test sets). For code that's on a hot path (like the internal implementations of clojure.core/defrecord and its lookups in valAt, assoc, etc., which dispatch to clojure.core/case off of keyword identity), this can provide substantial performance savings (which is the intent of records and similar performance accelerators after all).
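A minimal sketch of how the two pieces could fit together, not the actual macros from this issue: sketch-case-identical and sketch-fast-case are hypothetical names, the cutoff of 20 is taken from the measurements described above, and the sketch leaves out things a real version would want (grouped tests such as (:a :b), duplicate detection, the Object type hint used in clojure.core/case).

```clojure
(defmacro sketch-case-identical
  "Case-like form that tests clauses in order with identical?.
  A trailing odd clause is the default; without one, a miss throws."
  [e & clauses]
  (let [ge      (gensym "k")
        pairs   (partition 2 clauses) ; drops a trailing default, if any
        default (if (odd? (count clauses))
                  (last clauses)
                  `(throw (IllegalArgumentException.
                           (str "No matching clause: " ~ge))))]
    `(let [~ge ~e]
       (cond
         ~@(mapcat (fn [[test then]]
                     [`(identical? ~ge ~test) then])
                   pairs)
         :else ~default))))

(defmacro sketch-fast-case
  "Uses the linear identical? scan when every test is a keyword and there
  are at most 20 of them; otherwise falls back to clojure.core/case."
  [e & clauses]
  (let [tests (map first (partition 2 clauses))]
    (if (and (<= (count tests) 20)
             (every? keyword? tests))
      `(sketch-case-identical ~e ~@clauses)
      `(case ~e ~@clauses))))
```

With these definitions, (sketch-fast-case k :a 0 :b 1 :none) expands to the identical? scan, while a form with non-keyword tests such as (sketch-fast-case n 1 "one" 2 "two" "other") expands to a plain case.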
[edit] added a default case that doesn't flip out on nil for case-identical?