joinr closed this issue 4 years ago
This is indeed old news, but a great catch :)
It was one of the first implementations I tackled. Later I found an even faster option, called `get-some-in`, which invokes the map on its argument and nil-checks the result.
see:
https://github.com/bsless/clj-fast/blob/master/src/clj_fast/core.clj#L103
and
https://github.com/bsless/clj-fast/blob/master/src/clj_fast/core.clj#L113
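For readers following along, the idea behind `get-some-in` can be sketched roughly like this (an illustration only, assuming a literal key vector; see the linked source for the real implementation):

```clojure
;; Illustrative sketch: unroll a literal key path into nested
;; when-some forms, invoking each intermediate map as a function
;; and stopping at the first nil.
(defmacro get-some-in
  [m ks]
  (reduce (fn [expr k]
            `(when-some [m# ~expr]
               (m# ~k)))
          m
          ks))

;; (get-some-in {:a {:b 1}} [:a :b]) ;=> 1
;; (get-some-in {:a {:b 1}} [:z :b]) ;=> nil, never touches :b
```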
I was wondering why the inline macro version was on par with the direct method invocation (only a few ms slower over 10^6 calls, with some variance). Then I saw you're using the function-invocation form of the map, which ultimately compiles to a direct method invocation of `.invoke` from `IFn`, which dispatches to `.valAt` just as `.get` would. Very nice, and no need for hinting. I stripped out the redundant `nil?` checks in favor of the built-in truthiness test in `if`, and it didn't make a dent (in the small 3-deep map case).
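To illustrate the equivalence being discussed: for a persistent map, all three of these forms end up in `valAt` underneath.

```clojure
(def m {:a 1})

(m :a)                              ;=> 1  ; IFn.invoke, dispatches to valAt
(.valAt ^clojure.lang.ILookup m :a) ;=> 1  ; direct interface call, needs a hint
(get m :a)                          ;=> 1  ; polymorphic lookup
```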
I also noticed that `inline-get-some-in` macroexpands to a `let` binding with all of the keys bound, regardless of whether they are present. I'd have figured that using the conditional evaluation of `if` to short-circuit even the binding, in lieu of `let`, would be beneficial, but not so much (again for the 3-deep case; it seems to hold solid for 8-deep, and I'd assume even deeper). I used to recall some minor overhead from introducing additional let bindings, but it doesn't appear to manifest here.
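Roughly, the two expansion strategies being compared look like this (hand-written illustration, not the exact macroexpansion):

```clojure
;; Strategy 1: bind every step up front; after a miss, the remaining
;; bindings still execute but are just redundant nils.
(let [v1 ({:a {:b {:c 1}}} :a)
      v2 (if (nil? v1) nil (v1 :b))
      v3 (if (nil? v2) nil (v2 :c))]
  v3)
;;=> 1

;; Strategy 2: nest the conditionals so a miss skips the remaining
;; bindings entirely.
(if-some [v1 ({:a {:b {:c 1}}} :a)]
  (if-some [v2 (v1 :b)]
    (v2 :c)
    nil)
  nil)
;;=> 1
```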
Lol, that function-application scheme is like 7x faster than `clojure.core/get-in` for an 8-deep map, and 10x for a 3-deep map (strange!) in criterium.
You can check the benchmarks for `get`, which compare `valAt` with a direct invoke. Any difference between them is most likely noise. I first caught on to it being a viable option when I profiled all the ways to `get`.
A nil check would be more correct in the case where you get `false` out of the map; only nil is really nil, and I wanted to emulate `some->` as closely as possible. There are also some details about let bindings just introducing another variable on the stack; you can disassemble the bytecode and see the exact differences, but at some point I just trust the JIT enough to leave my code slightly more readable and let it take care of the rest.
How did you run the test where `get-some-in` was that much faster than core's? My tests only show it to be about 2-3 times faster:
https://github.com/bsless/clj-fast/blob/benchmarks-rewrite/doc/results.md#get-in
> nil check would be more correct in the case you get false out of the map. only nil is really nil,
Yeah, but in the case of lookup, getting anything other than a function (in this case) is semantically equivalent: if I look up a value and get `false`, I still can't continue to the next key. There's another layer of assumptions about the values being unpacked actually being associative, so that you can tell when to short-circuit (e.g. if I have a nested value in `{:a {:b true}}` and I try to invoke `:b`'s associated value to look up a `:c` key).
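A small example of that corner case: a nil check lets `false` (or any other non-map value) through, and the next step then tries to invoke it as a function.

```clojure
;; false survives a nil check, but it is not an IFn, so walking
;; one level further would throw.
(let [m {:a {:b true}}
      v ((m :a) :b)]   ; invoking the nested map as a function
  ;; (v :c) would throw ClassCastException:
  ;; java.lang.Boolean cannot be cast to clojure.lang.IFn
  v)
;;=> true
```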
Also, another note about using the function invocation over `.get`: while it's fast, it also assumes you have `IFn` objects all along the lookup path. This won't work with `java.util.Map` instances, like mutable hash maps and concurrent maps. Just a minor difference to be aware of in the semantics... it's really focused on Clojure objects implementing `IFn` for lookup. `clojure.core/get` handles this gracefully. `.get` or `.valAt` would make it more formal (the JVM would throw a map-related error, vs. a more opaque function-invocation error).
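Concretely, with a plain `java.util.HashMap`:

```clojure
(def jm (doto (java.util.HashMap.) (.put :a 1)))

(get jm :a)  ;=> 1  ; clojure.core/get handles java.util.Map fine
(.get jm :a) ;=> 1  ; so does the interface call
;; (jm :a)   ; throws ClassCastException: HashMap cannot be cast to IFn
```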
```clojure
user> (let [m {:a {:b {:c {:d {:e {:f {:g {:h 8}}}}}} :other 2}}] (get-in m [:a :other :c :d :e]))
nil
user> (let [m {:a {:b {:c {:d {:e {:f {:g {:h 8}}}}}} :other 2}}] (inline-get-some-in m [:a :other :c :d :e]))
Execution error (ClassCastException) at user/eval13169 (form-init8466648854099771199.clj:242).
java.lang.Long cannot be cast to clojure.lang.IFn
user> (let [m {:a {:b {:c {:d {:e {:f {:g {:h 8}}}}}} :other 2}}] (inline-get-in m [:a :other :c :d :e]))
nil
```
> My tests only show it to be about 2-3 times faster
I just re-ran it on a work computer, getting similar results (2-3x). I may have misread this morning when I was plinking at the REPL (or pushed a short-circuiting test through):
```clojure
user> (let [m {:a {:b {:c {:d {:e {:f {:g {:h 8}}}}}}}}] (c/quick-bench (get-in m [:a :b :c :d :e :f :g :h])))
Evaluation count : 3399372 in 6 samples of 566562 calls.
Execution time mean : 176.080175 ns
Execution time std-deviation : 0.890069 ns
Execution time lower quantile : 174.981368 ns ( 2.5%)
Execution time upper quantile : 176.979590 ns (97.5%)
Overhead used : 2.233057 ns
nil
user> (let [m {:a {:b {:c {:d {:e {:f {:g {:h 8}}}}}}}}] (c/quick-bench (inline-get-in m [:a :b :c :d :e :f :g :h])))
Evaluation count : 8981112 in 6 samples of 1496852 calls.
Execution time mean : 64.743489 ns
Execution time std-deviation : 0.306182 ns
Execution time lower quantile : 64.396509 ns ( 2.5%)
Execution time upper quantile : 65.096493 ns (97.5%)
Overhead used : 2.233057 ns
nil
user> (/ 176 64.0)
2.75
user> (let [m {:a {:b {:c {:d {:e {:f {:g {:h 8}}}}}}}}] (c/quick-bench (inline-get-some-in m [:a :b :c :d :e :f :g :h])))
Evaluation count : 10402296 in 6 samples of 1733716 calls.
Execution time mean : 55.761201 ns
Execution time std-deviation : 0.315856 ns
Execution time lower quantile : 55.441573 ns ( 2.5%)
Execution time upper quantile : 56.038909 ns (97.5%)
Overhead used : 2.233057 ns
nil
user> (let [m {:a {:b {:c {:d {:e {:f {:g {:h 8}}}}}}}}] (c/quick-bench (get-in-map2 m [:a :b :c :d :e :f :g :h])))
Evaluation count : 10282824 in 6 samples of 1713804 calls.
Execution time mean : 56.341473 ns
Execution time std-deviation : 0.421970 ns
Execution time lower quantile : 55.958706 ns ( 2.5%)
Execution time upper quantile : 56.999571 ns (97.5%)
Overhead used : 2.233057 ns
nil
```
For the short-circuiting case, the gap is much more drastic (as expected). I also think this is where not binding the additional let expansions for all the keys (which are just redundant nils) buys a meager couple of ns of saved work, if measurements at that scale are even accurate (probably not, given clock precision):
```clojure
user> (let [m {:a {:b {:c {:d {:e {:f {:g {:h 8}}}}}}}}] (c/quick-bench (get-in m [:a :z :c :d :e :f :g :h])))
Evaluation count : 4177110 in 6 samples of 696185 calls.
Execution time mean : 145.701169 ns
Execution time std-deviation : 4.284181 ns
Execution time lower quantile : 142.410853 ns ( 2.5%)
Execution time upper quantile : 152.026646 ns (97.5%)
Overhead used : 2.233057 ns
nil
user> (let [m {:a {:b {:c {:d {:e {:f {:g {:h 8}}}}}}}}] (c/quick-bench (inline-get-some-in m [:a :z :c :d :e :f :g :h])))
Evaluation count : 27317034 in 6 samples of 4552839 calls.
Execution time mean : 19.969506 ns
Execution time std-deviation : 0.202335 ns
Execution time lower quantile : 19.758022 ns ( 2.5%)
Execution time upper quantile : 20.194388 ns (97.5%)
Overhead used : 2.233057 ns
nil
user> (let [m {:a {:b {:c {:d {:e {:f {:g {:h 8}}}}}}}}] (c/quick-bench (get-in-map2 m [:a :z :c :d :e :f :g :h])))
Evaluation count : 30419064 in 6 samples of 5069844 calls.
Execution time mean : 17.651500 ns
Execution time std-deviation : 0.181125 ns
Execution time lower quantile : 17.482592 ns ( 2.5%)
Execution time upper quantile : 17.880987 ns (97.5%)
Overhead used : 2.233057 ns
nil
```
On the order of 8.5x. This is probably less common in practice, since I typically know which keys I want. However, there are workloads where we're probing for the existence of a key that may or may not be there, and short-circuiting is a big windfall. `clojure.core/get-in` should arguably be doing this by default (I would not be surprised if there's an ancient ticket in JIRA for it, too...).
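For reference, a short-circuiting drop-in for `clojure.core/get-in` is a small change (a sketch, not core's actual code):

```clojure
;; A get-in that stops walking as soon as it hits nil, using
;; reduced to break out of the reduce early.
(defn get-in-sc
  [m ks]
  (reduce (fn [acc k]
            (if (nil? acc)
              (reduced nil)
              (get acc k)))
          m
          ks))

(get-in-sc {:a {:b 1}} [:a :b]) ;=> 1
(get-in-sc {:a {:b 1}} [:z :c]) ;=> nil, stops right after :z misses
```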
Closing this issue because I think it's already covered by the existing implementation. If I missed anything please feel free to reopen it :)
So the default clojure.core nested operations like `get-in` can be optimized if we have a path of literals known a priori. We can compile that down to a sequence of lookups (preferably optimized `.get` ops on a `java.util.Map`, or `.valAt`, or for broad generality `clojure.core/get`). This avoids creating a vector at runtime, reducing over the vector, etc.
We can't do that when the values are passed in at runtime and we don't know the path at compile time (e.g. we have a path vector that includes symbols).
We can still kick `clojure.core/get-in` to the curb if we replace the default implementation with one that's slightly more restrictive on types, in this case `java.util.Map`. The only restriction is that we provide a literal collection rather than a symbol for the path, but it works with normal Clojure idioms.
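A minimal sketch of such a `java.util.Map`-restricted variant (illustrative only, assuming a literal key path; not necessarily the exact code this comment originally referred to):

```clojure
;; Unroll a literal key path into chained, type-hinted .get calls,
;; short-circuiting on nil. Works on any java.util.Map implementation,
;; including Clojure's persistent maps (APersistentMap implements it).
(defmacro get-in-map
  [m ks]
  (reduce (fn [expr k]
            `(let [^java.util.Map m# ~expr]
               (when m# (.get m# ~k))))
          m
          ks))

;; (get-in-map {:a {:b 2}} [:a :b]) ;=> 2
;; (get-in-map {:a {:b 2}} [:z :b]) ;=> nil, short-circuits at :z
```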
I dug this up since I overlooked `get-in` before when doing `deep-assoc` and friends, and I did not at the time know how costly `clojure.core/get` is. Oh, and unlike `clojure.core/get-in`, this variant is naturally short-circuiting if the path leads to nothing, so we get a bit more optimization (potentially a lot for deeply nested lookups).
Let me know if this is old news :)