bsless / clj-fast

Unpredictably faster Clojure
Eclipse Public License 2.0
242 stars 1 forks source link

Extend assoc-in to take multiple k-v pairs? #23

Closed yuhan0 closed 3 years ago

yuhan0 commented 3 years ago

Sometimes when I want to do several updates on a deeply nested map, I end up writing something like this:

(-> m
  (assoc-in [:x :y] 1)
  (assoc-in [:x :z :a] 2)
  (assoc-in [:x :z :b] 3)
  (assoc-in [:x :z :c] 4))

Replacing with the clj-fast inlined assoc-in does not help much, because there's still many redundant operations being made across macro boundaries (eg. (get m :x) being calculated 4 times).

Instead, it should be possible for a macro that takes multiple kv pairs to statically unroll it into something like:

(inline/assoc-in m
  [:x :y]    1
  [:x :z :a] 2
  [:x :z :b] 3
  [:x :z :c] 4)

;; expands into
(let [x (get m :x)
      z (get x :z)]
  (assoc m :x
    (-> x
      (assoc :y 1)
      (assoc :z (-> z
                  (assoc :a 2)
                  (assoc :b 3)
                  (assoc :c 4))))))

I did a quick criterium benchmark and found it to be 6x faster than the naive version above, and 4.5x faster than a drop-in replacement of the inline/assoc-in.

I haven't looked too closely at the implementation details, but is this something that's achievable or within the scope of this library?

Thanks!

bsless commented 3 years ago

Your intuition here is correct, I've done something similar in the past which can be generalized to this. The idea is to build a trie then derive an execution model from it. Good find :)

yuhan0 commented 3 years ago

I guess there's also the assumption that individual keys are referentially transparent, and can actually be combined like so:

(inline/assoc-in {}
  [(rand) :x] :oh
  [(rand) :y] :no)

The above transform would change the behavior:

(let [m {}
      r (get m (rand))]
  (assoc m (rand)
    (-> r
      (assoc :x :oh)
      (assoc :y :no))))

unless we use a heuristic that only atomic keys (keywords/symbols/numbers/strings) can be tested for equality and combined safely.

bsless commented 3 years ago

A heuristic is better than nothing. Dropping this here for reference, it can be adapted for a solution: https://github.com/bsless/impedance

bsless commented 3 years ago

@yuhan0 I opened a PR which should provide this feature, you're welcome to take a look