oakes / odoyle-rules

A rules engine for Clojure(Script)
The Unlicense
530 stars 20 forks source link

Performance #7

Open bsless opened 3 years ago

bsless commented 3 years ago

Hello, I have a bad habit of finding cool project and looking under the hood for performance lying on the floor. After doing some poking around and running benchmarks I can share some findings:

I'll be happy to keep digging if you're interested.

Keep up the amazing work :)

oakes commented 3 years ago

These are great suggestions -- i haven't used clj-fast before. As for id+attrs, maybe i could calculate a keyword based on it and use that instead...but that feels like just implementing a second layer of hashing for map keys. Gotta think about it more...

bsless commented 3 years ago

Thank you :pray: I've done some more thinking myself since opening this issue: One issue I did not consider when running my benchmarks is that odoyle-rules is cljc and clj-fast targets JVM Clojure specifically. Also, odoyle-rules provided me with a good opportunity to find a few subtle bugs in my code, so if you plan to play with clj-fast let me know first so I'll release a new version with the fixes. Regarding the second layer of hashing, it may be worth it. Almost all CPU was wasted hashing and comparing id+attrs. I'm not sure generating a keyword from them would help as iirc id's type is any. Another solution is using a nested map of id->attr instead of a tuple id+attr. This is still faster. It does not, however, solve the bit where you use id+attrs (plural) as a key. Haven't figured out how to work around that one yet. Another approach to id+attr is generating a type with explicit equality and hashing semantics, some abomination like:

(deftype IdAttr [id attr ^:unsynchronized-mutable ^int _hasheq]
  java.lang.Object
  clojure.lang.ILookup
  clojure.lang.IHashEq
  clojure.lang.IKeywordLookup
  clojure.lang.IPersistentMap
  java.util.Map
  (equals [this that]
    (boolean
     (or (identical? this that)
         (and (instance? IdAttr that)
              (= id (.id ^IdAttr that))
              (= attr (.attr ^IdAttr that))))))
  (equiv
    [this that]
    (boolean
     (or (identical? this that)
         (and (instance? IdAttr that)
              (= id (.id ^IdAttr that))
              (= attr (.attr ^IdAttr that))))))
  (valAt
    [_ k else]
    (condp identical? k
      :id id
      :attr attr
      else))
  (valAt [this k] (.valAt this k nil))
  (get [this k] (.valAt this k))
  (seq [_] (seq [id attr]))
  (hasheq [_]
    (let [hq _hasheq]
      (if (zero? hq)
        (let [h (int
                 (bit-xor
                  968592507
                  (Murmur3/mixCollHash (unchecked-add-int (int (Util/hasheq id)) (int (Util/hasheq attr))) (int 2))))]
          (set! _hasheq h)
          h)
        hq))))

Which works (really works, tests pass and all), but no one wants to look at or maintain

oakes commented 3 years ago

That is really neat, i'll try it when i get the chance. Maybe one way to avoid the complexity and maintenance burden would be to just provide a way to supply an id+attr constructor function to ->session as an option. That way, code like this could live in your project and be open to tweaking, if you need the extra performance.

bsless commented 3 years ago

On the other hand, when you open that door someone will use it to blow their own foot off. I'll play around with this some more until I reach a satisfactory solution. Even by unrolling Records' hash and equality semantics there's a huge gain to be had.

oakes commented 3 years ago

That is true...and this id+attr thing would be a weird internal detail to expose, now that i think of it.