bsless / clj-fast

Unpredictably faster Clojure
Eclipse Public License 2.0
234 stars 1 forks source link

Operations on ordered collections (probing) can be faster #18

Closed joinr closed 3 years ago

joinr commented 3 years ago

For reference, I am using a sorted map to denote a set of intervals. I would like to quickly do intersection tests for an arbitrary point x to see if it intersects with a known interval. So, for the ordered map {a b}, the key a denotes the left bound of an interval, and b denotes the right, e.g. [a b]. If a value is contained by the interval, it satisfies the properties of (>= x a), (<= x b). Since we already have sorted maps out of the box, this seems like a simple use case. We can leverage the operation rsubseq to leverage the tree's structure and search for entries where the key fits our first criteria, then test the val for the second criteria. Notably, clojure.data.avl provides more robust features and is possibly faster, so we'll include it as an option.

Let's look at performance.

(require '[criterium.core :as c])
(require '[clojure.data.avl :as avl])

(def samples (sorted-map 10 20 35 40 50 60))
(def avl-samples (avl/sorted-map 10 20 35 40 50 60))

;;slow but portable version...
(defn slow-intersection [sm k]
  (when-let [ab (first (rsubseq sm <= k))]
    (when (<= k (val ab))
      ab)))
;;user> (c/quick-bench (slow-intersection samples 10))
;;Execution time mean : 1.142835 µs

;;fast but not portable version...
(defn intersection [sm k]
  (when-let [ab (when-let [xs (.seqFrom ^clojure.lang.PersistentTreeMap sm k false)]
                  (.first ^clojure.lang.ISeq xs))]
    (when (<= k (.getValue ^java.util.Map$Entry ab))
      ab)))

;;user> (c/quick-bench (intersection samples 10))
;;Execution time mean : 112.924740 ns

(set! *unchecked-math* true)
(defn fast-intersection [sm k]
  (when-let [ab (when-let [xs (.seqFrom ^clojure.lang.PersistentTreeMap sm k false)]
                  (.first ^clojure.lang.ISeq xs))]
    (when (<= ^long k ^long (.getValue ^java.util.Map$Entry ab))
      ab)))
(set! *unchecked-math* false)
;;user> (c/quick-bench (fast-intersection samples 10))
;;Execution time mean : 91.716445 ns

(defn avl-intersection [sm k]
  (when-let [ab (avl/nearest sm <= k)]
    (when (<= k (val ab))
      ab)))
;;Execution time mean : 163.168134 ns

(defn fast-avl-intersection [sm k]
  (when-let [ab (avl/nearest sm <= k)]
    (when (<= k (.val ^clojure.lang.MapEntry ab))
      ab)))
;;Execution time mean : 156.294912 ns

(set! *unchecked-math* true)
(defn fastest-avl-intersection [sm k]
  (when-let [ab (avl/nearest sm <= k)]
    (when (<= ^long k ^long (.val ^clojure.lang.MapEntry ab))
      ab)))
(set! *unchecked-math* false)
;; user> (c/quick-bench (fastest-avl-intersection avl-samples 10))
;; Execution time mean : 142.348268 ns

So at least for this operation (maybe there other others), direct method invocation and hinting really helps out in the peformance. 112ns vs 1.4microseconds is pretty substantial. This could open the door to additional performance questions related to the less often used sorted collections in core.

bsless commented 3 years ago

Managed to find some meat while maintaining portability. Anything else sticks out?


  (defn -rsubseq
    ([^clojure.lang.Sorted sc test key]
     (let [include (mk-bound-fn sc test key)]
       (if (or (identical? < test) (identical? <= test))
         (when-let [s (. sc seqFrom key false)]
           (let [e (.first ^clojure.lang.ISeq s)]
             (if (include e) s (next s))))
         (take-while include (. sc seq false)))))
    ([^clojure.lang.Sorted sc start-test start-key end-test end-key]
     (when-let [[e :as s] (. sc seqFrom end-key false)]
       (take-while (mk-bound-fn sc start-test start-key)
                   (if ((mk-bound-fn sc end-test end-key) e) s (next s))))))
bsless commented 3 years ago

Even faster if you consider the happy path


(defn <mk-bound-fn
  {:private true}
  [^clojure.lang.Sorted sc key]
  (fn [e]
    (< ^long (.. sc comparator (compare (. sc entryKey e) key)) 0)))

(defn <=mk-bound-fn
  {:private true}
  [^clojure.lang.Sorted sc key]
  (fn [e]
    (<= ^long (.. sc comparator (compare (. sc entryKey e) key)) 0)))

(defn mk-bound-fn
  {:private true}
  [^clojure.lang.Sorted sc test key]
  (fn [e]
    (test ^long (.. sc comparator (compare (. sc entryKey e) key)) 0)))

(defn -seek
  [include ^clojure.lang.Sorted  sc key]
  (when-let [s (. sc seqFrom key false)]
    (let [e (.first ^clojure.lang.ISeq s)]
      (if (include e) s (next s)))))

(defn -rsubseq
  ([^clojure.lang.Sorted sc test key]
   (cond
     (identical? < test)
     (let [include (<mk-bound-fn sc key)]
       (-seek include sc key))
     (identical? <= test)
     (let [include (<=mk-bound-fn sc key)]
       (-seek include sc key))
     true
     (let [include (mk-bound-fn sc test key)]
       (take-while include (. sc seq false)))))
  ([^clojure.lang.Sorted sc start-test start-key end-test end-key]
   (when-let [[e :as s] (. sc seqFrom end-key false)]
     (take-while (mk-bound-fn sc start-test start-key)
                 (if ((mk-bound-fn sc end-test end-key) e) s (next s))))))
joinr commented 3 years ago

Very interesting. I'll look at those.

bsless commented 3 years ago

Also profiled, see the attached svg (as txt, sorry) for ref. 05-cpu-flamegraph-2020-12-15-17-57-28.txt

bsless commented 3 years ago

Finally got around to finish this. If you drop in the new implementations you should see a ~10x speed up for the test cases you've provided on a core sorted map

joinr commented 3 years ago

good deal 👍