greglook / clj-cbor

Native Clojure CBOR codec implementation.
The Unlicense

Performance testing #3

Closed: greglook closed this issue 7 years ago

greglook commented 7 years ago

Generate or otherwise encode some big, hairy data structures and establish performance benchmarks for writing them. Compare this lib to a few other formats/libraries to determine relative performance.
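As a rough illustration of what such a benchmark could look like, here is a minimal sketch of a round-trip timer that uses only the JDK clock. The function name `mean-round-trip-ns`, its options map, and the EDN stand-in codec are assumptions made for this example and are not part of clj-cbor; a dedicated benchmarking library would give more trustworthy numbers than this simple loop.

```clojure
(require '[clojure.edn :as edn])

(defn mean-round-trip-ns
  "Runs (decode (encode data)) `iterations` times after `warmup` untimed
  runs and returns the mean wall-clock time per round trip in nanoseconds.
  Hypothetical helper for illustration only."
  [encode decode data {:keys [warmup iterations]
                       :or {warmup 1000, iterations 10000}}]
  ;; Untimed warm-up runs give the JIT a chance to compile the hot paths.
  (dotimes [_ warmup]
    (decode (encode data)))
  (let [start (System/nanoTime)]
    (dotimes [_ iterations]
      (decode (encode data)))
    (/ (double (- (System/nanoTime) start)) iterations)))

;; Stand-in codec: EDN text via pr-str and clojure.edn/read-string.
;; Any pair of encode/decode functions can be swapped in here.
(def sample {1 #{["a" :b 'c 42]}})

(println "EDN round trip (ns):"
         (mean-round-trip-ns pr-str edn/read-string sample {}))
```

Passing the codec as a pair of functions keeps the timing loop identical across libraries, so only the encode and decode calls differ between runs.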

greglook commented 7 years ago

@aengelberg did some basic testing during Seajure:

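;; Namespace aliases assumed from the calls below; they were not shown in the thread.
;; transit-encode and transit-decode are helper functions that were also not shown.
(require '[clj-cbor.core :as cbor]
         '[clojure.test.check.generators :as gen]
         '[criterium.core :refer [bench]]
         '[taoensso.nippy :as nippy])

(def small-data
  {1 #{["a" :b 'c (java.util.Date.)]}})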

(def large-data
  (vec (repeatedly 100 #(gen/generate gen/any-printable))))

(defn bench-data
  [data]
  (println "Transit")
  (bench (transit-decode (transit-encode data)))
  (println "Nippy")
  (bench (nippy/thaw (nippy/freeze data)))
  (println "CBOR")
  (bench (first (cbor/decode (cbor/encode data)))))

(defn compare-sizes
  [data]
  {:transit (count (transit-encode data))
   :nippy (count (nippy/freeze data))
   :cbor (count (cbor/encode data))})

(bench-data small-data)
;; Transit 69 us
;; Nippy 9 us
;; CBOR 28 us

(bench-data large-data)
;; Transit 2.1 ms
;; Nippy 1.1 ms
;; CBOR 2.7 ms

(compare-sizes small-data)
;; => {:transit 35, :nippy 30, :cbor 26}

(compare-sizes large-data)
;; => {:transit 50338, :nippy 49063, :cbor 48684}

greglook commented 7 years ago

Been doing some work on this in the benchmark-harness branch.

greglook commented 7 years ago

See some initial results here: https://docs.google.com/spreadsheets/d/142LhWX5aCnOoF6v0T46RASULQDuG7JIckKiCohDPgq8/edit?usp=sharing

Looks like CBOR consistently has the smallest encoded size given the relatively simplistic sample data generated so far. It's a bit slower at decoding than the other codecs on small messages (< 128 bytes). Nippy remains the fastest codec, requiring a quarter to a half the time that clj-cbor does on average.
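For a quick sanity check of the "quarter to a half" figure, the ratios can be worked out from the round-trip timings posted earlier in this thread. Note that these are the Seajure numbers, not the spreadsheet data, and the names `timings-us` and `relative-to-cbor` are made up for this sketch.

```clojure
;; Round-trip timings reported earlier in this thread, in microseconds.
(def timings-us
  {:small {:transit 69, :nippy 9, :cbor 28}
   :large {:transit 2100, :nippy 1100, :cbor 2700}})

(defn relative-to-cbor
  "Returns each codec's time as a fraction of the CBOR time."
  [timings]
  (let [cbor (:cbor timings)]
    (into {}
          (map (fn [[codec t]] [codec (double (/ t cbor))]))
          timings)))

(doseq [[label timings] timings-us]
  (println label (relative-to-cbor timings)))
```

With those inputs, Nippy comes out at roughly 0.32 of the CBOR time for the small value and roughly 0.41 for the large one, which falls inside the range quoted above.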

greglook commented 7 years ago

Did a lot more benchmarking; the harness is now merged into develop under the dev directory. Performance is actually pretty good for an initial implementation! It's definitely not the fastest of the bunch, but there are no egregious encoding or decoding issues, and clj-cbor is roughly in line with the rest of the libraries tested.