brandonbloom / fipp

Fast Idiomatic Pretty Printer for Clojure
525 stars 44 forks source link

`fipp.ednize/override?` uses `satisfies?` #87

Open bsless opened 6 months ago

bsless commented 6 months ago

satisfies? has very poor performance, and the way visit* works, it calls override? for each object it visits, incurring a pretty big overhead.

An easy alternative to using satisfies? is:

(defprotocol IOverride
  "Mark object as preferring its custom IEdn behavior."
  (-override? [this]))

(defn override?
  [this]
  (-override? this))

(extend-protocol IOverride
  Object (-override? [_] false)
  nil (-override? [_] false))

The problem with such change is it breaks current users, of which there are several https://github.com/search?q=IOverride+fipp.ednize&type=code

This leaves four:

Any solution that finds a way to remove the call to satisfies? will make fipp way faster.

Happy to help with whatever solution you think will be correct, wdyt?

brandonbloom commented 6 months ago

satisfies? has very poor performance

Can you quantify "very poor" performance? I'm hesitant to make breaking changes or in fact any changes at all unless the win is pretty substantial. As it is, Fipp is more than fast enough for every use case I've thrown at it so far. That's not to say it can't always be faster, just that I'm not yet convinced it's worth it.

bsless commented 5 months ago

I see your point - I should perhaps qualify the code path by which I observed it, which was the request diff middleware from malli -> deep-diff2 -> fipp. That middleware prints the changes to a HTTP request between every middleware stage. With plenty of middleware and a decent size request it does plenty of printing. I noticed it caused some slowness and friction during development, which led to profiling, saw calls to satisfies? , did some investigation, and this is where I ended up. In theory I'd like to just leave this on during development and testing, but the overhead might not always be worth it, so I went looking for a solution. I understand the reluctance to make a breaking change, which is why I proposed an alternative code path.

For a rough estimate of the overhead of calling satisfies? I just commented that line out of visit* and ran the benchmarks, results summary:

function case before after
fipp.edn/pprint :long 55.601581 ms 21.842100 ms
fipp.edn/pprint :mixed 23.416590 ms 12.005053 ms
fipp.clojure/pprint :long 56.669671 ms 22.299918 ms
fipp.clojure/pprint :mixed 23.063635 ms 11.619572 ms

I understand if you don't think a 2x improvement is worth making any change in this case. Another small change can be using eductions instead of concat which brings the speedup to about 3x. I'm happy to pursue this further to make it go even faster if you're interested, or I could just put it to rest :slightly_smiling_face:

brandonbloom commented 5 months ago

OK, I think you've convinced me that it's worth perusing the wins, but I'm not super comfortable with the breaking change route.

Ideally, we wouldn't have to change anything and Clojure would just be faster. See https://clojure.atlassian.net/browse/CLJ-1814 for a proposal to speed up satisfies?. I've added a note over there.

Cache the results of calling satisfies?. This has the downsides of being influenced by load order. Possible mitigation is attaching the cache to the visitor instead of making it global.

This seems like the least invasive solution, at least until CLJ-1814 is resolved. Maybe someone has already figured out how to implement a global cache that is discarded if underlying protocol mappings/extensions have changed? If not, could we figure that out? If not, I guess a local cache would get the job done.

Another small change can be using eductions instead of concat which brings the speedup to about 3x.

PR welcome for that separately. Thanks!

bsless commented 5 months ago

Ideally Clojure would be faster, although if we take a more philosophical approach, one never needs call satisfies? as any use of protocols outside of markers (first time I'm seeing protocols used that way) will always include methods users can extend to; But that cat's out of the bag, so I'll be happy to try and pursue this solution. I'll try to have some draft ready within the week.

bsless commented 5 months ago

Sketched out a version of cached satisfies, wdyt?

(defn cache-satisfies
  ([prot]
   (let [cache (volatile! {})]
     (fn [x]
       (cache-satisfies prot cache x))))
  ([prot cache x]
   (let [clazz (class x)
         ret (get @cache clazz ::not-found)]
     (if (identical? ::not-found ret)
       (let [ret (satisfies? prot x)]
         (vswap! cache assoc clazz ret)
         ret)
       ret))))

It's about 100x faster than satisfies? which should mostly solve this