Shopify / yjit-bench

Set of benchmarks for the YJIT CRuby JIT compiler and other Ruby implementations.
MIT License
87 stars 22 forks source link

Active Record: remove the query cache #300

Closed casperisfine closed 6 months ago

casperisfine commented 6 months ago

Followup: https://github.com/Shopify/yjit-bench/pull/297

The idea was to totally skip the native code that can't be optimized much, but I suppose benchmarking the binding code kinda make sense anyway.

@maximecb @eregon

------------  -----------  ----------  ---------  ----------  ------------  -----------
bench         interp (ms)  stddev (%)  yjit (ms)  stddev (%)  yjit 1st itr  interp/yjit
activerecord  185.3        3.5         85.5       4.8         1.95          2.17       
------------  -----------  ----------  ---------  ----------  ------------  -----------
k0kubun commented 6 months ago

Don't we use this feature in production? If we do, we might want to make the cache hit ratio closer to an actual production environment, e.g. 1% cache miss and 99% cache hit. If the code to handle the cached case is used for 99% of the time, it's what matters the most.

eregon commented 6 months ago

If we do, we might want to make the cache hit ratio closer to an actual production environment, e.g. 1% cache miss and 99% cache hit.

The cache hit rate is likely to be very small for real apps, because the cache hit is only when doing exactly the same query twice (or more) in the same request, which I would think is rather rare. The cache does not work across requests: https://guides.rubyonrails.org/caching_with_rails.html#sql-caching

k0kubun commented 6 months ago

:+1: The change seems fair in that case.

casperisfine commented 6 months ago

The cache hit rate is likely to be very small for real apps,

That's what I thought a while ago, so I instrumented it in production hoping to make a cache to remove it, but the hit rate was surprisingly high (something like 10%, been a while).

Turns out, when you decouple code it's not uncommon for two pieces of code to need to fetch the same data, so you end up duplicating queries.

However I do agree that query cache hits should ideally not happen at all.