Cache direct conversions

javierhonduco commented 6 years ago

This PR is a follow-up on #105.

Why

We are calling #find_direct_conversions thousands of times to perform the same computation in the majority of cases. This takes a bunch of time. The function is pure (e.g. for the same arguments, should return the exact same result), so we should be able to cache the results.

Proposed solution

I've added a new entrypoint function, #find_direct_conversion_cached that calls our beloved find_direct_conversion, and caches its result.

Some numbers

The whole test suite takes 0.30s to finish vs the 1.15s of #105
Running the same "benchmark" as in #105, this yields 6.503 (±15.4%) i/s vs the current 1.737 (± 0.0%) i/s that that PR introduced

This approach would trade off some more memory usage at the expense of all the CPU time we are currently using. I didn't measure it but this shouldn't be too big.

I would love to get your feedback (Kevin gave some very good feedback on the way I was creating some cache keys before creating the PR, thanks!! <3) on this. I would like to improve the naming of the functions and general organisation of this patch, as it's not very good as of now

kmcphillips commented 6 years ago

This is very good. And the memory size is small compared to even what's probably leaked from generating all the objects needed to recalculate this each time.

I'm surprised your performance numbers are so drastically different. I tried this approach a couple days ago and tossed it out because the performance improvement was so minimal. But it's possible I was just wrong.

kmcphillips commented 6 years ago

You're caching failures to find a direct conversion too. My approach was only caching hits, not misses. That makes all the difference. Nice work!

Going to ponder on the approach a bit. I really dislike nil and false having different semantic meanings in this case.

kmcphillips commented 6 years ago

@javierhonduco How about @cache[to].key?(from) instead? Then you can drop the false altogether. Have it continue to return nil. And just create the entry in the secondary hash anytime you query it?

javierhonduco commented 6 years ago

Thanks a lot for your reviews 💞 Will do 😄

Shopify / measured

Cache direct conversions #111

Why

Proposed solution

Some numbers