linkml / linkml-runtime

Runtime support for linkml generated models
https://linkml.io/linkml/
Creative Commons Zero v1.0 Universal

[perf] Cache schemaview hash #329

Closed sneakers-the-rat closed 2 weeks ago

sneakers-the-rat commented 1 month ago

Here's a fun one:

How come ~1/4 of the time in induced_slot is spent on SchemaView.__hash__?

Screenshot 2024-07-22 at 6 55 34 PM

In my tests generating the NWB schema, the __hash__ method is called 1612428 times, that's a lot! By comparison, induced_slot is only called 3004 times in these tests.

Screenshot 2024-07-22 at 6 58 41 PM

How come __hash__ gets called so many times, or I guess a better question is: why is it called at all? Turns out it's called every time a method wrapped with lru_cache is called. Since self is an argument to the method, and lru_cache uses the hashes of the arguments to look up cached results, it has to hash the SchemaView on every call, even on a cache hit. But the value is always the same (except when modified is incremented).
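To make the mechanism concrete, here's a minimal sketch using a toy class rather than the real SchemaView. `CountingView`, `lookup`, and the two counters are hypothetical names made up for illustration; the only real API involved is `functools.lru_cache`.

```python
from functools import lru_cache


class CountingView:
    """Hypothetical stand-in for SchemaView that counts __hash__ calls."""

    hash_calls = 0      # how many times __hash__ ran
    compute_calls = 0   # how many times the method body actually ran

    def __init__(self, name):
        self.name = name
        self.modified = 0

    def __hash__(self):
        CountingView.hash_calls += 1
        return hash((self.name, self.modified))

    @lru_cache(maxsize=None)
    def lookup(self, slot_name):
        # Only runs on a cache miss.
        CountingView.compute_calls += 1
        return f"{self.name}:{slot_name}"


view = CountingView("demo")
for _ in range(1000):
    view.lookup("some_slot")

# The body ran once, but self was hashed on every call to build the cache key.
print(CountingView.compute_calls, CountingView.hash_calls)
```

So even when the cache is doing its job and the body never reruns, every call still pays for one hash of self.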

So this PR just stores the value of the hash and returns that, invalidating when modified is incremented. ezpz
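Roughly, the idea looks like the sketch below. This is not the actual diff: `CachedHashView`, `_hash`, `recomputes`, and the key tuple are assumptions for illustration, and I'm assuming the view bumps its counter through a single method (called `set_modified` here) so there is one place to invalidate.

```python
class CachedHashView:
    """Hypothetical sketch: cache the hash, invalidate when modified changes."""

    def __init__(self, schema_id):
        self.schema_id = schema_id
        self.modified = 0
        self._hash = None     # cached hash value, None means "stale"
        self.recomputes = 0   # instrumentation for this example only

    def __hash__(self):
        if self._hash is None:
            self.recomputes += 1
            self._hash = hash((self.schema_id, self.modified))
        return self._hash

    def set_modified(self):
        # Any mutation goes through here, so the cached hash is dropped
        # exactly when the value it was derived from changes.
        self.modified += 1
        self._hash = None


view = CachedHashView("demo")
for _ in range(1000):
    hash(view)
print(view.recomputes)  # hash was computed once for 1000 lookups

view.set_modified()
hash(view)
print(view.recomputes)  # recomputed once more after invalidation
```

The tradeoff is that anything that changes hash-relevant state without going through the invalidating method would leave a stale hash behind, so this only works if modifications are funneled through one place.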

codecov[bot] commented 1 month ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 63.45%. Comparing base (37297a8) to head (d0df01b). Report is 4 commits behind head on main.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #329      +/-   ##
==========================================
+ Coverage   63.44%   63.45%   +0.01%
==========================================
  Files          63       63
  Lines        8942     8946       +4
  Branches     2569     2570       +1
==========================================
+ Hits         5673     5677       +4
  Misses       2648     2648
  Partials      621      621
```


sneakers-the-rat commented 3 weeks ago

running into this again. In case we were holding off bc the X is red: that run was from before we fixed the upstream tests. I still can't trigger them myself, so if someone can manually trigger them to check, we can get things a lil faster for free :)

edit: just rebased and now the check is green, but i would still want to see an upstream test before we merge to be sure :)