Closed timothy-trinidad-ps closed 3 months ago
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 62.88%. Comparing base (
27b9158
) to head (0be1913
). Report is 3 commits behind head on main.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
@timothy-trinidad-ps - looking lru doc, it seems like the default is "128" items in the cache and this PR takes it to "limitless" items in the cache.
Do you think it would be helpful to take this item-number up in increments of 100 or so until we see a significant speedup without going limitless?
I'm open to it, but any other number seemed just as arbitrary to me as 128. If I up it to 256 I know that we will run into this problem again in the next 6 months (we're in the process of annotating our database tables).
The obvious downside to this would be memory leaks for very large schemas, but I'm not entirely sure how to measure that. I think memory usage growing proportional to the schema size would be more expected than an arbitrary threshold after which the program effectively stops working.
I can agree with that :) - is there anything we can do to prevent a very large schema from running into problems with too big a cache? Said another way, what is the largest schema size you've tested this on so far?
Just this 129. I can automate creating a schema with 1024 tables, but the specific use case I'm testing with is with gen-doc
and the functions called by its dependent templates. What I would need help with is determining how to test other use cases of schemaview.py
when it comes to memory usage.
I just ran a benchmark for a file with 2,048 classes and 10,240 slots - it took about 4 minutes with a peak of about 450 MB memory (at least for the gen-doc
use case).
test.yaml.txt
Fixes https://github.com/linkml/linkml/issues/1973.
I'm starting with a really naive fix of just removing the entry limit for the
lru_cache
, but I'm open to pivoting to other solutions.With this change, running
gen-doc
on the following yaml file goes down from 3+ minutes to 7s.test.yaml.txt