linkml / linkml-runtime

Runtime support for linkml generated models
https://linkml.io/linkml/
Creative Commons Zero v1.0 Universal

Make induced_slot recursive #332

Open sneakers-the-rat opened 1 month ago

sneakers-the-rat commented 1 month ago

Related to: https://github.com/linkml/linkml/issues/2219 and all the other times i've gone and talked about making schemaview recursive.

Two main things we're trying to do here:

There are a lot of places where schemaview will iterate over the whole schema in some way, sometimes in nested ways. We get a lot of mileage out of caching, but it also puts some functional barriers in front of us: nonlocal effects become hard to diagnose, things get very bound up together, and implementing stuff like structured imports, where we have to be able to handle lots of layers of different schemas with long inheritance chains, gets v hard to do.

induced_slot is like my white whale: it takes an outsized amount of time because of how much checking up and down inheritance trees needs to get done, and it's also a critical stepping stone that we need to be sure is rock solid, so that it's never in doubt whether we're looking at "the right" model class/etc. It's currently in some exponential time complexity state because for each slot for each class one needs to check the entire inheritance tree.

This is a start in the direction of a schemaview that only looks one step out at a time, recursively, so each step can be simpler. There are some bugs that i caused, and there are also some bugs that i think i am revealing (but have to figure out what they mean first), so it's not ready for review, but opening this as a draft. The general strategy is just that - to only look at the immediate parents of slots and classes so each slot/class combination is induced exactly once. In doing so, i am trying to keep each object as minimal as possible and only touch what is defined at each stage, but some of the methods of doing so are a bit costly, and i am also not sure where to put the mutation guards yet, so there are some missed/unnecessary copies done, but that's all tbd.
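To make the strategy concrete, here is a minimal sketch of the "one step out, induce once" idea on a toy model. It is not the code in this PR and does not use the linkml-runtime API: `ToyClass`, `ToySchema`, `_usage`, and the `misses` counter are all hypothetical names, and the precedence rule (a class's own slot_usage beats its parents, earlier parents beat later ones) is an assumption for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ToyClass:
    """Hypothetical stand-in for a class definition (not the linkml model)."""
    name: str
    parents: List[str] = field(default_factory=list)  # is_a + mixins, highest priority first
    slot_usage: Dict[str, Dict[str, object]] = field(default_factory=dict)


class ToySchema:
    def __init__(self, classes: List[ToyClass], slots: Dict[str, Dict[str, object]]):
        self.classes = {c.name: c for c in classes}
        self.slots = slots  # schema-level slot definitions
        self._cache: Dict[Tuple[str, str], Dict[str, object]] = {}
        self.misses = 0  # how many slot/class pairs were actually computed

    def _usage(self, slot_name: str, class_name: str) -> Dict[str, object]:
        """Merged overrides for one slot/class pair, computed at most once."""
        key = (slot_name, class_name)
        if key not in self._cache:
            self.misses += 1
            cls = self.classes[class_name]
            merged: Dict[str, object] = {}
            # Only look at immediate parents; each parent handles its own parents.
            # Apply lowest-priority parents first so higher-priority ones overwrite.
            for parent in reversed(cls.parents):
                merged.update(self._usage(slot_name, parent))
            # The class's own overrides are applied last and win.
            merged.update(cls.slot_usage.get(slot_name, {}))
            self._cache[key] = merged
        return self._cache[key]

    def induced_slot(self, slot_name: str, class_name: str) -> Dict[str, object]:
        # Build a fresh dict on every call so callers cannot mutate the cache.
        induced = dict(self.slots.get(slot_name, {}))
        induced.update(self._usage(slot_name, class_name))
        return induced


# Diamond inheritance: C -> (A, B) -> Base
schema = ToySchema(
    classes=[
        ToyClass("Base", slot_usage={"id": {"required": True}}),
        ToyClass("A", parents=["Base"]),
        ToyClass("B", parents=["Base"], slot_usage={"id": {"range": "integer"}}),
        ToyClass("C", parents=["A", "B"]),
    ],
    slots={"id": {"range": "string", "required": False}},
)

result = schema.induced_slot("id", "C")
print(result, schema.misses)
```

In the diamond above, `Base` is reached through both `A` and `B` but is only computed once, which is the property the recursion is after. Returning a fresh dict from `induced_slot` is one possible place for a mutation guard; copying on every call is exactly the kind of cost that is still tbd.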

anyway here ya go, will return later.

perf status

current state of linkml and linkml-runtime (run on all non-slow tests, so the difference would probably be greater, since the slow tests are where it gets really expensive)

Screenshot 2024-07-24 at 1 51 43 AM

this pr:

Screenshot 2024-07-24 at 1 51 52 AM

sort of weird result to me that there is 3.1s total time spent in the body of the function but snakeviz is showing 40s all collected there, probably just a visualization bug tho
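One way to cross-check numbers like these without going through snakeviz is to read own time and cumulative time straight from `pstats`. The sketch below profiles a hypothetical recursive function `walk` (a stand-in, not `induced_slot`) and pulls out its row; it only shows where the raw values live and makes no claim about what snakeviz does with them.

```python
import cProfile
import pstats


def walk(depth: int) -> int:
    """Hypothetical recursive function, used only to produce a profile."""
    if depth == 0:
        return 0
    return 1 + walk(depth - 1)


profiler = cProfile.Profile()
profiler.enable()
walk(50)
profiler.disable()

stats = pstats.Stats(profiler)
walk_row = None
# Each entry maps (file, line, function) to
# (primitive calls, total calls, own time, cumulative time, callers).
for (filename, lineno, funcname), (cc, nc, tt, ct, callers) in stats.stats.items():
    if funcname == "walk":
        walk_row = {
            "primitive_calls": cc,  # calls not made via recursion
            "total_calls": nc,      # all calls, recursive ones included
            "tottime": tt,          # time spent in the function body itself
            "cumtime": ct,          # time including everything it calls
        }

print(walk_row)
```

Comparing `tottime` and `cumtime` for the function in question against what the visualization shows should make it easier to tell whether the odd number comes from the profile data or from how it is drawn.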