linkml / linkml-runtime

Runtime support for linkml generated models
https://linkml.io/linkml/
Creative Commons Zero v1.0 Universal
22 stars 21 forks source link

Fix `linkml_files` #310

Closed sneakers-the-rat closed 3 months ago

sneakers-the-rat commented 3 months ago

on windows, the paths for local model files look like this:

'D:\\a\\linkml\\linkml\\.venv\\lib\\site-packages\\linkml_runtime\\linkml_model\\model/schema/types.yaml'

which seems to work, but it probably shouldn't.

change hardcoded / separators to use os.path.join to make paths correct on windows

Related to: https://github.com/linkml/linkml/pull/1976

codecov[bot] commented 3 months ago

Codecov Report

Attention: Patch coverage is 78.75000% with 17 lines in your changes are missing coverage. Please review.

Project coverage is 64.33%. Comparing base (33ca663) to head (c072024). Report is 12 commits behind head on main.

Files Patch % Lines
linkml_runtime/linkml_model/linkml_files.py 78.75% 17 Missing :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #310 +/- ## ========================================== + Coverage 62.92% 64.33% +1.40% ========================================== Files 62 62 Lines 8545 8580 +35 Branches 2437 2440 +3 ========================================== + Hits 5377 5520 +143 + Misses 2557 2450 -107 + Partials 611 610 -1 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

sneakers-the-rat commented 3 months ago

I think we should use pathib over os.path but either is infinitely better than hardcoding slash

just trying to follow the surrounding conventions :) ez to change

sneakers-the-rat commented 3 months ago

OK i went to swap out os.path.join with pathlib, and when i went to write a simple test that the generated paths exist, it turns out that most of them don't and there were a lot of other problems with the module, and since this is something i'd like to be able to rely on i went ahead and rewrote most of it.

Problems:

So this PR

You'll notice a lot of skipped tests. those are the ones that make network requests, which we need to cache in the same way we do with linkml. I noticed this repo is still using unittest in its CI, and didn't want to swap that out here too, so whenever we switch to pytest and add requests_cache those tests shoudl automatically switch back on.

sneakers-the-rat commented 3 months ago

ope i guess the tests aren't being run? idk how unittest works. ¯_(ツ)_/¯ if ya want 'em they're there!

cmungall commented 3 months ago

Thanks!

I'd like to test this with the main linkml project before merging into main, as there are possibly some assumptions on the enum values in linkml.__init()__ and possibly other places.

I realize the current SOP for these kinds of tests is suboptimal, so we can strategize ways to improve things.

I also realize I'm being very conservative here because as you point out a lot of this is legacy and things didn't break when paths changed..

I'm pretty certain this PR represents a much better way of doing things than before. But another approach might be to eliminate dependency on this file altogether. The metamodel should largely not operate so differently from other models, and including the cross product of all schema modules times formats is not a scalable strategy.

See https://github.com/linkml/linkml/issues/546

sneakers-the-rat commented 3 months ago

I realize the current SOP for these kinds of tests is suboptimal, so we can strategize ways to improve things.

We could make a manually triggered action that runs the linkml tests along with the PR version of linkml-runtime, and make a branch protection rule that says that action must be run and passing before merging to main? that would make it so we don't have to run all the tests all the time, but since these are tightly coupled packages, ensures that changes here don't break linkml.

But another approach might be to eliminate dependency on this file altogether. The metamodel should largely not operate so differently from other models,

I think it would be very useful to be able to at least get a reliable handle to the metamodel schema files from linkml-runtime - if not for perf reasons like using local schema rather than always dereferencing, then for introspection reasons. i agree that metamodel shouldn't be that special cased, but it still is a little special.

I could go either way on the github links part, i am not really sure when that would be useful.

including the cross product of all schema modules times formats is not a scalable strategy.

true, but with caching these tests would be pretty fast:

so if perf is a concern, imo we should drop the link generation part.

cmungall commented 3 months ago

I think it would be very useful to be able to at least get a reliable handle to the metamodel schema files from linkml-runtime

oh absolutely - my point was just that having an enormous list of N x M constants is not necessarily the best way to do this

sneakers-the-rat commented 3 months ago

ah i see. yes totally agreed - ideally this metadata should come from the generators, no? or maybe projectgen? i'm actually still a little foggy on the workflow for updating the linkml_model module but ya for sure on the same page re: a ton of constants not being a good longterm strategy

the thing we want here is 'given the metamodel and {projectgen or however the linkml_model is generated}, we should expect to have these files and these links,' so if it was possible to introspect that expectation from the generation process that sounds more sustainable. maybe this PR is just to fix this module for now and we make that as a TODO on it?