GenomicsStandardsConsortium / mixs

“Minimum Information about any (X) Sequence” (MIxS) specification
Creative Commons Zero v1.0 Universal

Create minimal per-class slot reporter; compare to existing LinkML, NMDC and MIxS scripts #811

Open turbomam opened 2 days ago

turbomam commented 2 days ago

I suspect that this overlaps with a lot of existing issues, along the lines of "make it easy for people to find release artifacts"

turbomam commented 2 days ago

Starting by generating Markdown documentation for all of the scripts in src/scripts. Using ChatGPT 4.

turbomam commented 2 days ago

find . -name "*.py" | sort


turbomam commented 2 days ago

I moved some of those scripts to an inactive directory

src/scripts/ is derived from the output of linkml2schemasheets-template and comes closest to generating a tabular representation of the schema. However, it creates one global slot report, and we want per-class slot reports that show the induced slot usage.

(I don't think we have determined yet what takes precedence when a class has an is_a parent and also uses mixins.)

turbomam commented 2 days ago

I am inclined to read the schema into a SchemaView, iterate over the in-scope classes, and report selected properties of the induced slots/attributes.
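
The approach above could be sketched roughly as follows. This is only a minimal illustration, not the script proposed in this issue: the function names (`load_view`, `per_class_slot_rows`, `rows_to_tsv`) and the selection of reported properties are assumptions made for the example. The only LinkML calls it relies on are `SchemaView(...)`, `SchemaView.all_classes()` and `SchemaView.class_induced_slots()` from `linkml-runtime`.

```python
import csv
import io

# Slot properties to report. An illustrative selection, not a list taken from this issue.
DEFAULT_PROPERTIES = ("name", "range", "required", "multivalued")


def load_view(schema_path):
    """Load a LinkML schema file into a SchemaView (requires linkml-runtime)."""
    # Imported lazily so the reporting functions below can be exercised without LinkML installed.
    from linkml_runtime.utils.schemaview import SchemaView
    return SchemaView(schema_path)


def per_class_slot_rows(view, class_names=None, properties=DEFAULT_PROPERTIES):
    """Return {class_name: [row, ...]}, one row per induced slot of each class.

    `view` is expected to behave like a SchemaView. If `class_names` is None,
    every class in the schema is treated as in scope (an assumption).
    """
    if class_names is None:
        class_names = list(view.all_classes().keys())
    report = {}
    for class_name in class_names:
        rows = []
        for slot in view.class_induced_slots(class_name):
            # Properties missing on a slot are reported as None.
            rows.append({prop: getattr(slot, prop, None) for prop in properties})
        report[class_name] = rows
    return report


def rows_to_tsv(rows, properties=DEFAULT_PROPERTIES):
    """Serialize the rows for one class as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(properties), delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

With `linkml-runtime` installed, something like `per_class_slot_rows(load_view("path/to/schema.yaml"))` would give one list of rows per class (the path is a placeholder). Keeping the reporting functions independent of the import is meant to make them easier to test and to move into a LinkML repo later.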

@cmungall would prefer that this kind of thing be developed in a LinkML repo, but the turnaround is slower. So develop it in a generalizable way, with good coding practices, so that it can eventually be moved into LinkML.

I don't think the stock linkml2sheets will work here either.

turbomam commented 2 days ago

From @pbuttigieg: generate the reports in a dev path and then move them to a root path upon release, like ODK does.
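
A promotion step along those lines might look like the sketch below. The function name `promote_reports` and both directory arguments are hypothetical, and the sketch copies the reports rather than moving them so that the dev path stays usable between releases. That choice is an assumption, not something decided in this thread.

```python
import shutil
from pathlib import Path


def promote_reports(dev_dir, release_dir):
    """Copy generated reports from a dev directory into a release directory.

    Both directory names are placeholders chosen by the caller. Files already
    present in `release_dir` with the same relative path are overwritten.
    Returns the sorted relative paths of all files now in `release_dir`.
    """
    dev_path = Path(dev_dir)
    release_path = Path(release_dir)
    if not dev_path.is_dir():
        raise FileNotFoundError(f"dev report directory not found: {dev_path}")
    # dirs_exist_ok requires Python 3.8 or later.
    shutil.copytree(dev_path, release_path, dirs_exist_ok=True)
    return sorted(
        p.relative_to(release_path).as_posix()
        for p in release_path.rglob("*")
        if p.is_file()
    )
```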

turbomam commented 2 days ago

This could be backported to older releases, but the reports for previous releases wouldn't be bundled cumulatively with each new release.