linkml / linkml

Linked Open Data Modeling Language
https://linkml.io/linkml

FAQ entry about programmatically creating LinkML schemas #461

Open turbomam opened 2 years ago

turbomam commented 2 years ago

Should cover

Could use any of these approaches. Which is "best"? Why?

See also #452

cmungall commented 2 years ago

Ideally the FAQ entry should be short and point to more detailed documentation.

Do you think for this we should have

For now I am tending towards a notebook (this is easiest!), with the FAQ entry pointing to the notebook. Notebooks are a bit more technical than a nice markdown authored guide, but I think that is appropriate for users who want to do programmatic things.

Of course the notebook should make heavy use of markdown cells to fully illustrate what is going on, with the notebook telling a clear story that is easy to mentally follow and reproduce (some of our existing notebooks could be improved in that regard)

ddooley commented 2 years ago

So this is exactly what I need right now. I have a schemaView view but need to write it to JSON or YAML file so DataHarmonizer can read directly! Having installed linkml, I try to load "linkml" as an import but that seems missing. (linkml_runtime as an import works fine; I'm trying to install pipenv install linkml but that seems to trigger some kind of circular dependency between Cython and Numpy that I haven't figured out so I don't know if "linkml" as an import is somehow available that way).

ddooley commented 2 years ago

P.S. I have been able to generate a dump with json.dumps(view, default=encoder), using a custom encoder. The next goal is to ensure that the dump only includes dicts and lists (and other properties) that actually have values for the given schema entity, to be economical about entries.
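The "only keep entries that have values" goal can be sketched with the standard library alone. This is a minimal illustration, not linkml or linkml-runtime code: the function name prune_empty and the example slot dict are hypothetical, and it assumes the schema entity has already been converted to plain dicts and lists.

```python
import json

# Values treated as "empty" for pruning. 0 and False are NOT in this list,
# so legitimate falsy values such as required=False survive.
_EMPTY = (None, "", [], {})


def prune_empty(value):
    """Recursively drop None, empty strings, and empty containers."""
    if isinstance(value, dict):
        pruned = {key: prune_empty(item) for key, item in value.items()}
        return {key: item for key, item in pruned.items() if item not in _EMPTY}
    if isinstance(value, list):
        pruned = [prune_empty(item) for item in value]
        return [item for item in pruned if item not in _EMPTY]
    return value


# Hypothetical slot-like dict with several unset properties.
slot = {
    "name": "depth",
    "description": "",
    "aliases": [],
    "required": False,
    "annotations": {"unit": None},
}

compact = prune_empty(slot)
print(json.dumps(compact))
```

Children are pruned before their parent is checked, so a container that becomes empty only after pruning (like "annotations" here) is dropped as well.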

cmungall commented 2 years ago

Having installed linkml, I try to load "linkml" as an import but that seems missing

That seems a fundamental python setup problem. Are you sure that linkml (and not just linkml-runtime) is in your list of dependencies?

I'm trying to install pipenv install linkml but that seems to trigger some kind of circular dependency between Cython and Numpy that I haven't figured out so I don't know if "linkml" as an import is somehow available that way).

Can you make a separate issue for this? I have never seen this before, and I have installed linkml in many different contexts; it is also used as a dependency in a large number of GitHub Actions. I suspect there is something particular to your setup that is difficult to debug, but either way a fresh issue will help here.

cmungall commented 2 years ago

Regarding your broader issue, we need to explain the relationship between the different repos more here: https://linkml.io/linkml/developers/organization.html

I think for your use case, you will likely want to include linkml as a dependency, and use linkml.generators.yamlgen

Alternatively, you could just use linkml_runtime.dumpers.{json,yaml}_dumper, since a LinkML schema object is also data (an instance).

Either way, I do not recommend using the generic json/yaml libs to dump objects - use the wrappers provided in linkml/linkml-runtime

ddooley commented 2 years ago

Thanks for responding on a Saturday. Re. installing linkml: OK, I'll submit an issue. I see the linkml_runtime.dumpers now and am trying to get them to work, but maybe I'm making some basic coding mistake. I'm trying to read in the main spec file from a local copy of https://github.com/GenomicsStandardsConsortium/mixs-source/tree/main/model/schema

mixs_sv = SchemaView("source/mixs.yaml")
data = mixs_sv.class_induced_slots("soil")
dumper = JSONDumper()
dumped = dumper.dumps(data)
print(dumped)

but getting

(base) dooley-pb:MIxS damion$ python linkml.py
Traceback (most recent call last):
  File "linkml.py", line 20, in <module>
    dumped = dumper.dumps(data)
  File "/Users/damion/anaconda3/lib/python3.7/site-packages/linkml_runtime/dumpers/json_dumper.py", line 44, in dumps
    return json.dumps(as_json_object(element, contexts, inject_type=inject_type),
  File "/Users/damion/anaconda3/lib/python3.7/site-packages/linkml_runtime/utils/yamlutils.py", line 294, in as_json_object
    rval['@type'] = element.__class__.__name__
TypeError: list indices must be integers or slices, not str

Did I call the function correctly?

cmungall commented 2 years ago

class_induced_slots returns a list of SlotDefinition objects

json_dumper dumps individual objects

What IDE are you using? It should warn you of signature mismatches.
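The mismatch between a list and a single-object dumper can be reproduced in a few lines without linkml installed. The sketch below is a stand-in, not the linkml-runtime implementation: Slot and dumps_one are hypothetical names that only mimic the shape of the problem (a dumper that tags one object with "@type"). With linkml-runtime itself, the analogous fix would presumably be to loop over the list returned by class_induced_slots and call the dumper on each SlotDefinition.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Slot:
    """Hypothetical stand-in for a SlotDefinition."""
    name: str
    range: str = "string"


def dumps_one(element) -> str:
    """Stand-in single-object dumper: serializes exactly ONE dataclass instance."""
    rval = asdict(element)  # raises TypeError when given a list
    rval["@type"] = element.__class__.__name__
    return json.dumps(rval)


slots = [Slot("depth", "float"), Slot("ph", "float")]

# Passing the whole list fails, much like the traceback in the thread.
try:
    dumps_one(slots)
    list_rejected = False
except TypeError:
    list_rejected = True

# Dumping each element individually works.
dumped = [dumps_one(slot) for slot in slots]
for text in dumped:
    print(text)
```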

cmungall commented 2 years ago

For the broader task you are trying to achieve:

I think we should have a method in SchemaView that returns an induced class, where all induced slots are added as attributes

For now you can use code analogous to this:

https://github.com/linkml/linkml/blob/268a65a08a27f55401ba354be52c30105d186eff/linkml/generators/docgen.py#L382-L388
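The idea of an "induced class" (a class whose inherited slots have been folded in as attributes) can be sketched independently of linkml. This is not the linked docgen.py code and not a SchemaView method: ClassDef, induced_class, and the tiny example schema are hypothetical, and the sketch assumes single inheritance via is_a with no cycles.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ClassDef:
    """Hypothetical, simplified stand-in for a schema class."""
    name: str
    is_a: Optional[str] = None
    attributes: Dict[str, dict] = field(default_factory=dict)


def induced_class(name: str, classes: Dict[str, ClassDef]) -> ClassDef:
    """Return a copy of the class with inherited attributes folded in."""
    # Walk from the class up to its root ancestor (assumes is_a has no cycles).
    chain: List[ClassDef] = []
    current: Optional[str] = name
    while current is not None:
        cls = classes[current]
        chain.append(cls)
        current = cls.is_a

    # Apply definitions root-first so the more specific class wins on conflicts.
    merged: Dict[str, dict] = {}
    for cls in reversed(chain):
        for attr_name, attr in cls.attributes.items():
            merged[attr_name] = {**merged.get(attr_name, {}), **attr}

    return ClassDef(name=name, is_a=classes[name].is_a, attributes=merged)


# Hypothetical two-class schema: "soil" refines "sample".
classes = {
    "sample": ClassDef(
        "sample",
        attributes={"id": {"required": True}, "depth": {"range": "string"}},
    ),
    "soil": ClassDef(
        "soil",
        is_a="sample",
        attributes={"depth": {"range": "float"}, "ph": {"range": "float"}},
    ),
}

soil = induced_class("soil", classes)
print(soil.attributes)
```

Because the result is a single self-contained object, it fits a dumper that serializes one object at a time.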

cmungall commented 2 years ago

^^ @turbomam

ddooley commented 2 years ago

OK, I have an updated version of native LinkML DH that provides a menu of all MIxS specs in one go. It still needs unit-value data type validation and a solution for MIxS patterns, which can come in the next few weeks. It's on the linkml-datastructure branch. I'll connect with @turbomam to poke at it!

sierra-moxon commented 1 year ago

Hi Patrick! :) Happy Friday!