linkml / template-configurator-templates

Templates for use in the linkml-template-configurator
Creative Commons Zero v1.0 Universal

Question about pypi layout #4

Closed hsolbrig closed 1 year ago

hsolbrig commented 3 years ago

At the moment the layout for a LinkML-based model is proposed as:

model-base/                             -- everything that goes into github
    .github/                            -- scripts to test and update the model online
    docs/                               -- generated model documentation
    model_base/                         -- everything that is published in pypi
        graphql/
            mymodel.graphql             -- GraphQL schema for mymodel
        json/
            mymodel.json                -- JSON representation of mymodel in LinkML
        jsonld/
            mymodel.context.jsonld      -- context for converting an instance of mymodel.json into RDF format
        jsonschema/
            mymodel.schema.json         -- schemas to validate JSON instances of mymodel
        model/                          -- COPY of model-base/model/schema directory (READONLY)
            schema/
                mymodel.yaml            -- YAML representation of mymodel in LinkML
        owl/
            mymodel.owl.ttl             -- Turtle representation of mymodel definition in OWL
        rdf/
            mymodel.model.ttl           -- RDF representation of mymodel in LinkML
        shex/
            mymodel.shexj               -- schemas to validate RDF instances of mymodel
        mymodel.py                      -- Python classes for my model
    model/                              -- EDITABLE version of my model
        docs/                           -- additional documentation to be copied into model-base/docs directory
        schema/                         -- YAML schemas for my model
            mymodel.yaml
    tests/                              -- regression and other tests for mymodel validity
    .gitignore
        ...
    Pipfile
        ...

The fact that mymodel.yaml exists in two separate spots in the hierarchy (model-base/model/schema AND model_base/model/schema) has led to confusion: people end up editing the read-only copy of the model, and the next make run wipes out their changes.
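One way to catch edits made to the wrong copy before a make run overwrites them is to compare the two directories by content hash. The following is only a sketch: the function names are made up for illustration, and it assumes the editable and read-only copies are plain directory trees as in the layout above.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def find_drift(source_dir: Path, copy_dir: Path) -> list[str]:
    """List relative paths whose content differs between the two trees.

    A file present in only one of the two directories also counts as drift.
    Both function names are hypothetical; nothing here is part of LinkML.
    """
    def digests(root: Path) -> dict[str, str]:
        return {
            p.relative_to(root).as_posix(): file_digest(p)
            for p in root.rglob("*")
            if p.is_file()
        }

    source, copy = digests(source_dir), digests(copy_dir)
    return sorted(
        name
        for name in source.keys() | copy.keys()
        if source.get(name) != copy.get(name)
    )
```

Calling `find_drift(Path("model/schema"), Path("model_base/model/schema"))` from the repository root would return an empty list when the copy is in sync, and the names of the offending files otherwise.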

The second issue with the current configuration is that mymodel.yaml, mymodel.json and mymodel.ttl are all representations of "mymodel", and it isn't really necessary that YAML be the source. Especially once we introduce ".csv" and other possible formats, it may be of value to maintain the official source in some other syntax.

First approach

There is a pull request in linkml-model (which, in this context, is just another instance of "mymodel") that removes the model-base/model directory and keeps the source in the PyPI image (model_base above, linkml_model in linkml-model). Upon closer examination, it isn't obvious that this is the path we want to take, for two reasons:

1) json/mymodel.json, rdf/mymodel.ttl and model/schema/mymodel.yaml all represent different forms of the same thing. Why are we treating them any differently?
2) This approach muddles what seems like a good principle: that everything in the PyPI image (model_base) is read-only.

Proposal

1) We return to the previous layout, where the docs and model_base directories are auto-generated. The model source files are contained entirely in the base model directory.
2) We remove the model directory from model_base.
3) We add a model_base/yaml directory, which contains the YAML representation of the model.
4) Model source definitions in the model/schema directory can be any of the allowable formats (yaml, json, rdf, csv (when finished), etc.).
5) We need to decide whether all of the yaml/json/rdf/csv directories in the PyPI image (model_base/ directory) should be generated, whether the exact source image is maintained instead, or both.

WRT item 5, the question is this: if our model/schema directory contained "mytypes.yaml" and "mymodel.json", would model_base/yaml contain the source mytypes.yaml and a generated mymodel.yaml, and model_base/json a generated mytypes.json and the source mymodel.json? Or would everything be generated? Or would we have some way to differentiate source from generated?
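To make the item 5 question concrete, here is a small sketch of the "differentiate source from generated" option: for each source file in model/schema, it lists every per-format output and marks whether that output would be the original file or a generated rendering. The function name, the suffix-to-directory table, and the simplified `<stem><suffix>` file naming are all assumptions made for illustration; nothing here is an existing LinkML API.

```python
from pathlib import PurePosixPath

# Assumed mapping from file suffix to the per-format directory under model_base/.
FORMAT_DIRS = {".yaml": "yaml", ".json": "json", ".ttl": "rdf"}


def plan_outputs(source_files: list[str]) -> dict[str, str]:
    """Map each output path under model_base/ to "source" or "generated".

    An output is "source" only when the file in model/schema already has
    that format; every other rendering of the same model is "generated".
    """
    plan = {}
    for name in source_files:
        src = PurePosixPath(name)
        for suffix, directory in FORMAT_DIRS.items():
            target = f"model_base/{directory}/{src.stem}{suffix}"
            plan[target] = "source" if src.suffix == suffix else "generated"
    return plan


if __name__ == "__main__":
    for target, origin in plan_outputs(["mytypes.yaml", "mymodel.json"]).items():
        print(f"{origin:9} {target}")
```

Under the alternative where everything in the PyPI image is generated, the same table would simply carry "generated" in every row, which is part of the argument for not tracking the distinction at all.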

My own 2 cents on this question is that there isn't any real need to maintain the official source in the pypi distro -- as long as our generated YAML is faithful to the original (which it danged well should be), we shouldn't concern ourselves with whether imports: [mymodel:mytypes] references yaml, json, rdf or something else, or whether the import exactly matches the source line for line.

Note: I tried to find some way to tell setup.py that it needed to emit model-base/model/schema as model_base/model/schema, but this seems to be a step too far for the generic setup utilities.

hsolbrig commented 3 years ago

Addendum:

The reason that we're so concerned about a fixed layout and about what goes into PyPI is that we want to be able to do the following:

id: https://example.org/myproject/mymodel

prefixes:
    linkml: https://w3id.org/linkml/
    yourmodel:  https://example.org/yourproject/yourmodel/

imports:
    - mysubmodel
    - yourmodel:yourschema
    - linkml:types
    - linkml:extensions

Right now the linkml prefix is hard-coded in the linkml generators. IF https://example.org/yourproject/yourmodel/yourschema.yaml (or, eventually, .json, .rdf, .csv) is accessible over the web, everything works. The following:

Pipfile:
     linkml-runtime = "*"
     your-model = "*"
          ...

won't know where to get a copy of "yourschema.yaml" unless we have a known layout in the site-packages sub-directory.
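With a known layout, a loader could find the schema inside the installed package instead of fetching it over the web. The sketch below shows the idea using only the standard library. The function name, the default `model/schema` sub-directory, and the assumption that the `your-model` distribution installs an importable package named `your_model` are all illustrative; the actual layout is exactly what this issue is trying to settle.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def find_installed_schema(
    package: str,
    schema: str,
    layout: str = "model/schema",  # assumed sub-directory inside the package
    suffix: str = ".yaml",
) -> Optional[Path]:
    """Return <package>/<layout>/<schema><suffix> if it exists in the environment.

    `package` is expected to be a top-level package name. Returns None when
    the package is not installed or does not ship the requested schema file.
    """
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None
    for location in spec.submodule_search_locations:
        candidate = Path(location) / layout / (schema + suffix)
        if candidate.is_file():
            return candidate
    return None
```

An import such as `yourmodel:yourschema` could then be resolved with `find_installed_schema("your_model", "yourschema")`, falling back to the web URL only when that returns None.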

hsolbrig commented 3 years ago

Addendum 2:

The newest linkml_model distribution has one additional file, linkml_files.py, that provides the additional information needed by importing packages. Examples of how it can be used can be found in test_linkml_files.py. This could easily be generalized so that the LinkML SchemaLoader searches the environment and aggregates the information in each package's linkml_files.py equivalent, which would allow us to import the versions of particular models we want to use or extend.