cidgoh / DataHarmonizer

A standardized browser-based spreadsheet editor and validator that can be run offline and locally, and which includes templates for SARS-CoV-2 and Monkeypox sampling data. This project, created by the Centre for Infectious Disease Genomics and One Health (CIDGOH), at Simon Fraser University, is now an open-source collaboration with contributions from the National Microbiome Data Collaborative (NMDC), the LinkML development team, and others.
MIT License

`linkml.py` does not generate a `schema.js` as expected but a `schema.json` #400

Closed: cpauvert closed this issue 1 year ago

cpauvert commented 1 year ago

Hello! Thank you very much for developing and maintaining this awesome project! I am starting to work with a LinkML schema that I want to convert for use with DataHarmonizer, but I'm a bit confused.

According to the wiki (https://github.com/cidgoh/DataHarmonizer/wiki/DataHarmonizer-Templates#building-schemajs), I should compile the LinkML schema with the script bundled in the repo or in the release. However, when I run the script on my LinkML YAML file, the only output produced is a JSON export of the schema (the menu.json is edited, though); no schema.js is produced.

Indeed, when looking at the code of linkml.py, there is no line indicating that a JS file should be written.

https://github.com/cidgoh/DataHarmonizer/blob/0e4fbefec0cd972e985166444af18e9945c3d078/script/linkml.py#L122

I also looked in the Makefile to be sure I did not miss anything, but the rules also use the linkml.py script.

Do I need to use a different script, or the fork by the NMDC (https://github.com/microbiomedata/DataHarmonizer)?

Thanks and best regards,

ddooley commented 1 year ago

Sorry for the somewhat outdated docs; I'll update them shortly. The DataHarmonizer build cycle has changed. Indeed, when run in a /web/templates/[schema name]/ folder, linkml.py just produces a schema.json file from an existing schema.yaml file. (This pattern exists in all the template folders.)
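To see which template folders still need that step, here is a minimal sketch (not part of DataHarmonizer; the function name `templates_missing_json` is hypothetical). It assumes it is pointed at a directory laid out like /web/templates/, with one subfolder per schema:

```python
from pathlib import Path


def templates_missing_json(templates_dir):
    """Return names of template folders that have schema.yaml but no schema.json."""
    missing = []
    for folder in sorted(Path(templates_dir).iterdir()):
        # Skip plain files that sit next to the template folders.
        if not folder.is_dir():
            continue
        has_yaml = (folder / "schema.yaml").is_file()
        has_json = (folder / "schema.json").is_file()
        if has_yaml and not has_json:
            missing.append(folder.name)
    return missing
```

From the DataHarmonizer root folder, `templates_missing_json("web/templates")` would list the folders where linkml.py has not been run yet.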

This is from a recent correspondence. (We have a separate approach with a tabular_to_data.py script, but since you already have a schema.yaml, take this route.)

Basic steps: with a [schema name] of your choice, work in /web/templates/[schema name]/. The schema.yaml in that folder contains classes and types such as:

classes:
  dh_interface:
    name: dh_interface
    description: A DataHarmonizer interface
    from_schema: https://example.com/AMBR
  AMBR:
    name: AMBR
    description: The AMBR Project, led by the Harrison Lab at the University of Calgary,
      is an interdisciplinary study aimed at using 16S sequencing as part of a culturomics
      platform to identify antibiotic potentiators from the natural products of microbiota.
      The AMBR DataHarmonizer template was designed to standardize contextual data
      associated with the isolate repository from this work.
    is_a: dh_interface
types:
  WhitespaceMinimizedString:
    name: 'WhitespaceMinimizedString'
    typeof: string
    description: 'A string that has all whitespace trimmed off of beginning and end, and all internal whitespace segments reduced to single spaces. Whitespace includes #x9 (tab), #xA (linefeed), and #xD (carriage return).'
    base: str
    uri: xsd:token
  Provenance:
    name: 'Provenance'
    typeof: string
    description: 'A field containing a DataHarmonizer versioning marker. It is issued by DataHarmonizer when validation is applied to a given row of data.'
    base: str
    uri: xsd:token

Then, with a command prompt in that file's template folder, run:

python3 ../../../script/linkml.py -i schema.yaml

This will generate the schema.json file. It also adds a menu item for your specification by adjusting /web/templates/menu.js.
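If you want to script this step, a minimal sketch could wrap the same command and confirm that schema.json was actually written. The helper names `linkml_command` and `build_schema_json` are hypothetical, and the sketch assumes the current Python interpreter is an acceptable substitute for `python3`:

```python
import subprocess
import sys
from pathlib import Path


def linkml_command(script="../../../script/linkml.py", schema="schema.yaml"):
    """Build the linkml.py command line, using the current Python interpreter."""
    return [sys.executable, script, "-i", schema]


def build_schema_json(template_dir):
    """Run linkml.py inside template_dir and return the path to schema.json."""
    template_dir = Path(template_dir)
    # check=True raises CalledProcessError if linkml.py exits with an error.
    subprocess.run(linkml_command(), cwd=template_dir, check=True)
    output = template_dir / "schema.json"
    if not output.is_file():
        raise FileNotFoundError(f"expected {output} after running linkml.py")
    return output
```

Running from the template folder keeps the relative path to the script the same as in the command above.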

To test and run, go to the DH root folder and type (as documented on the main GitHub code page):

yarn dev

To build a standalone set of JS files in /web/dist/:

yarn build:web

These can then be zipped or copied separately to wherever you want to make them available.
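For the zipping step, one possible sketch uses Python's standard library. The function name `zip_dist` and the archive name are hypothetical; any zip tool would do just as well:

```python
import shutil
from pathlib import Path


def zip_dist(dist_dir, archive_base):
    """Zip the contents of dist_dir into <archive_base>.zip and return its path."""
    dist_dir = Path(dist_dir)
    if not dist_dir.is_dir():
        raise NotADirectoryError(f"{dist_dir} not found; run 'yarn build:web' first")
    # make_archive appends the .zip extension and returns the archive's file name.
    return shutil.make_archive(str(archive_base), "zip", root_dir=dist_dir)
```

From the DH root folder, `zip_dist("web/dist", "dataharmonizer-dist")` would write dataharmonizer-dist.zip next to the source tree.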

Let me know if this works or if more info needs to be added to it! Then I'll revise docs.

Thanks!

cpauvert commented 1 year ago

Thank you @ddooley for the fast and detailed answer! It did the trick, though maybe a few clarifications could be added:

I can add them to a PR and you can edit from there. Thank you very much!