microbiomedata / sheets_and_friends

Enhance a LinkML model with imported and optionally modified slots

decision about section & column ordering semantics #70

Closed turbomam closed 2 years ago

turbomam commented 2 years ago

I.e. as an alternative to using annotations (dh_sect_ord, dh:section_name and dh:column_number)

Especially don't use the parent class nmdc_dh_section in whatever mechanism we come up with

This issue should deprecate several older issues mentioning things like subsets #2, #58

@turbomam's idea:

group columns into sections with https://linkml.io/linkml-model/docs/slot_group/#slot-slot_group

remember that parent slots for each section will have to be created (as opposed to our current sections_as_classes tab from the sheet 'nmdc-dh-sheets')

remember that the slot titles can use spaces, punctuation and mixed case but the slot names shouldn't

order the columns with https://linkml.io/linkml-model/docs/rank/
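
A minimal sketch of how slot_group and rank could drive section and column ordering, using plain Python dicts rather than the LinkML tooling. The slot names, titles and ranks below are hypothetical and are not taken from the NMDC schema; the only assumptions borrowed from LinkML are that each section is itself a parent slot, that member slots point at it via slot_group, and that rank is an integer used for ordering.

```python
from collections import defaultdict

# Hypothetical slots for illustration only (not from the NMDC schema).
# Section slots have no slot_group; column slots point at their section.
slots = {
    "sample_id_section": {"title": "Sample ID", "rank": 1},
    "mixs_section": {"title": "MIxS", "rank": 2},
    "samp_name": {"title": "sample name", "slot_group": "sample_id_section", "rank": 1},
    "depth": {"title": "depth, meters", "slot_group": "mixs_section", "rank": 2},
    "collection_date": {"title": "collection date", "slot_group": "mixs_section", "rank": 1},
}


def ordered_columns(slots):
    """Return [(section title, [column titles])], both levels sorted by rank."""
    grouped = defaultdict(list)
    for slot in slots.values():
        group = slot.get("slot_group")
        if group is not None:
            grouped[group].append(slot)
    # Order the sections by the rank of their parent slot.
    section_names = sorted(grouped, key=lambda name: slots[name]["rank"])
    return [
        (
            slots[name]["title"],
            [s["title"] for s in sorted(grouped[name], key=lambda s: s["rank"])],
        )
        for name in section_names
    ]


for section, columns in ordered_columns(slots):
    print(section, columns)
```

Note that the titles carry the spaces and punctuation, while the slot names stay plain identifiers.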

turbomam commented 2 years ago

All of these modelling changes will require changes to our LinkML -> DH code, or a switch to Damion's LinkML -> DH code if we like it

turbomam commented 2 years ago

But we shouldn't make changes to the Google Sheet that's currently in use!

May need to reconfigure cogs and generate another service account file? Or grant permissions on a new Google Sheet to the existing authentication file?

turbomam commented 2 years ago

see also #56

turbomam commented 2 years ago

Tab sections_as_classes from nmdc-dh-sheets defines several DH sections, but many of those are not used (X)

| title | Used in |
| --- | --- |
| Sample ID | new_terms |
| GOLD ecosystem path | X |
| EMSL | new_terms |
| JGI-Metagenomics | new_terms |
| JGI-Metatranscriptomics | new_terms |
| Metadata- MIxS Modified Required | X |
| Metadata- MIxS Required | X |
| Metadata- MIxS Modified Required Where Applicable | X |
| Metadata- MIxS Required Where Applicable | X |
| Metadata- MIxS Modified Optional | X |
| Metadata- MIxS Optional | X |

What sections really do appear in the soil_emsl_jgi_mg interface?

Just now ran

In branch main, against nmdc-dh-sheets (nmdc_schemasheet_key=1RACmVPhqpfm2ELm152CzmiEy2sDmULmbN9G0qXK8NDs)

DH sections are found in docs/template/soil_emsl_jgi_mg/data.tsv, with parent class=''
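
For reference, pulling the section rows out of a DH template could look roughly like the sketch below. It assumes the TSV has a header row with columns named "parent class" and "label"; those column names and the sample rows are assumptions for illustration and should be checked against the real data.tsv.

```python
import csv
import io


def section_labels(tsv_text, parent_col="parent class", label_col="label"):
    """Return the labels of rows whose parent class cell is empty.

    The column names are assumptions; pass different ones if the
    template uses other headers.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [
        row[label_col]
        for row in reader
        if not (row[parent_col] or "").strip()
    ]


# Tiny made-up sample standing in for data.tsv.
sample_tsv = (
    "label\tparent class\n"
    "Sample ID\t\n"
    "samp_name\tSample ID\n"
    "EMSL\t\n"
)

print(section_labels(sample_tsv))
```

To run it against a real file, read the file's text first and pass it in as tsv_text.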

Several of the sections aren't defined anywhere. That's not getting caught now because we are defining sections with annotations. Defining sections will be essential when we switch to slot_group.

| sections | notes | Defined in | Used in |
| --- | --- | --- | --- |
| Sample ID | | sections_as_classes | new_terms |
| EMSL | | sections_as_classes | new_terms |
| JGI-Metagenomics | Same for JGI-Metatranscriptomics (just not used in soil_emsl_jgi_mg DH template) | sections_as_classes | new_terms |
| MIxS | | X | sections_columns_orders |
| MIxS (modified) | sections_columns_orders needs to be refactored wider, with changes to sheets_and_friends/mod_by_path.py:mod_by_path | X | sections_columns_orders |
| MIxS Inspired | | X | new_terms |
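
The kind of check that would catch undefined sections can be sketched as a simple set comparison. The section titles below are copied from the lists in this comment; the variable and function names are made up for illustration and this is not existing sheets_and_friends code.

```python
# Titles defined in the sections_as_classes tab (from the first list above).
defined_sections = {
    "Sample ID",
    "GOLD ecosystem path",
    "EMSL",
    "JGI-Metagenomics",
    "JGI-Metatranscriptomics",
    "Metadata- MIxS Modified Required",
    "Metadata- MIxS Required",
    "Metadata- MIxS Modified Required Where Applicable",
    "Metadata- MIxS Required Where Applicable",
    "Metadata- MIxS Modified Optional",
    "Metadata- MIxS Optional",
}

# Sections that actually appear in the soil_emsl_jgi_mg template.
used_sections = [
    "Sample ID",
    "EMSL",
    "JGI-Metagenomics",
    "MIxS",
    "MIxS (modified)",
    "MIxS Inspired",
]


def undefined_sections(used, defined):
    """Return the used sections that have no definition, in order of use."""
    return [section for section in used if section not in defined]


missing = undefined_sections(used_sections, defined_sections)
print(missing)
```

With slot_group, the equivalent check would be that every slot_group value names a slot that exists in the schema.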