cidgoh / DataHarmonizer

A standardized browser-based spreadsheet editor and validator that can be run offline and locally, and which includes templates for SARS-CoV-2 and Monkeypox sampling data. This project, created by the Centre for Infectious Disease Genomics and One Health (CIDGOH), at Simon Fraser University, is now an open-source collaboration with contributions from the National Microbiome Data Collaborative (NMDC), the LinkML development team, and others.
MIT License

DataHarmonizer should honor at least a subset of the LinkML rule language #370

Closed. turbomam closed this issue 6 months ago

turbomam commented 1 year ago

For example, a schema might ask if sequencing samples will be delivered as tubes or in micro-titer "96-well" plates.

If the user picks "plate" from the enumeration in column N, then a legal value should be required in column N+1, 'well position'.

If the user picks "tube" from the enumeration in column N, then the only legal value in N+1 would be a blank cell.

Assign to @pkalita-lbl? Patrick, could your multi-view proposal solve this without requiring any other changes to the DataHarmonizer codebase?

cmungall commented 1 year ago

Relevant section of the schema guide: https://linkml.io/linkml/schemas/advanced.html#rules

(Note that the rule language has no way to say "column N+1", but a rule can name a specific slot.)
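
For illustration, the plate/tube example might be expressed as rules over named slots along the following lines. This is only a minimal sketch, not DataHarmonizer code. The slot names `container_type` and `well_position` are hypothetical, the objects merely mirror the `preconditions` / `postconditions` / `slot_conditions` layout described in the schema guide, and only a tiny subset of conditions is handled (`equals_string`, `required`, plus a `mustBeAbsent` flag invented here to stand for "must be blank").

```typescript
// Hypothetical subset of a LinkML-style rule; not the full metamodel.
interface SlotCondition {
  equals_string?: string; // slot value must equal this string
  required?: boolean;     // slot value must be non-blank
  mustBeAbsent?: boolean; // invented for this sketch: slot value must be blank
}

interface Rule {
  preconditions: { slot_conditions: Record<string, SlotCondition> };
  postconditions: { slot_conditions: Record<string, SlotCondition> };
}

type Row = Record<string, string | undefined>;

function isBlank(value: string | undefined): boolean {
  return value === undefined || value.trim() === "";
}

// True when the value meets every constraint in the condition.
function meets(value: string | undefined, cond: SlotCondition): boolean {
  if (cond.equals_string !== undefined && value !== cond.equals_string) return false;
  if (cond.required === true && isBlank(value)) return false;
  if (cond.mustBeAbsent === true && !isBlank(value)) return false;
  return true;
}

// Returns the slots that break the rule's postconditions, or an empty
// list when the preconditions do not apply to this row.
function violatedSlots(row: Row, rule: Rule): string[] {
  const pre = rule.preconditions.slot_conditions;
  for (const slot of Object.keys(pre)) {
    const cond = pre[slot];
    if (cond !== undefined && !meets(row[slot], cond)) return [];
  }
  const post = rule.postconditions.slot_conditions;
  const violated: string[] = [];
  for (const slot of Object.keys(post)) {
    const cond = post[slot];
    if (cond !== undefined && !meets(row[slot], cond)) violated.push(slot);
  }
  return violated;
}

// The two rules from the example: plates need a well position, tubes must not have one.
const exampleRules: Rule[] = [
  {
    preconditions: { slot_conditions: { container_type: { equals_string: "plate" } } },
    postconditions: { slot_conditions: { well_position: { required: true } } },
  },
  {
    preconditions: { slot_conditions: { container_type: { equals_string: "tube" } } },
    postconditions: { slot_conditions: { well_position: { mustBeAbsent: true } } },
  },
];

function checkRow(row: Row): string[] {
  let violated: string[] = [];
  for (const rule of exampleRules) {
    violated = violated.concat(violatedSlots(row, rule));
  }
  return violated;
}
```

Because each rule refers to slots by name, the check does not depend on where the columns happen to sit in the spreadsheet.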

pkalita-lbl commented 1 year ago

This seems orthogonal to the multi-view interface I've been working on for NMDC.

I think DataHarmonizer could pick up the rules metamodel slot in its validation procedure, and it wouldn't be too terribly hard to do.

ddooley commented 1 year ago

Yes, we'd like to get rule-based validation - and dynamically required fields - happening right inside DH. Patrick, what does "multi-view interface" mean? Were you approaching this via custom field-change hooks? One thing is that not only should all the rule behaviour happen as people are editing content, but also on loading of a dataset or when user presses "Validate".

pkalita-lbl commented 1 year ago

> what does "multi-view interface" mean?

I'm working on an interface for the NMDC project where we're going to allow a user to switch between different templates via a sort of tab-like interface. It's not really related to what we're talking about here.

> Were you approaching this via custom field-change hooks? One thing is that not only should all the rule behaviour happen as people are editing content, but also on loading of a dataset or when user presses "Validate".

Just to be clear, I have not started working on implementing this at all. I assumed it would be added to the getInvalidCells method.
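
As a rough sketch of what that could look like, a single rule pass over the whole table could be shared by every trigger mentioned above (cell edit, dataset load, and "Validate"). None of this reproduces the real `getInvalidCells` signature or return shape. The names `collectRuleViolations`, `RowRule`, and `wellPositionRule`, the column positions, and the row -> column -> message map are all assumptions made for illustration.

```typescript
// Hypothetical types; the real DataHarmonizer data structures may differ.
type Table = string[][]; // rows of cell values
type InvalidCells = Record<number, Record<number, string>>; // row -> column -> message

// A rule inspects one row and returns messages keyed by column index.
type RowRule = (row: string[]) => Record<number, string>;

const CONTAINER_COL = 0; // assumed position of the container type column
const WELL_COL = 1;      // assumed position of the well position column

// Hard-coded version of the plate/tube rule, for illustration only.
const wellPositionRule: RowRule = (row) => {
  const container = (row[CONTAINER_COL] ?? "").trim();
  const well = (row[WELL_COL] ?? "").trim();
  const problems: Record<number, string> = {};
  if (container === "plate" && well === "") {
    problems[WELL_COL] = "well position is required for plates";
  } else if (container === "tube" && well !== "") {
    problems[WELL_COL] = "well position must be blank for tubes";
  }
  return problems;
};

// Runs every rule against every row and collects the offending cells.
function collectRuleViolations(table: Table, rules: RowRule[]): InvalidCells {
  const invalid: InvalidCells = {};
  table.forEach((row, rowIndex) => {
    for (const rule of rules) {
      const problems = rule(row);
      for (const key of Object.keys(problems)) {
        const col = Number(key);
        const message = problems[col];
        if (message === undefined) continue;
        const rowProblems = invalid[rowIndex] ?? {};
        rowProblems[col] = message;
        invalid[rowIndex] = rowProblems;
      }
    }
  });
  return invalid;
}
```

Since the function takes the full table and returns only the offending cells, the same call could in principle run after an edit, after a file load, or on an explicit validation request, with its result merged into whatever the existing per-cell checks report.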

ddooley commented 1 year ago

Understood. A heads-up that early this spring we will have another programmer resource dedicated to DH work, so no worries if you have commitments elsewhere.

ddooley commented 8 months ago

Note that I have closed https://github.com/cidgoh/DataHarmonizer/issues/153, as it will be handled here.