cidgoh / DataHarmonizer

A standardized browser-based spreadsheet editor and validator that can be run offline and locally, and which includes templates for SARS-CoV-2 and Monkeypox sampling data. This project, created by the Centre for Infectious Disease Genomics and One Health (CIDGOH), at Simon Fraser University, is now an open-source collaboration with contributions from the National Microbiome Data Collaborative (NMDC), the LinkML development team, and others.
MIT License

enforce id_prefixes? #315

Open turbomam opened 2 years ago

turbomam commented 2 years ago

LinkML provides an id_prefixes attribute for slots. We could use this to require that a user enter the link between a biosample and the EMP500 project as the CURIE BIOPROJECT:PRJEB42019 by specifying

id_prefixes:
  - BIOPROJECT

ddooley commented 2 years ago

I think this could be enforced by a two-step validation: first, checking that the value of the "project" slot is a syntactically valid CURIE or URI; second, requiring that any CURIE use a prefix listed in the schema. But perhaps that is too strong a requirement? Are there other CURIEs that don't need to be validated to that extent?
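
To make the two-step idea concrete, here is a minimal sketch in Python. It is not DataHarmonizer's or LinkML's actual validation code: the function name `validate_project`, the `ALLOWED_PREFIXES` set, and both regular expressions are hypothetical, and the CURIE pattern is a simplification of the full CURIE grammar. It also assumes that full URIs pass without a prefix check, which is one possible answer to the "too strong?" question rather than a settled design.

```python
import re

# Hypothetical: the prefixes a schema would list under id_prefixes for the "project" slot.
ALLOWED_PREFIXES = {"BIOPROJECT"}

# Simplified CURIE pattern (prefix:reference, no whitespace); not the full CURIE grammar.
CURIE_RE = re.compile(r"^(?P<prefix>[A-Za-z_][A-Za-z0-9_.\-]*):(?P<reference>\S+)$")
# Simplified URI pattern: a scheme followed by "://" and a non-empty remainder.
URI_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*://\S+$")


def validate_project(value, allowed_prefixes=ALLOWED_PREFIXES):
    """Return a list of error messages; an empty list means the value passed."""
    value = value.strip()

    # Step 1: the value must be syntactically a URI or a CURIE.
    if URI_RE.match(value):
        # Assumption: full URIs are accepted without a prefix check in this sketch.
        return []
    match = CURIE_RE.match(value)
    if match is None:
        return [f"{value!r} is not a syntactically valid CURIE or URI"]

    # Step 2: the CURIE prefix must be one of the prefixes listed in the schema.
    prefix = match.group("prefix")
    if prefix not in allowed_prefixes:
        return [f"prefix {prefix!r} is not one of {sorted(allowed_prefixes)}"]
    return []


if __name__ == "__main__":
    for candidate in ["BIOPROJECT:PRJEB42019", "BIOPROJECT: PRJEB42019", "SRA:SRP000001"]:
        print(candidate, "->", validate_project(candidate) or "ok")
```

Keeping the two steps separate lets a schema opt out of the stricter prefix check for slots where any well-formed CURIE is acceptable, for example by passing a broader `allowed_prefixes` set or skipping step 2 entirely.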