cidgoh / DataHarmonizer

A standardized browser-based spreadsheet editor and validator that can be run offline and locally, and which includes templates for SARS-CoV-2 and Monkeypox sampling data. This project, created by the Centre for Infectious Disease Genomics and One Health (CIDGOH), at Simon Fraser University, is now an open-source collaboration with contributions from the National Microbiome Data Collaborative (NMDC), the LinkML development team, and others.
MIT License

CLI DataHarmonizer dh-validate.py script for validating tsv, csv, xls, and xlsx content files #443

Closed. ddooley closed this 2 months ago

ddooley commented 2 months ago

This takes care of all the validation issues that show up when validating DataHarmonizer-produced tabular data (in tsv, csv, xls, and xlsx formats) with the linkml-validate CLI. The strategy is to create a temporary .yaml output file containing the changes needed for linkml-validate to work well on it. Namely:

We haven't tested it on .json, .yaml, or .yml input, but that may work.
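
The temporary-file strategy described above can be illustrated with a minimal Python sketch. This is not the code in dh-validate.py. The function names `rows_to_temp_yaml` and `build_validate_command` are hypothetical, the sketch assumes the temporary file holds the data rows, the empty-cell cleanup is only an example of a "needed change", and the `--schema` and `--target-class` options are assumed from the linkml-validate help text and should be checked against the installed version.

```python
import csv
import json
import tempfile
from pathlib import Path


def rows_to_temp_yaml(table_path, delimiter="\t"):
    """Copy tabular rows into a temporary .yaml file and return its path.

    Empty cells are dropped so they are not validated as empty strings.
    This cleanup is illustrative only; it is an assumption, not a
    documented step of dh-validate.py.
    """
    with open(table_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter=delimiter)
        rows = [
            {
                key: value
                for key, value in row.items()
                if key is not None and value not in ("", None)
            }
            for row in reader
        ]
    # JSON is a subset of YAML 1.2, so json.dump avoids a third-party
    # YAML dependency while still producing a file YAML parsers accept.
    tmp = tempfile.NamedTemporaryFile(
        "w", suffix=".yaml", delete=False, encoding="utf-8"
    )
    with tmp:
        json.dump(rows, tmp, indent=2)
    return Path(tmp.name)


def build_validate_command(schema_path, target_class, data_path):
    """Build (but do not run) a linkml-validate command line.

    The option names are assumptions; confirm with `linkml-validate --help`.
    """
    return [
        "linkml-validate",
        "--schema", str(schema_path),
        "--target-class", target_class,
        str(data_path),
    ]
```

The returned list could be passed to `subprocess.run`, and the temporary file deleted afterwards. Only tsv and csv input is covered here (pass `delimiter=","` for csv); xls and xlsx would need a spreadsheet reader.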