cidgoh / DataHarmonizer

A standardized browser-based spreadsheet editor and validator that can be run offline and locally, and which includes templates for SARS-CoV-2 and Monkeypox sampling data. This project, created by the Centre for Infectious Disease Genomics and One Health (CIDGOH), at Simon Fraser University, is now an open-source collaboration with contributions from the National Microbiome Data Collaborative (NMDC), the LinkML development team, and others.
MIT License

CanCOGeN template: single null value in concatenated fields #271

Closed: griffie closed this issue 2 years ago

griffie commented 2 years ago

Currently, a few fields in the CanCOGeN template are concatenated into a single field in the LIMS export. If null values have been entered in the individual fields, all of them are concatenated into the single target field on export, e.g. "Not Provided; Not Provided; Not Provided; Not Provided".

Multiple null values complicate integration into the LIMS.

There are a few other fields where data providers have been entering multiple null values as well.

Can the DH check for repeated null values in all fields and replace them with a single null value upon validation/export, so that values like "Not Provided; Not Provided; Not Provided; Not Provided" just show up as "Not Provided"?

Such a conversion is already happening for the concatenated travel fields. Can we extend it to all fields as part of the export transformation?
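
To make the request concrete, here is a minimal sketch of the kind of collapse being asked for. It is written in Python for illustration only and is not DataHarmonizer's actual export code. The function names, the `NULL_VALUES` set, and the `;` delimiter handling are all assumptions; only "Not Provided" is taken from this issue. The sketch collapses a field only when every delimited part is the same null term, and leaves anything else untouched.

```python
# Hypothetical list of null terms; only "Not Provided" appears in this issue,
# the others are assumed for illustration.
NULL_VALUES = {"Not Provided", "Not Applicable", "Missing",
               "Not Collected", "Restricted Access"}


def collapse_repeated_nulls(value, null_values=NULL_VALUES, delimiter=";"):
    """Return a single null term if `value` is the same null term repeated.

    Any other value (real data, mixed values, a single term) is returned
    unchanged.
    """
    parts = [part.strip() for part in value.split(delimiter)]
    first = parts[0]
    if len(parts) > 1 and first in null_values and all(p == first for p in parts):
        return first
    return value


def transform_row(row, null_values=NULL_VALUES):
    """Apply the collapse to every string field in an export row (a dict)."""
    return {
        field: collapse_repeated_nulls(v, null_values) if isinstance(v, str) else v
        for field, v in row.items()
    }


# Example usage with made-up field names.
row = {
    "symptoms": "Not Provided; Not Provided; Not Provided; Not Provided",
    "travel": "Canada; Not Provided",
}
print(transform_row(row))
```

Mixed values such as "Canada; Not Provided" are deliberately left alone in this sketch, since the issue only asks about fields that consist entirely of repeated null values.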

ddooley commented 2 years ago

This is now implemented in the master branch. After you verify its behaviour, we can include it in the next release!