cidgoh / DataHarmonizer

A standardized browser-based spreadsheet editor and validator that can be run offline and locally, and which includes templates for SARS-CoV-2 and Monkeypox sampling data. This project, created by the Centre for Infectious Disease Genomics and One Health (CIDGOH), at Simon Fraser University, is now an open-source collaboration with contributions from the National Microbiome Data Collaborative (NMDC), the LinkML development team, and others.
MIT License

CanCOGeN template: all null values should be lower case with the first letter of each word capitalized #230

Closed: griffie closed this issue 2 years ago

griffie commented 2 years ago

Currently, in the NML LIMS export, all the null values match those offered by the DH, which are in title case (the first letter of each word capitalized, the rest lower case). This is what LIMS needs.

However, some users enter null values in the free-text fields. When they do this programmatically, those null values are sometimes all caps.

Can we ensure that any null value in any field has the proper capitalization? For example, "NOT PROVIDED", "not provided", or "Not provided" should always become "Not Provided".

Thanks!

ddooley commented 2 years ago

So each free-text field needs to be searched for any of the standard null values, and have title case applied to them? That is somewhat computationally intensive. Can we at least assume that the entire value of such a field will be a null value? That would speed up the comparison.
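To illustrate the whole-value comparison suggested here, a minimal sketch in Python follows. It is not the DataHarmonizer implementation. The `NULL_VALUES` list and the `normalize_null` helper are hypothetical names chosen for illustration, only "Not Provided" is taken from this thread, and trimming surrounding whitespace before comparing is an added assumption.

```python
# Hypothetical list of standard null values in their canonical capitalization.
# Only "Not Provided" appears in this thread; the others are assumed examples.
NULL_VALUES = [
    "Not Applicable",
    "Not Collected",
    "Not Provided",
    "Missing",
    "Restricted Access",
]

# Map each lowercased null value to its canonical form for a single dict lookup.
_CANONICAL = {value.lower(): value for value in NULL_VALUES}


def normalize_null(value: str) -> str:
    """Return the canonical null value if the ENTIRE field is a null value.

    Any other text is returned unchanged, so a null phrase embedded in a
    longer free-text entry is deliberately left alone.
    """
    key = value.strip().lower()  # assumption: ignore surrounding whitespace
    return _CANONICAL.get(key, value)


if __name__ == "__main__":
    for raw in ["NOT PROVIDED", "not provided", "Not provided", "swab not provided"]:
        print(repr(raw), "->", repr(normalize_null(raw)))
```

Because the whole field is compared at once, each cell costs one dictionary lookup rather than a substring search for every null value.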

griffie commented 2 years ago

Sure. Let's do that.

ddooley commented 2 years ago

I've implemented this just for NML LIMS, in the "vocabulary update" branch.