cidgoh / DataHarmonizer

A standardized browser-based spreadsheet editor and validator that can be run offline and locally, and which includes templates for SARS-CoV-2 and Monkeypox sampling data. This project, created by the Centre for Infectious Disease Genomics and One Health (CIDGOH), at Simon Fraser University, is now an open-source collaboration with contributions from the National Microbiome Data Collaborative (NMDC), the LinkML development team, and others.
MIT License

CanCOGeN template: remove null values concatenated in PH_TRAVEL in LIMS export #242

Closed (griffie closed this issue 2 years ago)

griffie commented 2 years ago

Currently travel information is concatenated into PH_TRAVEL in the NML-LIMS export.

Often, most of the concatenated fields are populated with null values. Sometimes there are multiple null values and only one or two nuggets of actual travel info, e.g. Not Provided;Not Provided;Not Provided;Not Provided;Not Provided;HALIFAX

If there are multiple null values in PH_TRAVEL, can we remove them so that only the non-null values remain?

If there are only null values, can we remove all but one of them to indicate the field was addressed, but no more?

Thanks!
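The two rules requested above (drop null values when real travel info is present; keep a single null when nothing else is there) can be sketched roughly as follows. This is only an illustration in Python, not the DataHarmonizer export code: the function name `collapse_travel` and the `NULL_VALUES` set are hypothetical, and the set only contains the two null markers that appear in this thread ("Not Provided" and "Missing"); the real template may define others.

```python
# Hypothetical set of null markers; only these two appear in this thread.
NULL_VALUES = frozenset({"Not Provided", "Missing"})


def collapse_travel(value, null_values=NULL_VALUES, sep=";"):
    """Collapse a semicolon-delimited PH_TRAVEL string.

    - If any non-null entries exist, return only those, in order.
    - If every entry is a null marker, return the first one only,
      so the field still shows that it was addressed.
    """
    # Split on the delimiter, trim whitespace, and drop empty slots.
    parts = [p.strip() for p in value.split(sep)]
    parts = [p for p in parts if p]

    # Entries that carry actual travel information.
    real = [p for p in parts if p not in null_values]
    if real:
        return sep.join(real)

    # Only null markers (or nothing at all) remain.
    return parts[0] if parts else ""


if __name__ == "__main__":
    example = "Not Provided;Not Provided;Not Provided;Not Provided;Not Provided;HALIFAX"
    print(collapse_travel(example))
    print(collapse_travel("Not Provided;Not Provided;Not Provided"))
```

With the example from this issue, the first call should yield just HALIFAX, and the all-null case should yield a single Not Provided.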

ddooley commented 2 years ago

I had a vague memory that NML was actually parsing the semicolon-delimited list back into multiple fields. If so, then at least the semicolon placeholders have to be kept, though they could be empty values, e.g. Missing;;;;; But if they aren't demultiplexing it back into city, country, etc. fields, then yes, we can collapse the list into a minimal one.
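For comparison, the placeholder-preserving alternative (blank out the null values but keep every semicolon so the positions still line up) might look something like the sketch below. Again this is illustrative Python rather than project code: `blank_nulls_keep_positions` and `NULL_VALUES` are hypothetical names, and the choice to keep the first null marker in the first slot when everything is null is an assumption based on the Missing;;;;; example.

```python
# Hypothetical set of null markers; only these two appear in this thread.
NULL_VALUES = frozenset({"Not Provided", "Missing"})


def blank_nulls_keep_positions(value, null_values=NULL_VALUES, sep=";"):
    """Blank out null markers but keep the number of delimited slots."""
    parts = [p.strip() for p in value.split(sep)]
    nulls = [p for p in parts if p in null_values]

    # Assumption: if every slot is null or empty, keep one null marker
    # in the first slot and leave the remaining slots empty.
    if nulls and all(p in null_values or p == "" for p in parts):
        return sep.join([nulls[0]] + [""] * (len(parts) - 1))

    # Otherwise blank only the null slots; real values stay in position.
    return sep.join("" if p in null_values else p for p in parts)


if __name__ == "__main__":
    print(blank_nulls_keep_positions("Missing;Missing;Missing;Missing;Missing;Missing"))
    print(blank_nulls_keep_positions(
        "Not Provided;Not Provided;Not Provided;Not Provided;Not Provided;HALIFAX"))
```

This variant only matters if a downstream consumer splits the string back into separate fields by position.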

griffie commented 2 years ago

They are not separating into different fields. What you are remembering may be a relic of CNPHI (which is different than LIMS).

On the LIMS end, everything is just going into PH_TRAVEL.

ddooley commented 2 years ago

Ok, done!