cidgoh / DataHarmonizer

A standardized browser-based spreadsheet editor and validator that can be run offline and locally, and which includes templates for SARS-CoV-2 and Monkeypox sampling data. This project, created by the Centre for Infectious Disease Genomics and One Health (CIDGOH), at Simon Fraser University, is now an open-source collaboration with contributions from the National Microbiome Data Collaborative (NMDC), the LinkML development team, and others.
MIT License

VirusSeq export from CanCOGeN template: automate addition of study_ID #231

Closed: griffie closed this issue 2 years ago

griffie commented 2 years ago

In the VirusSeq export from the CanCOGeN template, a field called study_ID is added automatically. The study_ID is a user-defined ID used to group datasets from different submitters (currently provinces, possibly others in the future).

The study_ID for each province is unique.

To save the curator from manually adding the study_ID to every record submitted to the portal (sometimes thousands of records), can we automate the addition of the study_ID according to the lab name?

We have the list of study_IDs that correspond to each provincial lab. For example, the study_ID for the BCCDC is "BCCDC-BC". The list of study_IDs and corresponding labs is given in this Google spreadsheet: https://docs.google.com/spreadsheets/d/1L6ebnKLYQklMAaH90vQcDBLtAmGMeBMcYCEqhynxJhI/edit?usp=sharing

ddooley commented 2 years ago

So that list is complete now? The lookup would be easy enough to add, but would have to be hardcoded into export.js for now.
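
For illustration only, a hardcoded lookup of this kind might look roughly like the sketch below. This is not the code that ended up in export.js. The names `STUDY_ID_BY_LAB`, `lookupStudyId`, `addStudyIds`, and the `lab_name` column are all hypothetical, and the only mapping taken from this thread is BCCDC to "BCCDC-BC". The other labs would have to be filled in from the spreadsheet linked above.

```javascript
// Hypothetical lookup table. Only the BCCDC entry comes from this thread;
// the remaining labs would be copied in from the shared spreadsheet.
const STUDY_ID_BY_LAB = new Map([
  ['BCCDC', 'BCCDC-BC'],
]);

// Returns the study_ID for a lab name, or an empty string when the lab is
// not in the table, so the curator can spot and fill the gap manually.
function lookupStudyId(labName) {
  const key = (labName || '').trim();
  return STUDY_ID_BY_LAB.get(key) || '';
}

// Returns copies of the rows with a study_ID field filled in.
// `labField` is the name of the column holding the lab (an assumption here).
function addStudyIds(rows, labField) {
  return rows.map((row) => ({
    ...row,
    study_ID: lookupStudyId(row[labField]),
  }));
}
```

With this sketch, `addStudyIds([{ lab_name: 'BCCDC' }], 'lab_name')` would give `[{ lab_name: 'BCCDC', study_ID: 'BCCDC-BC' }]`. A row whose lab is not in the table gets an empty study_ID rather than a guessed one.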

ddooley commented 2 years ago

Field autofill function added to the "vocabulary update" branch.