cidgoh / DataHarmonizer

A standardized browser-based spreadsheet editor and validator that can be run offline and locally, and which includes templates for SARS-CoV-2 and Monkeypox sampling data. This project, created by the Centre for Infectious Disease Genomics and One Health (CIDGOH), at Simon Fraser University, is now an open-source collaboration with contributions from the National Microbiome Data Collaborative (NMDC), the LinkML development team, and others.
MIT License

Add method for getting data as array of objects #343

Closed pkalita-lbl closed 1 year ago

pkalita-lbl commented 1 year ago

There are three main things happening here:

  1. I added a getDataObjects method to the DataHarmonizer class which returns the table data as an array of objects. Each object represents a row. The keys are field names and the values are the cell data for the given field in the row. Most of the real work is done by dataArrayToObject in lib/utils/fields.js.
  2. Since getDataObjects is designed to work with JSON serialization, I thought it would be good to parse the strings returned by hot.getData() into native types based on the schema. That parsing work was already being done in getInvalidCells, so I extracted it into a new module (lib/utils/datatypes.js) so that it can be shared with getDataObjects.
  3. I've added unit testing via Jest to ensure that the new utils functions work correctly. These tests can be run locally with yarn test and will automatically run on pushes to GitHub. Eventually I'd like to expand this to integration-level testing (i.e. testing the DataHarmonizer class itself), but that will involve a bit more setup and this is a good first step.
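
To illustrate the idea behind points 1 and 2, here is a minimal, self-contained sketch of turning rows of cell strings into objects keyed by field name, parsing each cell according to a declared datatype. It is not the code in this PR: the names parseCell, rowToObject and rowsToObjects, the shape of the field descriptors, and the datatype labels 'integer' and 'decimal' are all assumptions made for illustration, and the real dataArrayToObject and lib/utils/datatypes.js may behave differently (for example in how empty or unparseable cells are handled).

```javascript
// Hypothetical field descriptors; the real schema fields carry more metadata.
const fields = [
  { name: 'sample_id', datatype: 'string' },
  { name: 'read_count', datatype: 'integer' },
  { name: 'ph', datatype: 'decimal' },
];

// Parse one cell string into a native value based on its declared datatype.
// Assumption: empty cells yield undefined, and values that do not parse
// as the declared numeric type are passed through unchanged as strings.
function parseCell(value, datatype) {
  if (value === null || value === undefined) return undefined;
  const text = String(value).trim();
  if (text === '') return undefined;
  switch (datatype) {
    case 'integer':
      return /^[-+]?\d+$/.test(text) ? parseInt(text, 10) : text;
    case 'decimal': {
      const n = Number(text);
      return Number.isNaN(n) ? text : n;
    }
    default:
      return text;
  }
}

// Convert a single row (array of cell strings) into an object keyed by
// field name. Empty cells are omitted so the JSON output stays compact.
function rowToObject(row, fields) {
  const obj = {};
  fields.forEach((field, i) => {
    const parsed = parseCell(row[i], field.datatype);
    if (parsed !== undefined) obj[field.name] = parsed;
  });
  return obj;
}

// Convert the whole table, one object per row.
function rowsToObjects(rows, fields) {
  return rows.map((row) => rowToObject(row, fields));
}

// Example input shaped like a 2D array of cell strings.
const rows = [
  ['S1', '42', '7.4'],
  ['S2', '', 'n/a'],
];
console.log(JSON.stringify(rowsToObjects(rows, fields), null, 2));
```

Keeping the cell parsing in its own function mirrors the motivation in point 2: the same routine can serve both validation and serialization, and small pure functions like these are straightforward to cover with Jest unit tests.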

Fixes #330