TOMToolkit / tom_base

The base Django project for a Target and Observation Manager
https://tom-toolkit.readthedocs.io
GNU General Public License v3.0

Move Data Processors to be internal to Data Services #1095

Open jchate6 opened 3 weeks ago

jchate6 commented 3 weeks ago

In the new DataServices regime, especially now that a schema exists for ReducedDatums, the way we handle DataProducts and DataProcessors needs to be refactored.

Current relationship between ReducedDatums, DataProducts, and Data Processors:

We need some way to translate data from a given source (currently often ingested as a DataProduct) from its ingested format into individual ReducedDatums that can be stored, displayed, and analyzed by the TOM. The current process for this is inconsistent, but one scenario is as follows:

Proposed refactor:

We propose that in v3.0 we introduce a DEFAULT Data Schema that can serve as an intermediate data state between Data Services and user-defined Reduced Datums. Fundamentally, the idea is that a DataService would contain all of the data processing functions required to convert a query output or DataProduct upload into the Default Schema. User-defined processors could then perform any reductions or analysis required and produce the final Reduced Datum, potentially with a different schema.
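
To make the two-stage flow concrete, here is a rough sketch in plain Python. It is not TOM Toolkit code: `DefaultDatum`, `ExampleDataService`, `to_default_schema`, and `user_processor` are all hypothetical names, the field layout of the default schema is an assumption, and the input rows are made-up values. It only illustrates the split between a service-owned conversion step and a user-defined reduction step.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class DefaultDatum:
    """Hypothetical record in the proposed default (intermediate) schema."""
    target_name: str
    data_type: str
    timestamp: float          # assumed here to be an MJD
    value: dict[str, Any]
    source_name: str


class ExampleDataService:
    """Hypothetical data service that owns its own processing step."""
    name = "example_service"

    def to_default_schema(self, rows: list[dict[str, Any]]) -> list[DefaultDatum]:
        # Stage 1: the service converts its own raw output into the default schema.
        return [
            DefaultDatum(
                target_name=row["obj"],
                data_type="photometry",
                timestamp=row["mjd"],
                value={"magnitude": row["mag"], "error": row["magerr"]},
                source_name=self.name,
            )
            for row in rows
        ]


def user_processor(datums: list[DefaultDatum], max_error: float = 0.5) -> list[dict[str, Any]]:
    # Stage 2: a user-defined step that filters the intermediate records and
    # emits final records in a different, user-chosen layout.
    reduced = []
    for datum in datums:
        if datum.value["error"] > max_error:
            continue  # drop points with large uncertainties
        reduced.append({
            "target": datum.target_name,
            "data_type": datum.data_type,
            "mjd": datum.timestamp,
            "magnitude": datum.value["magnitude"],
            "source": datum.source_name,
        })
    return reduced


# Made-up raw rows standing in for a query result or an uploaded DataProduct.
raw_rows = [
    {"obj": "example-target", "mjd": 60400.5, "mag": 17.2, "magerr": 0.05},
    {"obj": "example-target", "mjd": 60401.5, "mag": 17.9, "magerr": 0.80},
]

service = ExampleDataService()
intermediate = service.to_default_schema(raw_rows)
reduced = user_processor(intermediate)
```

In this sketch the user-defined step never sees the service's raw format, so it would work unchanged against any service that emits the same intermediate records.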