In the new DataServices regime, especially now that ReducedDatums have schemas, the way we handle DataProducts and DataProcessors needs to be refactored.
Current relationship between ReducedDatums, DataProducts, and Data Processors:
We need some way to translate data from a given source (currently often ingested as a DataProduct) out of its ingested format and into individual Reduced Datums that can be stored, displayed, and analyzed by the TOM. The current process for this is inconsistent, but one scenario is as follows:
A DataService query returns data
A DataProduct is created with a TYPE specific to the Data Service/query type
A data processor (specific to the DataProduct TYPE) translates this data into the format expected by the TOM
Users can override this processor
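The scenario above can be sketched roughly as a lookup from DataProduct TYPE to processor, with a user override replacing the default entry. This is only an illustration under stated assumptions: `PROCESSORS`, `process`, `default_photometry_processor`, `user_processor`, and the "time,magnitude" input format are all hypothetical names chosen for the sketch, not the actual TOM Toolkit API.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a processor: raw text in, datum-like dicts out.
Processor = Callable[[str], List[dict]]


def default_photometry_processor(raw: str) -> List[dict]:
    """Parse 'time,magnitude' rows (an assumed format) into datum-like dicts."""
    datums = []
    for line in raw.strip().splitlines():
        time, magnitude = line.split(",")
        datums.append({
            "data_type": "photometry",
            "timestamp": time.strip(),
            "value": {"magnitude": float(magnitude)},
        })
    return datums


# Hypothetical registry: one processor per DataProduct TYPE.
PROCESSORS: Dict[str, Processor] = {"photometry": default_photometry_processor}


def process(product_type: str, raw: str) -> List[dict]:
    """Look up the processor for this DataProduct TYPE and run it."""
    try:
        processor = PROCESSORS[product_type]
    except KeyError:
        raise ValueError(f"No processor registered for type {product_type!r}")
    return processor(raw)


def user_processor(raw: str) -> List[dict]:
    """Example user override: reuse the default parser and tag each datum."""
    return [dict(d, source="user") for d in default_photometry_processor(raw)]
```

In this sketch a user overrides the processor by assigning `PROCESSORS["photometry"] = user_processor`. Note that the override has to parse the source-specific format itself (here by delegating to the default), which is part of what the refactor aims to separate.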
Proposed refactor:
We propose that v3.0 introduce a DEFAULT Data Schema that serves as an intermediate data state between Data Services and user-defined Reduced Datums. Fundamentally, the idea is that a DataService would contain all of the data processing functions required to convert a query output or DataProduct upload into the Default Schema. User-defined processors could then perform any reductions or analysis required and produce the final Reduced Datum, potentially with a different schema.
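A minimal sketch of the proposed two-stage flow, assuming a photometry-like Default Schema. Every name here (`DefaultPhotometry`, `ExampleDataService`, `to_default_schema`, `ingest`, the row keys such as `mjd_iso` and `band`) is hypothetical and chosen for illustration; the actual schema fields and interfaces are still to be decided.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class DefaultPhotometry:
    """Hypothetical Default Schema record (field names are assumptions)."""
    timestamp: str
    magnitude: float
    error: float
    filter: str


class ExampleDataService:
    """Hypothetical Data Service that owns conversion into the Default Schema."""

    def to_default_schema(self, query_output: List[dict]) -> List[DefaultPhotometry]:
        # Source-specific keys (assumed here) are mapped onto the shared schema.
        return [
            DefaultPhotometry(
                timestamp=row["mjd_iso"],
                magnitude=float(row["mag"]),
                error=float(row["mag_err"]),
                filter=row["band"],
            )
            for row in query_output
        ]


UserProcessor = Callable[[List[DefaultPhotometry]], List[dict]]


def default_user_processor(records: List[DefaultPhotometry]) -> List[dict]:
    """Pass-through: one Reduced Datum-like dict per Default Schema record."""
    return [
        {
            "data_type": "photometry",
            "timestamp": r.timestamp,
            "value": {"magnitude": r.magnitude, "error": r.error, "filter": r.filter},
        }
        for r in records
    ]


def bright_only_processor(records: List[DefaultPhotometry]) -> List[dict]:
    """Example user processor: keep points brighter than magnitude 18 and
    emit a different final schema (no filter field)."""
    return [
        {"data_type": "photometry", "timestamp": r.timestamp,
         "value": {"magnitude": r.magnitude}}
        for r in records
        if r.magnitude < 18.0
    ]


def ingest(service: ExampleDataService, query_output: List[dict],
           processor: UserProcessor = default_user_processor) -> List[dict]:
    """Stage 1: service converts to Default Schema. Stage 2: user processor."""
    return processor(service.to_default_schema(query_output))
```

The design point this illustrates is that user processors only ever see Default Schema records, so they need no knowledge of any individual Data Service's raw format.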