Is your feature request related to a problem? Please describe.
Yes. Currently the run_data_processor function in the tom_dataproducts module does not support dynamic selection of data processors based on a user-supplied data_product_type. The data type is tied to the data product. This limits the flexibility our project needs, since it requires different processing strategies for different types of data products, and those decisions must be made dynamically from a user interface.
Describe the solution you'd like
I propose extending the run_data_processor function with an optional parameter that overrides the data product type, so that a specific processor can be selected at runtime. It would be a small PR with a couple of changes in the function to make sure the correct data_type is referenced, and users could still call processor.data_type_override(). Our project needs the override to be applied before the processor is selected, so the existing processor.data_type_override() option does not cover our case.
def run_data_processor(dp, dp_type_override=None):
    """
    Reads the `data_product_type` from the dp parameter or the override and imports the
    corresponding `DATA_PROCESSORS` specified in
    `settings.py`, then runs `process_data` and inserts the returned values into the
    database.

    :param dp: DataProduct which will be processed into a list
    :type dp: DataProduct
    :param dp_type_override: Optional. DataProduct type to override with. If None, the type from the `dp` object is used.
    :type dp_type_override: str, optional
    :returns: QuerySet of `ReducedDatum` objects created by the `run_data_processor` call
    :rtype: `QuerySet` of `ReducedDatum`
    """
Additional context
This feature was developed and tested in our local project setup where it proved to be crucial for allowing front-end users to select different processing strategies for different datasets.
If the TOMToolkit team is open to this, I can have a PR submitted for review immediately.