force-h2020 / force-bdss

Business Decision System general interface
BSD 2-Clause "Simplified" License

The DataSources gradients implementation and gradient propagation #233

Open Corwinpro opened 5 years ago

Corwinpro commented 5 years ago

This issue is related to #184 , #200 .

The optimization interface lacks a rigorous specification of how a DataSource's run parameters and the gradient it returns should be related.

My first suggestions are:

class GradientDataSource(DataSource):
    pass

The specifications for this class are below.

Immutable = Tuple[ModelDataValue]
Objective = DataValue(type="OBJECTIVE_TYPE")
ParameterVector = List[DataValue(type="PARAMETER_TYPE")]
ParameterGradient = List[
    DataValue(
        type="OBJECTIVE_TYPE / PARAMETER_TYPE"
    )
]

class GradientDataSource:
    def run(
        self,
        model: Immutable,
        parameters: ParameterVector,
    ) -> Tuple[Objective, ParameterGradient]:
        ...

Here Objective is a scalar value, and ParameterVector and ParameterGradient have the same length. To support multiple objectives, Objective generalizes to a vector:

Objective = List[DataValue(type="OBJECTIVE_TYPE")]

and, therefore, ParameterGradient becomes a Jacobian, with one gradient row per objective:

ParameterGradient = List[
    List[
        DataValue(
            type="OBJECTIVE_TYPE / PARAMETER_TYPE"
        )
    ]
]
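A minimal runnable sketch of the proposed interface, using plain Python floats in place of DataValue objects (the class name and the quadratic objective below are illustrative, not from the codebase):

```python
from typing import List, Tuple


class QuadraticGradientDataSource:
    """Toy gradient source: objective f(p) = sum(p_i ** 2),
    with analytic gradient df/dp_i = 2 * p_i."""

    def run(
        self, model, parameters: List[float]
    ) -> Tuple[List[float], List[List[float]]]:
        objective = sum(p ** 2 for p in parameters)
        # Single objective, so the Jacobian has one row of
        # len(parameters) entries, as in the spec above.
        gradient = [[2.0 * p for p in parameters]]
        return [objective], gradient
```

For example, `QuadraticGradientDataSource().run(None, [1.0, 2.0])` returns the objective `[5.0]` together with the gradient row `[[2.0, 4.0]]`, so the caller can check that each gradient row matches the parameter vector in length.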

Working with KPIs and weighted objectives

I expect that in some situations a linear combination of objectives is used instead of the "raw" objectives data. For instance, we might weight the production_cost against the production_time. I propose to implement this weighting procedure as another GradientDataSource.
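One way to sketch this (the class name and fixed-weight scheme are assumptions, not an existing API): because the weighted objective is linear in the raw objectives, its gradient is the same weighted sum of the per-objective gradient rows.

```python
from typing import List, Tuple


class WeightedObjectiveGradientDataSource:
    """Collapse several raw objectives into one scalar via fixed weights.

    By linearity, the gradient of the weighted objective is the
    weighted sum of the rows of the incoming Jacobian."""

    def __init__(self, weights: List[float]):
        self.weights = weights

    def run(
        self,
        model,
        objectives: List[float],
        jacobian: List[List[float]],
    ) -> Tuple[List[float], List[List[float]]]:
        weighted = sum(w * f for w, f in zip(self.weights, objectives))
        n_params = len(jacobian[0])
        weighted_grad = [
            sum(w * row[j] for w, row in zip(self.weights, jacobian))
            for j in range(n_params)
        ]
        # The result has the same (Objective, ParameterGradient)
        # shape as any other GradientDataSource, so it composes.
        return [weighted], [weighted_grad]
```

Keeping the output in the same `(Objective, ParameterGradient)` shape means the weighting stage is just another node in the workflow rather than a special case.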

Gradient propagation

Backpropagation. Computing the gradient of the objective(s) on the very last layer with respect to the primary parameters in this way is robust and does not depend on the structure of the workflow.
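With the list-of-lists Jacobian convention above, propagating the gradient through composed layers reduces to the chain rule, i.e. repeated Jacobian products. A sketch (the helper name is hypothetical):

```python
from typing import List

Matrix = List[List[float]]


def chain_jacobians(j_outer: Matrix, j_inner: Matrix) -> Matrix:
    """Chain rule for composed layers y = g(x), z = h(y):
    dz/dx = (dz/dy) @ (dy/dx), as a plain matrix product."""
    n_rows = len(j_outer)
    n_mid = len(j_inner)
    n_cols = len(j_inner[0])
    return [
        [
            sum(j_outer[i][k] * j_inner[k][j] for k in range(n_mid))
            for j in range(n_cols)
        ]
        for i in range(n_rows)
    ]
```

Starting from the last layer and folding each layer's Jacobian in turn yields the gradient of the final objective(s) with respect to the primary parameters, regardless of how many DataSources sit in between.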

Advantages

Related issues and pull requests

https://github.com/force-h2020/force-bdss-plugin-itwm-example/pull/48

flongford commented 5 years ago

These GradientDataSource classes would be extremely useful for DataSources in which mathematical operations are performed. In that case we can either numerically estimate or analytically obtain the gradient of each output slot with respect to the input slot variables (the parameters).

The workflow then becomes a network-like construct that we could model in order to produce better estimates of the MCO search direction.

However, difficulties will arise for DataSources that perform higher-level software tasks, such as processing and packaging objects into containers to pass along the workflow. In that case it becomes more appropriate to simply use the MCO parameters and KPIs to construct any model of the workflow.

Corwinpro commented 5 years ago

A probably good first step before implementing this: add @cached to the DataSource.run methods, so that (potentially expensive) calculations are performed only once.
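A minimal sketch of this idea, standing in for the proposed @cached with the standard-library functools.lru_cache (the class and counter are illustrative; note the parameters must be hashable, e.g. a tuple rather than a list):

```python
import functools


class ExpensiveDataSource:
    """Memoise run() so repeated evaluations at the same parameters
    are computed only once."""

    def __init__(self):
        self.n_evaluations = 0  # counts actual (non-cached) runs

    @functools.lru_cache(maxsize=None)
    def run(self, parameters):
        # Stands in for an expensive simulation or solver call.
        self.n_evaluations += 1
        return sum(p ** 2 for p in parameters)
```

Calling `run` twice with the same parameter tuple performs the computation once and serves the second call from the cache, which matters when both the objective evaluation and the gradient propagation revisit the same points.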