Corwinpro closed this pull request 5 years ago.
I should not have put these two changes in one commit, as they fix different problems.
Merging #48 into master will increase coverage by 1.64%. The diff coverage is 95%.
@@ Coverage Diff @@
## master #48 +/- ##
==========================================
+ Coverage 78.09% 79.74% +1.64%
==========================================
Files 28 29 +1
Lines 557 617 +60
Branches 30 40 +10
==========================================
+ Hits 435 492 +57
- Misses 116 117 +1
- Partials 6 8 +2
Impacted Files | Coverage Δ
---|---
...t_tools/gradient_consistency/taylor_convergence.py | 95% <95%> (ø)
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Regarding the gradients, we have an open issue in the BDSS: force-h2020/force-bdss/issues/184. This seems like a very good place to start thinking about how to propagate them through the MCO. It is likely that some additional changes to the force_bdss code base will be needed to get a working implementation, so we will probably need several PRs to do so. (I will move this comment to the force-bdss repo issue if necessary.)
I have the following general feeling regarding an optimization interface. run (in the case of DataSources) should have the following interface for a scalar-valued objective:

Immutable = Tuple[ModelDataValue]
Objective = DataValue(type="OBJECTIVE_TYPE")
ParameterVector = List[DataValue(type="PARAMETER_TYPE")]
ParameterGradient = List[
    DataValue(
        type="OBJECTIVE_TYPE / PARAMETER_TYPE"
    )
]

class DataSource:
    def run(
        self,
        model: Immutable,
        parameters: ParameterVector,
    ) -> Tuple[Objective, ParameterGradient]:
        ...
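As a concrete illustration, here is a minimal, self-contained sketch of a data source that satisfies this contract, assuming plain Python floats stand in for DataValue objects; the class name QuadraticDataSource and the quadratic objective are invented for the example and are not part of force_bdss:

from typing import List, Tuple


class QuadraticDataSource:
    """Toy data source with objective f(p) = sum(p_i ** 2)."""

    def run(
        self, model: Tuple, parameters: List[float]
    ) -> Tuple[float, List[float]]:
        # Scalar objective value.
        objective = sum(p ** 2 for p in parameters)
        # Gradient has the same length as `parameters`: one entry per input.
        gradient = [2.0 * p for p in parameters]
        return objective, gradient


source = QuadraticDataSource()
value, grad = source.run(model=(), parameters=[1.0, -2.0, 0.5])
# value == 5.25, grad == [2.0, -4.0, 1.0]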
Here Objective is a scalar value, and ParameterVector and ParameterGradient are of the same length. The type of the ParameterGradient is defined by the types of the chosen Objective and the ParameterVector.
For multiple objectives, the type of Objective should be:

Objective = List[DataValue(type="OBJECTIVE_TYPE")]

and, therefore, the ParameterGradient changes to:
ParameterGradient = List[
    List[
        DataValue(
            type="OBJECTIVE_TYPE / PARAMETER_TYPE"
        )
    ]
]
Objective and ParameterVector are containers: the output ParameterGradient should have the same .shape, with the type defined by the objective and the inputs.

When KPIs are defined as a combination of several objectives, for instance:
simple_kpi = linear_kpi_mapping(raw_objectives) # a linear function
fancy_kpi = nonlinear_kpi_mapping(raw_objectives) # a non-linear function
the corresponding gradient is simply:
simple_kpi_gradient = linear_kpi_mapping(raw_gradients) # a linear function
fancy_kpi_gradient = nonlinear_kpi_mapping(raw_objectives, raw_gradients) # a non-linear function
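To make the chain rule behind this explicit, here is a small NumPy sketch; the weight vector, the exponential KPI and all variable names are invented for illustration and do not come from the code base:

import numpy as np

# Hypothetical setup: two raw objectives depending on three parameters.
raw_objectives = np.array([3.0, 1.0])            # shape (2,)
raw_gradients = np.array([[1.0, 0.0, 2.0],       # d(objective_0)/d(parameters)
                          [0.5, 1.0, 0.0]])      # d(objective_1)/d(parameters)

# Linear KPI: kpi = w @ objectives. Its gradient only needs the raw gradients,
# because the same linear mapping applies to them.
w = np.array([2.0, -1.0])
simple_kpi = w @ raw_objectives                  # scalar
simple_kpi_gradient = w @ raw_gradients          # shape (3,)

# Non-linear KPI: kpi = exp(objective_0) * objective_1. Its gradient needs
# both the raw objectives and the raw gradients (chain rule).
fancy_kpi = np.exp(raw_objectives[0]) * raw_objectives[1]
fancy_kpi_gradient = (
    np.exp(raw_objectives[0]) * raw_objectives[1] * raw_gradients[0]
    + np.exp(raw_objectives[0]) * raw_gradients[1]
)                                                # shape (3,)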
Moreover, it is clear that both the linear_kpi_mapping and the nonlinear_kpi_mapping can also be implemented as DataSources: these are just their run methods.
The benefits of this interface:
- It is consistent from a formal point of view: we always know how changes in the parameter values affect the Objective value (the gradient information).
- It is possible to apply extra layers of almost zero-cost operations when a combination of raw objectives is required, e.g. to calculate the KPIs.
- It is easy to test gradient consistency with Taylor tests.
- It is easy to test physical type consistency (no sums of lb with inches).
@flongford Thoughts?
@Corwinpro - This comment has some very good ideas that touch upon a lot of issues; it would be an excellent idea to reference it as a new issue for the force-bdss repo, where we can discuss them further.
This pull request implements the so-called "Taylor test", which verifies the consistency between a function that returns a scalar value ("an objective") and a gradient function, which presumably implements the Jacobian correctly.

The new class TaylorTest accepts an unbound method for the function, an unbound method for the gradient, and the input dimension. The function output must be a scalar. I guess that the only public method should be run_taylor_test, which returns a list of slopes corresponding to each input parameter. The slope is a power fit of the Taylor residuals versus the perturbation amplitude. It must be greater than 2; otherwise the gradient implementation is wrong, i.e., it doesn't match the provided function.

Related issues:
Bug in gradient #45
To do:
- Handle the case when the perturbation amplitude (step_size) is too small and the problem arises due to machine precision errors.
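For reference, here is a minimal, self-contained sketch of the idea behind such a test; the function taylor_slope, its arguments, and the least-squares fit are illustrative only and are not the TaylorTest implementation added in this PR:

import numpy as np

def taylor_slope(func, grad, x, direction, step_sizes=(1e-1, 1e-2, 1e-3, 1e-4)):
    """Estimate the convergence rate of the first-order Taylor residual.

    residual(h) = |f(x + h*d) - f(x) - h * d . grad(x)| should behave like
    O(h**2) when grad is consistent with func, so the fitted slope in
    log-log space is expected to be close to 2.
    """
    x = np.asarray(x, dtype=float)
    d = np.asarray(direction, dtype=float)
    residuals = [
        abs(func(x + h * d) - func(x) - h * d.dot(grad(x)))
        for h in step_sizes
    ]
    # Power fit: slope of log(residual) versus log(step size).
    slope, _ = np.polyfit(np.log(step_sizes), np.log(residuals), 1)
    return slope

# A consistent gradient gives a slope of about 2; an inconsistent one about 1.
f = lambda x: float(np.sum(x ** 2))
df = lambda x: 2.0 * x
print(taylor_slope(f, df, x=[1.0, 2.0], direction=[1.0, 1.0]))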