Corwinpro closed this pull request 5 years ago.
I should not have put these two changes in one commit, as they fix different problems.
Merging #48 into master will increase coverage by 1.64%. The diff coverage is 95%.
@@ Coverage Diff @@
## master #48 +/- ##
==========================================
+ Coverage 78.09% 79.74% +1.64%
==========================================
Files 28 29 +1
Lines 557 617 +60
Branches 30 40 +10
==========================================
+ Hits 435 492 +57
- Misses 116 117 +1
- Partials 6 8 +2
Impacted Files | Coverage Δ
---|---
...t_tools/gradient_consistency/taylor_convergence.py | 95% <95%> (ø)
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Regarding the gradients, we have an open issue in the BDSS: force-h2020/force-bdss/issues/184. This seems like a very good place to start thinking about how to propagate them through the MCO. It is likely that some additional changes to the force_bdss code base will be needed to get a working implementation, so we will probably need several PRs to do so. (I will move this comment to the force-bdss repo issue if necessary.)
I have the following general feeling regarding an optimization interface. run (in the case of DataSources) should have the following interface for a scalar-valued objective:

Immutable = Tuple[ModelDataValue]
Objective = DataValue(type="OBJECTIVE_TYPE")
ParameterVector = List[DataValue(type="PARAMETER_TYPE")]
ParameterGradient = List[
    DataValue(
        type="OBJECTIVE_TYPE / PARAMETER_TYPE"
    )
]

class DataSource:
    def run(
        self,
        model: Immutable,
        parameters: ParameterVector,
    ) -> Tuple[Objective, ParameterGradient]:
        ...
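As a concrete illustration, here is a minimal, self-contained sketch of a data source that satisfies this contract, assuming plain Python floats stand in for DataValue objects; the class name QuadraticDataSource and the quadratic objective are invented for the example and are not part of force_bdss:

from typing import List, Tuple


class QuadraticDataSource:
    """Toy data source with objective f(p) = sum(p_i ** 2)."""

    def run(
        self, model: Tuple, parameters: List[float]
    ) -> Tuple[float, List[float]]:
        # Scalar objective value.
        objective = sum(p ** 2 for p in parameters)
        # Gradient has the same length as `parameters`: one entry per input.
        gradient = [2.0 * p for p in parameters]
        return objective, gradient


source = QuadraticDataSource()
value, grad = source.run(model=(), parameters=[1.0, -2.0, 0.5])
# value == 5.25, grad == [2.0, -4.0, 1.0]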
Here Objective is a scalar value, and ParameterVector and ParameterGradient are of the same length. The type of the ParameterGradient is defined by the types of the chosen Objective and the ParameterVector.
For multiple objectives, the type of Objective should be:

Objective = List[DataValue(type="OBJECTIVE_TYPE")]

and, therefore, the ParameterGradient changes to:
ParameterGradient = List[
    List[
        DataValue(
            type="OBJECTIVE_TYPE / PARAMETER_TYPE"
        )
    ]
]
Objective and ParameterVector are containers: the output ParameterGradient should have the same .shape, with the type defined by the objective and the inputs.

When KPIs are defined as a combination of several objectives, for instance:
simple_kpi = linear_kpi_mapping(raw_objectives) # a linear function
fancy_kpi = nonlinear_kpi_mapping(raw_objectives) # a non-linear function
the corresponding gradient is simply:
simple_kpi_gradient = linear_kpi_mapping(raw_gradients) # a linear function
fancy_kpi_gradient = nonlinear_kpi_mapping(raw_objectives, raw_gradients) # a non-linear function
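To make the chain rule behind this explicit, here is a small NumPy sketch; the weight vector, the exponential KPI and all variable names are invented for illustration and do not come from the code base:

import numpy as np

# Hypothetical setup: two raw objectives depending on three parameters.
raw_objectives = np.array([3.0, 1.0])            # shape (2,)
raw_gradients = np.array([[1.0, 0.0, 2.0],       # d(objective_0)/d(parameters)
                          [0.5, 1.0, 0.0]])      # d(objective_1)/d(parameters)

# Linear KPI: kpi = w @ objectives. Its gradient only needs the raw gradients,
# because the same linear mapping applies to them.
w = np.array([2.0, -1.0])
simple_kpi = w @ raw_objectives                  # scalar
simple_kpi_gradient = w @ raw_gradients          # shape (3,)

# Non-linear KPI: kpi = exp(objective_0) * objective_1. Its gradient needs
# both the raw objectives and the raw gradients (chain rule).
fancy_kpi = np.exp(raw_objectives[0]) * raw_objectives[1]
fancy_kpi_gradient = (
    np.exp(raw_objectives[0]) * raw_objectives[1] * raw_gradients[0]
    + np.exp(raw_objectives[0]) * raw_gradients[1]
)                                                # shape (3,)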
Moreover, it is clear that both the linear_kpi_mapping and the nonlinear_kpi_mapping can also be implemented as DataSources: these are just their run methods.
The benefits of this interface:
- It is consistent from a formal point of view: we always know how changes in the parameter values affect the Objective value (the gradient information).
- It is possible to apply extra layers of almost zero-cost operations when a combination of raw objectives is required, e.g. to calculate the KPIs.
- It is easy to test gradient consistency with Taylor tests.
- It is easy to test physical type consistency (no sums of lb with inches).
@flongford Thoughts?
@Corwinpro - This comment has some very good ideas that touch upon a lot of issues; it would be an excellent idea to reference it as a new issue for the force-bdss repo, where we can discuss them further.
This pull request implements the so-called "Taylor test", which verifies the consistency between a function that returns a scalar value ("an objective") and a gradient function, which presumably implements the Jacobian correctly.

The new class TaylorTest accepts an unbound method for the function, an unbound method for the gradient, and the input dimension. The function output must be a scalar. I guess that the only public method should be run_taylor_test, which returns a list of slopes corresponding to each input parameter. The slope is a power fit of the Taylor residuals versus the perturbation amplitude. It must be greater than 2; otherwise the gradient implementation is wrong, i.e., it doesn't match the provided function.

Related issues:
Bug in gradient #45
To do:
- Handle the case when the perturbation amplitude (step_size) is too small and the problem arises due to machine precision errors.
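For reference, here is a minimal, self-contained sketch of the idea behind such a test; the function taylor_slope, its arguments, and the least-squares fit are illustrative only and are not the TaylorTest implementation added in this PR:

import numpy as np

def taylor_slope(func, grad, x, direction, step_sizes=(1e-1, 1e-2, 1e-3, 1e-4)):
    """Estimate the convergence rate of the first-order Taylor residual.

    residual(h) = |f(x + h*d) - f(x) - h * d . grad(x)| should behave like
    O(h**2) when grad is consistent with func, so the fitted slope in
    log-log space is expected to be close to 2.
    """
    x = np.asarray(x, dtype=float)
    d = np.asarray(direction, dtype=float)
    residuals = [
        abs(func(x + h * d) - func(x) - h * d.dot(grad(x)))
        for h in step_sizes
    ]
    # Power fit: slope of log(residual) versus log(step size).
    slope, _ = np.polyfit(np.log(step_sizes), np.log(residuals), 1)
    return slope

# A consistent gradient gives a slope of about 2; an inconsistent one about 1.
f = lambda x: float(np.sum(x ** 2))
df = lambda x: 2.0 * x
print(taylor_slope(f, df, x=[1.0, 2.0], direction=[1.0, 1.0]))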