CMA-ES / pycma

Python implementation of CMA-ES

Speed up the function evaluation: is parallel running available? #276

Open Landau1908 opened 1 week ago

Landau1908 commented 1 week ago

Hi,

While running the cma optimizer, I found that the function evaluation takes a long time and is the real bottleneck compared to the retry process. Almost one hour is needed to perform a single iteration. My question is how to speed up the function evaluation. Is parallel evaluation available for it? Below is the output from the first iterations:

Iterat #Fevals   function value  axis ratio  sigma  min&max std  t[m:s]
    1     10 1.160898614632199e+03 1.0e+00 4.71e-01  3e-01  3e-01 51:22.9
time: 3096.090343400021
    2     20 3.793098844987039e+02 1.1e+00 4.57e-01  3e-01  3e-01 95:13.0
time: 5726.111344399978
    3     30 1.480916877638177e+03 1.1e+00 4.70e-01  3e-01  3e-01 142:17.6
time: 8550.687494100071
    4     40 9.431285926411842e+02 1.2e+00 4.77e-01  3e-01  3e-01 183:22.2
time: 11015.306137099979

Regards.

nikohansen commented 1 week ago

You could use a cma.optimization_tools.EvalParallel2 instance, either as input to cma.fmin2 or within an ask-and-tell loop to compute the f-values passed to tell.

import cma

with cma.optimization_tools.EvalParallel2(cma.ff.elli) as p_objective:
    x, es = cma.fmin2(None, 3 * [1], 1,
                      parallel_objective=p_objective)
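
The ask-and-tell variant mentioned above can be sketched as follows. This is a minimal sketch under stated assumptions, not a definitive recipe: `slow_objective` and `minimize_in_parallel` are hypothetical names, and the sphere function with a short sleep only stands in for the expensive evaluation. In the log above, #Fevals grows by 10 per iteration, so the population size is 10 and more than 10 processes would not be expected to bring a further gain.

```python
import time


def slow_objective(x):
    """Hypothetical stand-in for an expensive evaluation: a sphere
    function that sleeps briefly before returning."""
    time.sleep(0.01)
    return sum(xi ** 2 for xi in x)


def minimize_in_parallel(objective, x0, sigma0,
                         number_of_processes=None, options=None):
    """Ask-and-tell loop (hypothetical helper) in which each population
    is evaluated by an EvalParallel2 instance."""
    # pycma is imported here so that `slow_objective` can be imported
    # and tested on its own
    import cma

    es = cma.CMAEvolutionStrategy(x0, sigma0, options or {})
    with cma.optimization_tools.EvalParallel2(
            objective, number_of_processes) as eval_all:
        while not es.stop():
            X = es.ask()             # candidate solutions of this iteration
            es.tell(X, eval_all(X))  # f-values computed by the worker pool
            es.disp()                # print the usual progress line
    return es.result.xbest, es.result.fbest
```

A call such as `minimize_in_parallel(slow_objective, 3 * [1], 1, number_of_processes=10)` belongs under an `if __name__ == "__main__":` guard, which `multiprocessing` needs on platforms that spawn rather than fork worker processes. The objective should be a function defined at module level so that it can be pickled.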

See the docstring:

import cma

cma.optimization_tools.EvalParallel2?
Init signature:
cma.optimization_tools.EvalParallel2(
    fitness_function=None,
    number_of_processes=None,
)
Docstring:     
A class and context manager for parallel evaluations.

This class is based on the ``Pool`` class of the `multiprocessing` module.

The interface in v2 changed, such that the fitness function can be
given once in the constructor. Hence the number of processes has
become the second (optional) argument of `__init__` and the function
has become the second and optional argument of `__call__`.

To be used with the `with` statement (otherwise `terminate` needs to
be called to free resources)::

    with EvalParallel2(fitness_function) as eval_all:
        fvals = eval_all(solutions)

assigns a callable `EvalParallel2` class instance to ``eval_all``.
The instance can be called with a `list` (or `tuple` or any
sequence) of solutions and returns their fitness values. That is::

    eval_all(solutions) == [fitness_function(x) for x in solutions]

`EvalParallel2.__call__` may take three additional optional arguments,
namely `fitness_function` (like this the function may change from call
to call), `args` passed to ``fitness`` and `timeout` passed to the
`multiprocessing.pool.ApplyResult.get` method which raises
`multiprocessing.TimeoutError` in case.

``eval_all = EvalParallel2(fitness_function, 0)`` bypasses
`multiprocessing`, hence the construct can be used even when
`multiprocessing` fails on this `fitness_function` instantiation.

Examples:

>>> from cma.optimization_tools import EvalParallel2
>>> for n_jobs in [None, -1, 0, 1, 2, 4]:
...     with EvalParallel2(cma.fitness_functions.elli, n_jobs) as eval_all:
...         res = eval_all([[1,2], [3,4]])
>>> # class usage, don't forget to call terminate
>>> ep = EvalParallel2(cma.fitness_functions.elli, 4)
>>> [float(v) for v in ep([[1,2], [3,4], [4, 5]])]  # doctest:+ELLIPSIS
[4000000.944...
>>> ep.terminate()
...
>>> # use with `with` statement (context manager)
>>> es = cma.CMAEvolutionStrategy(3 * [1], 1, dict(verbose=-9))
>>> with EvalParallel2(cma.fitness_functions.elli,
...                    number_of_processes=12) as eval_all:
...     while not es.stop():
...         X = es.ask()
...         es.tell(X, eval_all(X, args=(1e1,)))  # `eval_all` also accepts
...                                               # `fitness_function` as
...                                               # (optional) keyword argument
>>> assert es.result[1] < 1e-13 and es.result[2] < 1500

Parameters: the `EvalParallel2` constructor takes the number of
processes as optional input argument, which is by default
``multiprocessing.cpu_count()``. If ``number_of_processes <= 0``, no
`multiprocessing` is invoked and the fitness is computed directly in a
regular loop.

Limitations: the `multiprocessing` module, on which this class is based
upon, may not work with certain class instance methods or Cython
instances, or class instances that contain modules as it uses `pickle`.

Details: in some cases the execution may be considerably slowed down,
as for example in previous tests done with test suites from coco/bbob.

Comparing setting ``number_of_processes = 0`` with
``number_of_processes = 1`` evaluates the overhead introduced by
``multiprocessing.Pool.apply_async``.
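
The pickling limitation and the overhead noted at the end of the docstring can both be checked before starting a long run. The following is a minimal standard-library sketch under stated assumptions, not pycma's implementation: `is_picklable`, `evaluate_batch`, `mean_batch_time`, `sphere` and `overhead_report` are hypothetical helpers, and `evaluate_batch` only mimics the behavior described above (a plain loop without a pool, `Pool.apply_async` with one).

```python
import pickle
import time
from multiprocessing import Pool


def is_picklable(obj):
    """Return True if `pickle` can serialize `obj`, which `multiprocessing`
    requires in order to send it to a worker process."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


def evaluate_batch(fitness_function, solutions, pool=None, timeout=None):
    """Evaluate all solutions, in a plain loop if `pool` is None and
    via `Pool.apply_async` otherwise."""
    if pool is None:
        return [fitness_function(x) for x in solutions]
    jobs = [pool.apply_async(fitness_function, (x,)) for x in solutions]
    # `get` raises multiprocessing.TimeoutError if a result is not ready in time
    return [job.get(timeout) for job in jobs]


def mean_batch_time(fitness_function, solutions, pool=None, repeats=20):
    """Mean wall-clock seconds per call of `evaluate_batch`."""
    start = time.perf_counter()
    for _ in range(repeats):
        evaluate_batch(fitness_function, solutions, pool)
    return (time.perf_counter() - start) / repeats


def sphere(x):
    """Cheap test objective, so that the measured time is mostly overhead."""
    return sum(xi ** 2 for xi in x)


def overhead_report(solutions, repeats=20):
    """Compare a plain loop with a one-worker pool.
    Call this under an `if __name__ == "__main__":` guard."""
    t_loop = mean_batch_time(sphere, solutions, None, repeats)
    with Pool(1) as pool:
        t_pool = mean_batch_time(sphere, solutions, pool, repeats)
    return t_loop, t_pool
```

If the objective takes minutes per call, as in the log above, an overhead measured this way is likely to be negligible by comparison; it matters mainly for cheap objectives. An objective for which `is_picklable` returns False (a lambda, for instance) can still be evaluated with `number_of_processes=0`, which bypasses `multiprocessing`.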