MDCHAMP / FreeLunch

Meta-heuristic optimisation suite for python
https://pypi.org/project/freelunch/
MIT License

Multiproc call signature and Pool parallelism #33

Closed MDCHAMP closed 2 years ago

MDCHAMP commented 2 years ago

PR for #32

codecov-commenter commented 2 years ago

Codecov Report

Merging #33 (69f5878) into main (21da558) will increase coverage by 0.02%. The diff coverage is 98.27%.


@@            Coverage Diff             @@
##             main      #33      +/-   ##
==========================================
+ Coverage   98.55%   98.57%   +0.02%     
==========================================
  Files          17       17              
  Lines        1105     1121      +16     
==========================================
+ Hits         1089     1105      +16     
  Misses         16       16              
Impacted Files               Coverage Δ
tests/test_benchmarking.py   71.42% <0.00%> (ø)
src/freelunch/base.py        100.00% <100.00%> (ø)
src/freelunch/util.py        96.66% <100.00%> (ø)
tests/test_base.py           100.00% <100.00%> (ø)
tests/test_optimisers.py     100.00% <100.00%> (ø)


MDCHAMP commented 2 years ago

Naive approach based on just taking the raw objective function and not counting nfes...

The trouble is that wrapping the objective function is a nice way to handle bad obj scores and count nfes, but the wrapped objs then cannot be pickled for IPC.

There are ways around this:

But these will require a medium-sized amount of work (changes to every optimiser.run method, plus tech and base at a minimum...).

The other major issue is that the current API more or less forces end users to wrap their objs before passing them to freelunch if they want to write good code.

I guess some sort of freelunch-specific warning when a wrapped function is provided to the multiproc API could at least let users know that they need to provide a pickle-able obj?

All this is a ballache.... unless I'm fundamentally misunderstanding something about multiprocessing/pickle?
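The pickling constraint described above can be reproduced in isolation. The following is a minimal sketch, not freelunch code: `predict`, `make_closure_obj`, `flat_obj` and `can_pickle` are hypothetical names introduced for illustration. It assumes the standard behaviour of `pickle`, which serialises functions by reference to a module-level name and therefore cannot handle a function defined inside another function.

```python
import pickle
from functools import partial


def predict(x):
    # Hypothetical stand-in for a user's model.
    return x


def make_closure_obj(predictor, target):
    # Closure-style wrapping: the inner function is a local object,
    # so pickle cannot look it up by a module-level name.
    def obj(x):
        return target - predictor(x)
    return obj


def flat_obj(predictor, target, x):
    # Module-level function; the extra arguments are bound with partial.
    return target - predictor(x)


def can_pickle(fn):
    # Report whether fn survives serialisation, as required for IPC.
    try:
        pickle.dumps(fn)
    except (AttributeError, pickle.PicklingError, TypeError):
        return False
    return True


closure_obj = make_closure_obj(predict, 10)
partial_obj = partial(flat_obj, predict, 10)

if __name__ == "__main__":
    print("closure picklable:", can_pickle(closure_obj))
    print("partial picklable:", can_pickle(partial_obj))
```

Both objects compute the same value when called directly; only the way the extra arguments are bound differs, and that is what decides whether the objective can be sent to a worker process.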

MDCHAMP commented 2 years ago

Ok MWE here using functools.partial that solves the issue with only a little bit of an onus on the user to write different code...

I'll work tomorrow on implementing this in the PR. Definitely going to need some documentation / example code for this. Might even be worth catching the inevitable AttributeError that gets thrown when the pickler can't handle the wrapped functions and include a link to the docs..


import numpy as np
from functools import partial
from multiprocessing import Pool

# MODULE SIDE CODE

def obj_wrapper(obj, opt_inst, x):
    opt_inst.nfe += 1
    print(opt_inst.nfe)
    ret = obj(x)
    if ret > 9.5:
        ret = 0
    else:
        ret = 1
    return ret

class optimiser:

    def __init__(self, obj):
        self.nfe = 0
        self.obj = partial(obj_wrapper, obj, self)

    def run(self):
        return [self.obj(np.random.uniform(0,1)) for _ in range(3)], self.nfe

    def __call__(self, runs):
        # Use the pool as a context manager so the worker processes are cleaned up
        with Pool(3) as pool:
            return pool.starmap(self.run, [() for _ in range(runs)])

# USER SIDE CODE

y_tgt = 10 # some prediction / hyperparameter needed for objective function logic

def my_predict(x): # some functional logic needed for objective function logic
    return x

# Obvious pythonic way to do it... doesn't work... obviously

def wrap_my_obj(predictor, hyper):
    def obj(x):
        y_hat = predictor(x)
        return hyper-y_hat # use the bound argument rather than the global y_tgt
    return obj

not_mp_safe_obj = wrap_my_obj(my_predict, y_tgt)

# Onus is on user to prepare functions like this for multiproc...

def my_obj(predictor, hyper, x): # objective function with too many / invalid args for freelunch
    y_hat = predictor(x)
    return hyper-y_hat # use the bound argument rather than the global y_tgt

mp_safe_obj = partial(my_obj, my_predict, y_tgt) # get a multiproc safe obj without wrapping

# RUNTIME
if __name__ == '__main__':

    opt = optimiser(mp_safe_obj)
    for a in opt(3):
        print(a)

# ([0, 1, 1], 3)
# ([0, 1, 1], 3)
# ([1, 1, 0], 3)

...python man
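The idea of catching the pickling failure and pointing users at the docs could be sketched roughly as follows. This is an assumption about how such a check might look, not what the PR implements: `ensure_picklable` and `ObjectiveNotPicklableError` are hypothetical names, and the message text stands in for a real link to the documentation. The check runs `pickle.dumps` on the objective before any worker process is started.

```python
import pickle


class ObjectiveNotPicklableError(TypeError):
    """Hypothetical error type; not part of freelunch."""


def ensure_picklable(obj):
    # Try to serialise the objective up front, before a Pool is created,
    # so the user gets a readable message instead of a raw pickle error.
    try:
        pickle.dumps(obj)
    except (AttributeError, pickle.PicklingError, TypeError) as err:
        raise ObjectiveNotPicklableError(
            "The objective function could not be pickled for multiprocessing. "
            "Define it at module level and bind extra arguments with "
            "functools.partial instead of wrapping it in a closure."
        ) from err
    return obj


def wrap(target):
    # Closure-style objective, used here only to trigger the error path.
    def obj(x):
        return target - x
    return obj


try:
    ensure_picklable(wrap(10))
    message = None
except ObjectiveNotPicklableError as err:
    message = str(err)

if __name__ == "__main__":
    print(message)
```

Failing early in the parent process keeps the original pickle exception available through exception chaining while giving the user a hint about how to restructure the objective.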

MDCHAMP commented 2 years ago

Functionality is there at last.

Just need to add some documentation and something to the readme to cover the new functionality and we are good to go!