tBuLi / symfit

Symbolic Fitting; fitting as it should be.
http://symfit.readthedocs.org
MIT License

Pickling error ScipyMinimize #187

Closed Jhsmit closed 5 years ago

Jhsmit commented 5 years ago

When I use the following code:


from symfit import Fit, Parameter, Variable, CallableNumericalModel
import numpy as np
import multiprocessing as mp

x = np.arange(100).astype(float)
y = x**2
y += 0.25*y*np.random.rand(100)
a_values = np.arange(12) + 1
np.random.shuffle(a_values)

def func(x, a):
    return a * x ** 2

def gen_fit_objs(x, y, a):
    for a_i in a:
        a_par = Parameter('a', 5)
        x_var = Variable('x')
        y_var = Variable('y')

        model = CallableNumericalModel({y_var: func}, [x_var], [a_par])

        fit = Fit(model, x, a_i*y)
        yield fit

def worker(fit_obj):
    return fit_obj.execute()

if __name__ == '__main__':
    print(a_values)
    pool = mp.Pool()
    results = pool.map(worker, gen_fit_objs(x, y, a_values))
    print([res.params['a'] for res in results])

I get this output:

C:\Miniconda3\envs\cc_symfit\python.exe C:/Users/Smit/PycharmProjects/CC_Playground/dev/symfit/multiprocessing_test.py
[ 5 12  7  8  2  9 11  4  6  1 10  3]
Traceback (most recent call last):
  File "C:/Users/Smit/PycharmProjects/CC_Playground/dev/symfit/multiprocessing_test.py", line 33, in <module>
    results = pool.map(worker, gen_fit_objs(x, y, a_values))
  File "C:\Miniconda3\envs\cc_symfit\lib\multiprocessing\pool.py", line 266, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "C:\Miniconda3\envs\cc_symfit\lib\multiprocessing\pool.py", line 644, in get
    raise self._value
  File "C:\Miniconda3\envs\cc_symfit\lib\multiprocessing\pool.py", line 424, in _handle_tasks
    put(task)
  File "C:\Miniconda3\envs\cc_symfit\lib\multiprocessing\connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "C:\Miniconda3\envs\cc_symfit\lib\multiprocessing\reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
AttributeError: Can't pickle local object 'ScipyMinimize.list2kwargs.<locals>.wrapped_func'

Replacing pool.map with the built-in map does give the desired result, but I'd like to be able to use multiprocessing.
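For context, the error in the traceback is raised by pickle rather than by the fit itself. pool.map has to pickle every task before sending it to a worker process, and pickle stores functions by reference to their qualified name, so a function defined inside another function cannot be found again. A minimal sketch using only the standard library; `make_wrapper` and `wrapped_func` are made-up names for illustration and are not symfit's actual code:

```python
import math
import pickle


def make_wrapper(func):
    # Returns a function defined inside another function (a "local object").
    def wrapped_func(values):
        return func(*values)
    return wrapped_func


# A function that lives at module level (here math.sqrt) pickles by reference.
restored = pickle.loads(pickle.dumps(math.sqrt))
print(restored(4.0))

# A locally defined function does not: pickle cannot look it up by name.
try:
    pickle.dumps(make_wrapper(math.sqrt))
    local_pickled = True
except (AttributeError, pickle.PicklingError) as exc:
    # The exact exception type depends on the Python version.
    local_pickled = False
    print(exc)
```

The built-in map never pickles anything, which is presumably why it works where pool.map fails.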

pckroon commented 5 years ago

Is it an option to change the multiprocessing pipeline so that you distribute the models amongst your workers and let them create the Fit objects?

Jhsmit commented 5 years ago

Yes, that works:


from symfit import Fit, Parameter, Variable, CallableNumericalModel
import numpy as np
import multiprocessing as mp
from functools import partial

x = np.arange(100).astype(float)
y = x**2
y += 0.25*y*np.random.rand(100)
a_values = np.arange(12) + 1
np.random.shuffle(a_values)

def func(x, a):
    return a * x ** 2

def worker(x, y):
    a_par = Parameter('a', 5)
    x_var = Variable('x')
    y_var = Variable('y')
    model = CallableNumericalModel({y_var: func}, [x_var], [a_par])
    fit = Fit(model, x, y)
    res = fit.execute()
    return res

if __name__ == '__main__':
    print(a_values)
    pool = mp.Pool()
    y_list = [a*y for a in a_values]
    f = partial(worker, x)
    results = pool.imap(f, y_list)
    print([r for r in results])

pckroon commented 5 years ago

Not exactly what I meant:

...
def worker(model, x, y):
    fit = Fit(model, x, y)
    return fit.execute()

...

Jhsmit commented 5 years ago

I had the above code lying around because I already tried that. Passing models also works:


from symfit import Fit, Parameter, Variable, CallableNumericalModel
import numpy as np
import multiprocessing as mp
from functools import partial

x = np.arange(100).astype(float)
y = x**2
y += 0.25*y*np.random.rand(100)
a_values = np.arange(12) + 1
np.random.shuffle(a_values)

def func(x, a):
    return a * x ** 2

def worker(x, tuple_y_model):
    y, model = tuple_y_model
    fit = Fit(model, x, y)
    res = fit.execute()
    return res

def gen_models(n):
    for i in range(n):
        a_par = Parameter('a', 5)
        x_var = Variable('x')
        y_var = Variable('y')

        model = CallableNumericalModel({y_var: func}, [x_var], [a_par])
        yield model

if __name__ == '__main__':
    print(a_values)
    pool = mp.Pool()
    y_list = [a*y for a in a_values]

    f = partial(worker, x)
    results = pool.imap(f, zip(y_list, gen_models(len(y_list))))
    print([r for r in results])

I've put the y data and the models in a zip here because otherwise it's a hassle to get everything to work.

pckroon commented 5 years ago

Ok then :) As a bottom line: Objectives and Minimizers are not (yet) picklable. I'll close this now with that conclusion. Feel free to make a PR I'd say ;)

tBuLi commented 5 years ago

Although we have a workable solution, I wouldn't consider this issue closed. This is as good a reminder as any that objectives and minimizers should still be made picklable.

The MWE given above could also serve as a test to add.
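One common way to make such a wrapper picklable is to replace the locally defined function with an instance of a module-level class that implements `__call__`. The sketch below only illustrates that general pattern. `ListToKwargs` is a hypothetical name, and nothing here is taken from symfit's implementation or from how the issue was eventually fixed:

```python
import pickle


def func(x, a):
    # Same toy model function as in the examples above.
    return a * x ** 2


class ListToKwargs:
    """Hypothetical module-level replacement for a locally defined wrapper."""

    def __init__(self, function, names):
        self.function = function
        self.names = tuple(names)

    def __call__(self, values):
        # Map a flat sequence of values onto named keyword arguments.
        return self.function(**dict(zip(self.names, values)))


# The instance only holds a module-level function and a tuple of strings,
# so pickle can serialize it and rebuild it in another process.
wrapped = ListToKwargs(func, ['x', 'a'])
restored = pickle.loads(pickle.dumps(wrapped))
print(restored([3.0, 2.0]))
```

Because the class is importable by name, an object holding such a wrapper as an attribute can be sent to a worker process, provided its other attributes are picklable too.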

tBuLi commented 5 years ago

This has been fixed in 0.4.6. I pretty much added your example as a test, to make sure we really exercise multiprocessing as well, not just pickle.