joblib / loky

Robust and reusable Executor for joblib
http://loky.readthedocs.io/en/stable/
BSD 3-Clause "New" or "Revised" License

Worker pool initialising #413

Open kskadart opened 8 months ago

kskadart commented 8 months ago

Hey folks, this is probably more of a Stack Overflow question, but let's see.

I have a web service that runs some GPU calculations in a pool built on the original ProcessPoolExecutor, and I can see that the calculation part leaks a significant amount of memory. I started using loky and the memory leak is gone. It is gone because the loky executor has a timeout for its workers and reinitialises the whole pool when that timeout expires, which works very well for me.

The issue is with the initialising function (the initializer argument of get_reusable_executor). As the initializer I pass a custom function that creates sessions with the GPU, and with loky it takes 20 sec vs 3 sec with the original ProcessPoolExecutor. This means the first request to the service takes an extra 20 sec.

I tried to avoid the initializer, but for some reason a custom object can't be used from a pool process through shared memory... For example:

some_module.py
effects_list.append((...))
effects_library = EffectsLibrary(effects_list, ...)

def calculator():
    effects_library.apply(...)

So, to use the effects_library object, I use an initializer:

def init_libs():
    global effects_library
    effects_list.append((...))
    effects_library = EffectsLibrary(effects_list, ...)
 ....
executor = get_reusable_executor(..., initializer=init_libs)

Can you suggest a workaround that lets me keep the timeout option and reinitialise the workers before the service gets its first request?
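
One possible direction, sketched here only as an illustration: submit one cheap task per worker at service start-up, so that the workers are spawned and the initializer runs before any real request arrives. The names warm_up and _noop are hypothetical helpers, not part of loky, and the short sleep is only a heuristic to encourage the pool to start every worker rather than reuse the first one. The helper accepts any concurrent.futures-style executor; the assumption is that the object returned by get_reusable_executor can be passed in the same way.

```python
import time
from concurrent.futures import Executor


def _noop(delay: float) -> None:
    # Hold the worker briefly so the pool is encouraged to start
    # another worker for the next task instead of reusing this one.
    time.sleep(delay)


def warm_up(executor: Executor, n_workers: int, delay: float = 0.1) -> float:
    """Submit one short task per worker and block until all have finished.

    Hypothetical helper: returns the elapsed wall-clock time in seconds,
    which includes the time spent in the pool's initializer.
    """
    start = time.perf_counter()
    futures = [executor.submit(_noop, delay) for _ in range(n_workers)]
    for future in futures:
        # Raises here if the pool broke or the task itself failed.
        future.result()
    return time.perf_counter() - start
```

At start-up this might be called as warm_up(get_reusable_executor(max_workers=4, initializer=init_libs), 4), where 4 is an arbitrary example value. Workers that later shut down because of the idle timeout would have to run the initializer again on the next request, so this sketch only covers the very first request and not requests that arrive after a long idle period.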