uqfoundation / pathos

parallel graph management and execution in heterogeneous computing
http://pathos.rtfd.io

Weird Exception propagation #166

Open pltrdy opened 5 years ago

pltrdy commented 5 years ago

Original Problem

Hey,

Thanks for your work on this repo, it's been useful! (I started using it because of pickle errors with threading.Pool.)

I've been struggling to find out why an exception raised in Pool workers wasn't propagating, i.e. the program was continuing silently. I added quite a lot of raise statements to my code to identify where the exception might be getting caught, and found some weird behavior.

I have a class such as:

class Pipeline:
    def __call__(self, **kwargs):
        raise ValueError()
        for module in self.modules:
            print("Pipeline: call module: %s" % str(module))
            # raise BaseException()
            raise ValueError()
            o = module(**kwargs)
            kwargs.update(o)
        return o, kwargs

whose instances are called in a Pool, i.e.

from pathos.multiprocessing import ProcessingPool as Pool

with Pool(processes=n_thread) as pool:
    pool.map(run_exp_args, [...])

def run_exp_args([...]):
    [...]
    p = Pipeline(...)
    o, k = p(**kwargs)

As you would expect, this raises a ValueError in Pipeline.__call__ before entering the loop (confirmed by the line number in the trace): it stops the program and shows the relevant trace.
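For reference, here is a minimal sketch of the propagation I'd expect. It rests on some assumptions: it uses the standard library's multiprocessing.pool.ThreadPool as a stand-in rather than pathos (so it says nothing about pathos itself), and `worker` and `run` are hypothetical names replacing run_exp_args and my real driver code.

```python
from multiprocessing.pool import ThreadPool


class Pipeline:
    """Stripped-down stand-in for the Pipeline class above."""

    def __call__(self, **kwargs):
        raise ValueError("raised inside the worker")


def worker(kwargs):
    # Hypothetical stand-in for run_exp_args.
    return Pipeline()(**kwargs)


def run():
    """Return the exception that map() re-raises in the caller, if any."""
    try:
        # Assumption: stdlib ThreadPool, not pathos' ProcessingPool.
        with ThreadPool(processes=1) as pool:
            pool.map(worker, [{"x": 1}])
    except ValueError as exc:
        return exc
    return None


if __name__ == "__main__":
    print(repr(run()))
```

With the standard pool, map() re-raises the worker's exception in the calling code, which is the behavior I was expecting in every case.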

What is strange is that if I comment out this first ValueError (but keep the one in the loop), the exception is never propagated; it's just ignored.

Now if I instead raise a BaseException by uncommenting the line right above, this exception is raised and I see the trace, but the whole program isn't stopped. I've been experimenting with n_thread=1 just to be sure my output is relevant.

Do you have any ideas?


Note: the idea of raising a BaseException comes from https://stackoverflow.com/q/6728236/5903959 but that thread does not really answer the real question: why do the two ValueErrors behave differently?


Update: the print before the raise makes a difference... (and removing it eventually solves the problem)

If I remove the print in the loop, the ValueError is shown and stops the program.

So basically:

class Pipeline:
    def __init__(self, *args):
        self.modules = [*args]

    def __call__(self, **kwargs):
        for module in self.modules:
            print()
            raise ValueError()
            o = module(**kwargs)
            kwargs.update(o)
        return o, kwargs

runs without error, whereas

class Pipeline:
    def __init__(self, *args):
        self.modules = [*args]

    def __call__(self, **kwargs):
        for module in self.modules:
            raise ValueError()
            o = module(**kwargs)
            kwargs.update(o)
        return o, kwargs

fails

Note: it's not a good feeling to solve something by removing a single print statement, but it did in fact solve my problem: exceptions raised in my modules (the o = module(**kwargs) line right after the raise) now behave as expected (raising and interrupting the program).

pltrdy commented 5 years ago

It's still a problem: some exceptions raised in sub-functions aren't propagated properly... it's really annoying.
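One way to at least see such exceptions while debugging is to catch them inside the worker and hand the traceback back as an ordinary return value. This is only a sketch under assumptions, not a fix for the underlying behavior: `call_and_capture` and `failing_module` are hypothetical names, and the broad BaseException catch is meant as a temporary diagnostic only.

```python
import traceback


def call_and_capture(func, *args, **kwargs):
    """Run func and return ("ok", result) or ("error", traceback text).

    Catches BaseException so that nothing raised inside func can be
    dropped silently on the way back from a worker.
    """
    try:
        return ("ok", func(*args, **kwargs))
    except BaseException:  # deliberately broad: debugging aid only
        return ("error", traceback.format_exc())


def failing_module(**kwargs):
    # Hypothetical stand-in for one of the Pipeline modules.
    raise ValueError("raised in a sub-function")


if __name__ == "__main__":
    status, payload = call_and_capture(failing_module, x=1)
    print(status)
    print(payload)
```

The function passed to pool.map would then call call_and_capture around the real work, and the caller can check each returned status instead of relying on the exception reaching it.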