uqfoundation / pathos

parallel graph management and execution in heterogeneous computing
http://pathos.rtfd.io

Error sending result: '[<automatic.template.BuildResult object at 0x7fc9c0127048>]'. Reason: 'ValueError('ctypes objects containing pointers cannot be pickled' #195

Closed Maximilianxu closed 4 years ago

Maximilianxu commented 4 years ago

I installed pathos with pip (python3-pip), so dill and the other pathos.multiprocessing dependencies were installed automatically.

The function I run in each subprocess is:

import tvm

def func():
    mod = tvm.build(...)  # arguments elided in the original report
    return mod

tvm.build() is a compilation function implemented as an FFI call into C++. I tried wrapping mod in a Python class, BuildResult, and the same error occurred.

Are there any suggestions? Thanks a lot... I have been trying to parallelize the above code for two days and it is very frustrating...

Also, with the result returned by "imap", calling get(timeout=...) does not terminate the subprocess when its running time exceeds the timeout. With the map result returned by "amap", however, get(timeout=...) does terminate all the subprocesses, but then I cannot obtain the values returned by the subprocesses that did not raise TimeoutError.

mmckerns commented 4 years ago

@Maximilianxu: There isn't enough information above for me to suggest how to get your code running in parallel, if that is possible at all. The error is a serialization error: the pointer objects cannot be serialized. Without more information, I'd say there's no way around that for the ProcessingPool (I assume that's what you are using). You might try the ThreadPool or the ProcessPool -- as they both use different serialization. I wouldn't generally expect the object you are trying to send to serialize well, unless you did some very hands-on, low-level work with shared memory.
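
The ThreadPool suggestion works because threads share one address space, so a result is handed back to the caller without being pickled. The sketch below is only an illustration: it uses the standard library's multiprocessing.pool.ThreadPool rather than pathos, and make_pointer is a hypothetical stand-in for a function that returns an FFI handle. Whether this helps with tvm.build() is not established in this thread.

```python
import ctypes
import pickle
from multiprocessing.pool import ThreadPool


def make_pointer(value):
    """Hypothetical worker: return a ctypes pointer, loosely like an FFI handle."""
    return ctypes.pointer(ctypes.c_int(value))


# Threads share memory, so the pointers come back without being pickled.
with ThreadPool(2) as pool:
    pointers = pool.map(make_pointer, [1, 2, 3])
values = [p.contents.value for p in pointers]

# The same object cannot be pickled, which is what a process pool would need.
try:
    pickle.dumps(pointers[0])
    picklable = True
except ValueError:
    picklable = False

print(values, picklable)
```

The pickle.dumps call reproduces the kind of failure reported in the issue title, while the thread pool returns the same objects intact.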

Maximilianxu commented 4 years ago

@mmckerns Thanks, you are right. Actually, I had already tried ProcessPool, ParallelPool, etc., but none of them worked.

In the end, I worked around the problem by serializing the tvm Module to a file (a fairly simple method, but I hadn't realized I could do this) and returning just the file name to the main process.

Thanks a lot anyway.

mmckerns commented 4 years ago

Yep, that works. Good solution.
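
The workaround accepted above (write the module to a file in the worker and return only the file name) can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the thread: FakeModule, its export and load methods, and build_to_file are all hypothetical names, and a pickle round-trip stands in for the transfer a process pool performs on each result. With TVM, the save and load steps would presumably be the library's own module export and load functions, which the thread does not name.

```python
import ctypes
import os
import pickle
import tempfile


class FakeModule:
    """Hypothetical stand-in for a compiled module that holds a C pointer."""

    def __init__(self, value):
        self._cell = ctypes.c_int(value)
        self._ptr = ctypes.pointer(self._cell)  # makes the object unpicklable

    @property
    def value(self):
        return self._ptr.contents.value

    def export(self, path):
        # Write only the plain data to disk, never the pointer itself.
        with open(path, "w") as f:
            f.write(str(self.value))

    @classmethod
    def load(cls, path):
        with open(path) as f:
            return cls(int(f.read()))


def build_to_file(value):
    """Hypothetical worker: build the module, save it, return the file name."""
    mod = FakeModule(value)
    fd, path = tempfile.mkstemp(suffix=".mod")
    os.close(fd)
    mod.export(path)
    return path  # a plain str pickles without trouble


# Returning the module itself fails as soon as the result is pickled.
try:
    pickle.dumps(FakeModule(1))
    direct_ok = True
except (ValueError, TypeError):
    direct_ok = False

# Returning the path works; the round-trip stands in for the pool transfer.
path = pickle.loads(pickle.dumps(build_to_file(7)))
restored = FakeModule.load(path)
os.remove(path)
print(direct_ok, restored.value)
```

In a real pool, build_to_file would be the function passed to map, and the main process would load each returned path and remove the temporary files once it no longer needs them.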