jacklovell closed this issue 6 years ago
It's going to be difficult to support this type of multiprocessing without allowing the whole scene-graph to be pickled. Whilst pickle works automatically on normal Python objects, its support for Cython objects is limited to simple cases. Because Cython is used extensively throughout the Raysect core, we will need to manually add `__getstate__()` and `__setstate__()` methods to all the scene-graph objects. This should be a longer-term goal anyway. Let's add this to our next release plan.
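For illustration, here is a minimal pure-Python sketch of the `__getstate__()`/`__setstate__()` pattern. The `Node` class, its attributes and the `_bounds` cache are hypothetical stand-ins, not Raysect classes; a real Cython extension type would need equivalent methods that expose its C-level data as plain Python objects.

```python
import pickle


class Node:
    """Hypothetical stand-in for a scene-graph object (not a Raysect class)."""

    def __init__(self, name, vertices):
        self.name = name
        self.vertices = list(vertices)
        # Derived cache: rebuilt on unpickling rather than serialised.
        self._bounds = self._compute_bounds()

    def _compute_bounds(self):
        xs = [v[0] for v in self.vertices]
        return (min(xs), max(xs))

    def __getstate__(self):
        # Return only the plain data needed to reconstruct the object.
        return {"name": self.name, "vertices": self.vertices}

    def __setstate__(self, state):
        # Restore the plain data, then rebuild anything derived from it.
        self.name = state["name"]
        self.vertices = state["vertices"]
        self._bounds = self._compute_bounds()


node = Node("wall", [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.5, 1.0, 0.0)])
clone = pickle.loads(pickle.dumps(node))  # round trip through pickle
```

The point of the pattern is that only the state dictionary crosses the process boundary; anything that cannot be serialised directly is reconstructed in `__setstate__()`.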
Issue #207 is related, but it will be absorbed by this issue since this one is broader.
My use case: I have a TargettedPixel observer which samples a grid cell with 1000 rays. I repeat this for every cell in my reconstruction grid to build up a volumetric sensitivity matrix for tomographic inversions. I am currently attempting to do this for a large grid (751263 cells). The nature of the problem means that each individual observation is pretty quick, but there are a lot of observations.
I'm currently using the MulticoreEngine render engine for this, which sets up and tears down worker processes for each observation. Because the observations are so quick, this is leading to significant overheads, as is seen from one of the jobs I submitted, which was using 16 worker processes for each observation:
(this job terminated prematurely for an unrelated reason).
I'd like to change the way the problem is chunked: use `concurrent.futures.ProcessPoolExecutor` to parallelise the loop over the grid cells.

The problem I run into now is that the mesh describing the vessel surfaces, which needs to be available to each process in the pool, can't be pickled:
I've tested this with a simple example, which also fails:
Is it feasible to implement `__getstate__` and `__setstate__` for the `MeshData` class to enable the mesh to be pickled?
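For context, the re-chunking described above could look roughly like the sketch below. It is only an outline under stated assumptions: `observe_cell` is a hypothetical placeholder for a single observation of one grid cell (it does not call Raysect), and `chunk_ranges`, `observe_chunk` and `sensitivities` are illustrative helper names. In the real case each worker would also need access to the vessel mesh, which is where the pickling requirement comes from.

```python
from concurrent.futures import ProcessPoolExecutor


def chunk_ranges(n_cells, chunk_size):
    """Split range(n_cells) into contiguous (start, stop) index pairs."""
    return [(start, min(start + chunk_size, n_cells))
            for start in range(0, n_cells, chunk_size)]


def observe_cell(cell_index):
    # Hypothetical placeholder for one observation of one grid cell.
    # A real version would run the observer serially and return its result.
    return float(cell_index)


def observe_chunk(bounds):
    # Each task handles a whole chunk of cells, so the cost of dispatching
    # work to a process is paid once per chunk, not once per observation.
    start, stop = bounds
    return [observe_cell(i) for i in range(start, stop)]


def sensitivities(n_cells, chunk_size, max_workers=None):
    # The pool is created once and reused for every chunk.
    results = []
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        for chunk in pool.map(observe_chunk, chunk_ranges(n_cells, chunk_size)):
            results.extend(chunk)
    return results


if __name__ == "__main__":
    values = sensitivities(n_cells=1000, chunk_size=100, max_workers=4)
    print(len(values))
```

`pool.map` returns results in submission order, so the output list stays aligned with the grid cell indices.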