Closed mdhaber closed 1 year ago
Interesting idea. This would solve the issue of parallelism on the stat function itself. To solve the problem on sobol_indices, we would call this from within the function, correct? With func being the user function. I am not sure this simplifies things much compared to calling our mapper, though.
Or this function could be used by a user to rely on this instead of the vectorisation over the number of outputs.
To solve the problem on sobol_indices, we would call this from within the function, correct?
This was not the intent. The idea is that the user would use it to parallelize their function before passing it into sobol_indices. Multiprocessing can't help the code inside sobol_indices much at all, because that code is already so simple and vectorized. The only potential use for multiprocessing is in parallelizing the user function.
We could have sobol_indices accept a workers parameter and then use Parallelizer on the function, but it would not be so much harder for the user to do it themselves. Since multiprocessing introduces pain points (e.g. needing if __name__ == '__main__', function pickleability), I think the users need to take some responsibility for it rather than hiding everything inside the SciPy functions. And if users have trouble, I'd rather get bug reports about Parallelizer than function-specific bug reports that boil down to multiprocessing issues.
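To make the proposal concrete, here is a minimal sketch of what such a wrapper could look like. It is not the Parallelizer discussed in this thread and not a SciPy API: the name parallelize, its workers and executor_cls arguments, and the example function model are all hypothetical. It assumes the user function is vectorized over the last axis, i.e. it maps an array of shape (d, n) to an array of shape (s, n) or (n,).

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

import numpy as np


def parallelize(func, workers=2, executor_cls=ProcessPoolExecutor):
    """Hypothetical helper (not a SciPy API).

    Wrap a vectorized ``func`` so that the sample points along the last
    axis are split into ``workers`` chunks, each evaluated in its own
    worker. Pass ``executor_cls=ThreadPoolExecutor`` to use threads
    instead of processes.
    """
    def wrapped(x):
        x = np.asarray(x)
        # One chunk of sample points per worker.
        chunks = np.array_split(x, workers, axis=-1)
        with executor_cls(max_workers=workers) as pool:
            # Executor.map returns results in the order of the inputs.
            results = list(pool.map(func, chunks))
        # Reassemble the outputs along the sample axis.
        return np.concatenate(results, axis=-1)
    return wrapped


def model(x):
    # Example user function, defined at module level so it can be pickled
    # and sent to worker processes.
    return np.sum(x**2, axis=0, keepdims=True)
```

With a process-based executor, the call that triggers the evaluation (for example passing parallelize(model) as func to sobol_indices) would need to sit under if __name__ == '__main__' in a script, and model must be picklable, which are exactly the pain points mentioned above. This sketch also starts a new pool on every call, which a real implementation would probably want to avoid.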
With func being the user function. I am not sure this simplifies things much compared to calling our mapper, though.
Do you mean MapWrapper?
This simplifies things compared to using MapWrapper directly. It takes one line rather than many.
Or this function could be used by a user to rely on this instead of the vectorisation over the number of outputs.
No, I don't think that is the intent. This is for use especially when func is already vectorized.
@tupui this is what I had in mind for multiprocessing: instead of adding workers to every function that accepts a callable (e.g. the resampling methods, sobol_indices), have the user wrap their callable for multiprocessing (since that's the only thing that might be expensive enough to make multiprocessing worth it).