uqfoundation / pathos

parallel graph management and execution in heterogeneous computing
http://pathos.rtfd.io

Request: starmap method for functions with multiple arguments #185

Open scottshambaugh opened 4 years ago

scottshambaugh commented 4 years ago

Since Python 3.3, multiprocessing has included starmap and starmap_async methods, which are the same as map() and map_async() except that the elements of the iterable are expected to be iterables that are unpacked as arguments. Hence an iterable of [(1,2), (3,4)] results in [func(1,2), func(3,4)].
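For reference, here is a minimal sketch of that behavior using only the standard library. It assumes a hypothetical two-argument `func`, and it uses `multiprocessing.pool.ThreadPool` (which exposes the same `map`/`starmap` interface as `multiprocessing.Pool`) so that the snippet runs without a `__main__` guard or pickling concerns.

```python
from multiprocessing.pool import ThreadPool


def func(a, b):
    # Hypothetical two-argument function, used only for illustration.
    return a + b


pairs = [(1, 2), (3, 4)]

with ThreadPool(2) as pool:
    # starmap unpacks each element: func(1, 2), func(3, 4)
    starmap_result = pool.starmap(func, pairs)
    # map passes each element whole, so a wrapper has to unpack it
    map_result = pool.map(lambda pair: func(*pair), pairs)
```

Both calls produce the same list here; starmap just removes the need for the wrapper.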

As seen in this Stack Overflow thread, this is not too hard to work around with wrapper functions, but the shorthand would be pretty convenient. Note that a pathos suggestion was given as one of the answers to that Stack Overflow question, but it doesn't actually fit the shape of the data being provided. Currently, passing *iterable as an input results in [func(1,3), func(2,4)] instead of the desired behavior above.
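The shape mismatch can be reproduced with the built-in `map`, which takes several parallel iterables the same way. This is only a sketch: `func` is a hypothetical function that returns its arguments so the pairing is visible, and `itertools.starmap` stands in for the desired pool method.

```python
from itertools import starmap


def func(a, b):
    # Hypothetical function that just reports which arguments it received.
    return (a, b)


data = [(1, 2), (3, 4)]

# Unpacking the outer list hands map two parallel iterables, (1, 2) and (3, 4),
# so the calls are func(1, 3) and func(2, 4).
transposed = list(map(func, *data))

# starmap unpacks each inner tuple instead: func(1, 2) and func(3, 4).
desired = list(starmap(func, data))
```

Here `transposed` pairs up the first elements and the second elements, while `desired` keeps each inner tuple together as one call.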

mmckerns commented 4 years ago

@scottshambaugh: thanks for pointing this out. I just updated the SO answer (repeated below). pathos does have a starmap, but not on its primary pool objects (only on the pools that begin with '_').

>>> def add(*x):
...   return sum(x)
... 
>>> x = [[1,2,3],[4,5,6]]
>>> import pathos
>>> import numpy as np
>>> # use ProcessPool's map and transpose the inputs
>>> pp = pathos.pools.ProcessPool()
>>> pp.map(add, *np.array(x).T)
[6, 15]
>>> # use ProcessPool's map and a lambda to apply the star
>>> pp.map(lambda x: add(*x), x)
[6, 15]
>>> # use a _ProcessPool, which has starmap
>>> _pp = pathos.pools._ProcessPool()
>>> _pp.starmap(add, x)
[6, 15]
>>> 

Yeah, starmap should probably be migrated to all pools.
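Until then, the transpose workaround can be wrapped in a small helper. The sketch below is not part of pathos: `starmap_via_map` is a hypothetical name, and it assumes the `map_func` it is given accepts several parallel iterables the way the built-in `map` does. The built-in `map` is used as a stand-in here so the snippet runs without pathos installed.

```python
def starmap_via_map(map_func, func, iterable):
    """Emulate starmap on top of a map that takes several parallel iterables.

    Hypothetical helper, not a pathos API. Assumes every row has the same
    length, since zip() would silently truncate to the shortest row.
    """
    rows = [tuple(row) for row in iterable]
    if not rows:
        return []
    # Transpose rows into columns: [[1, 2, 3], [4, 5, 6]] -> (1, 4), (2, 5), (3, 6)
    columns = zip(*rows)
    return list(map_func(func, *columns))


def add(*x):
    return sum(x)


x = [[1, 2, 3], [4, 5, 6]]

# Built-in map used as a stand-in for a pool's map method.
result = starmap_via_map(map, add, x)
```

With a pool, the same helper would presumably be called as `starmap_via_map(pp.map, add, x)`, and it avoids the numpy dependency of the transposing example above.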

scottshambaugh commented 4 years ago

Awesome, and thank you for those workarounds! The lambda is a lot cleaner than what I've been using.