uqfoundation / pathos

parallel graph management and execution in heterogeneous computing
http://pathos.rtfd.io

Pool not returning iterator #142

Closed dterg closed 6 years ago

dterg commented 6 years ago

This works with multiprocessing (i.e. the counter is printed):

counter = 0

for _ in pool.map(extract, filelist):
    if show_progress:
        counter += 1
        print("%i/%i" % (counter, nfiles))

When using pathos ProcessingPool(), it's as if no iterator is returned: the loop does not progress and print() is never executed.

mmckerns commented 6 years ago

Can you provide a minimal but complete bit of code that demonstrates what you are reporting? That will help me address the issue.

dterg commented 6 years ago

Sure thing. I should correct my initial post, though: pathos' map() matches multiprocessing's map() in functionality. What I was describing and trying to achieve is the behaviour of multiprocessing's imap_unordered(), which I haven't come across in pathos.


import time
from pathos.multiprocessing import Pool
#from multiprocessing import Pool

def main():
    counter = 0
    # originally filelist is a list of file directories
    filelist = range(0,1000)
    pool = Pool(3)
    show_progress = True

    # replace with pool.imap_unordered from multiprocessing to print after each job, before the pool finishes
    for _ in pool.map(extract, filelist):
        if show_progress:
            counter += 1
            print("%i/%i" % (counter, len(filelist)))

def extract(file):
    #file would be opened and closed in the original function; here just replacing it with a sleep
    time.sleep(30)

if __name__ == "__main__":
    main()

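
For comparison, here is a minimal sketch of the behaviour the report is after, using only the standard library's `multiprocessing.Pool.imap_unordered`. The helper name `progress_lines`, the shortened sleep, and the 20-item list are illustrative assumptions, not part of the original code.

```python
import time
from multiprocessing import Pool


def extract(item):
    # Stand-in for the real file processing; a short sleep replaces the 30 s one.
    time.sleep(0.01)
    return item


def progress_lines(results, total):
    """Consume `results` lazily and yield one 'done/total' string per finished job."""
    for done, _ in enumerate(results, start=1):
        yield "%i/%i" % (done, total)


def main():
    filelist = list(range(20))  # hypothetical stand-in for the list of files
    with Pool(3) as pool:
        # imap_unordered returns an iterator that yields each result as soon as
        # its job finishes, so progress is printed while the pool is still busy.
        finished = pool.imap_unordered(extract, filelist)
        for line in progress_lines(finished, len(filelist)):
            print(line)


if __name__ == "__main__":
    main()
```

By contrast, `pool.map` blocks until every job is done and then returns a list, which is why the loop in the report prints nothing until the whole pool has finished.
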
mmckerns commented 6 years ago

You are looking for uimap, the unordered, non-blocking iterator map:

>>> from pathos.pools import ProcessPool
>>> pool = ProcessPool(3)
>>> res = pool.uimap(lambda x:x*x, range(3))
>>> next(res)
0
>>> next(res)
1
>>> next(res)
4

If this doesn't resolve your issue, please reopen and comment.
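
Applied to the progress loop from the report, `uimap` might be used roughly as follows. This is a sketch under assumptions: the helper `report`, the short sleep, and the choice to return each input alongside its result (so results can be matched up despite the unordered arrival) are illustrative and not from the thread. It needs the third-party pathos package and skips the demo if it is missing.

```python
import time


def extract(item):
    # Stand-in for the real per-file work.
    time.sleep(0.01)
    # Return the input too: uimap yields in completion order, not input order.
    return item, item * item


def report(results, total):
    """Print progress as each result arrives and collect results keyed by input."""
    collected = {}
    for done, (item, value) in enumerate(results, start=1):
        collected[item] = value
        print("%i/%i" % (done, total))
    return collected


def main():
    try:
        from pathos.pools import ProcessPool  # third-party: pip install pathos
    except ImportError:
        print("pathos is not installed; skipping the demo")
        return
    items = list(range(10))  # hypothetical stand-in for the list of files
    pool = ProcessPool(3)
    # uimap returns an iterator immediately; results are yielded as jobs finish.
    squares = report(pool.uimap(extract, items), len(items))
    pool.close()
    pool.join()
    pool.clear()  # drop the cached pool so a later ProcessPool(3) starts fresh
    print(sorted(squares.items()))


if __name__ == "__main__":
    main()
```

If the order of results matters, `imap` gives the same lazy iteration while preserving input order, at the cost of a slow early job holding back the progress output.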