uqfoundation / pathos

parallel graph management and execution in heterogeneous computing
http://pathos.rtfd.io

Is there always a local node created when using ParallelPool #108

Closed chengs closed 7 years ago

chengs commented 7 years ago

Hello

I am trying out the ParallelPool part of pathos. I started a ppserver on my remote node and built an SSH tunnel to connect to it. Here is my code:

from pathos.core import connect
from pathos.parallel import stats
from pathos.parallel import ParallelPool as Pool

def host(id):
    import socket
    import time
    time.sleep(1.0)
    return "Rank: %d -- %s" % (id, socket.gethostname())

tunnel = connect('remoteserver', port=35000)
print(tunnel)

pool = Pool(servers=('localhost:%d' % tunnel._lport,))

res5 = pool.map(host, range(10))
print(pool)
print('\n'.join(res5))
print(stats(pool))
print('')

I get the following results:

Job execution statistics:
 job count | % of all jobs | job time sum | time per job | job server
         1 |         10.00 |       1.0034 |     1.003420 | localhost:63044
         9 |         90.00 |       9.1333 |     1.014809 | local
Time elapsed since server creation 2.072026968
0 active tasks, 8 cores

I see that localhost:63044 is the tunnel to the remote server. However, there is also a local node that handles most of the jobs (9 of 10), even though I did not add it to the servers argument. Could you please tell me why? How can I stop this local node from being created?

BTW, pathos is a very effective tool. The multiprocess part is very helpful.

mmckerns commented 7 years ago

You can set the number of nodes explicitly, so passing nodes=0 when you create the pool should work. You can also change this after the pool is created, through pool.nodes. Note that you can manually adjust pool.servers after creation as well.
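
A minimal sketch of both suggestions, applied to the setup from the question. The helper names (tunnel_server, make_remote_only_pool, restrict_to_remote) are hypothetical and exist only for illustration. The sketch assumes pathos is installed and that nodes=0 is accepted as "no local workers", as suggested above. It has not been checked against every pathos version.

```python
def tunnel_server(local_port):
    """Return a servers tuple for a ppserver reached through a local tunnel port."""
    return ('localhost:%d' % local_port,)


def make_remote_only_pool(local_port):
    """Create a ParallelPool with zero local workers (assumes pathos is installed)."""
    # Imported lazily so the helpers can be loaded without pathos present.
    from pathos.parallel import ParallelPool
    return ParallelPool(nodes=0, servers=tunnel_server(local_port))


def restrict_to_remote(pool, local_port):
    """Reconfigure an existing pool so that only the tunneled server is used."""
    pool.nodes = 0                            # drop the local workers
    pool.servers = tunnel_server(local_port)  # keep only the tunneled ppserver
    return pool
```

With the tunnel from the question, this would be called as make_remote_only_pool(tunnel._lport), or as restrict_to_remote(pool, tunnel._lport) on a pool that already exists. Running the map again and printing stats(pool) is one way to check whether any jobs are still attributed to the local job server.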

mmckerns commented 7 years ago

I'm closing this due to staleness in the comments. Please reopen it if you have further follow-up.