charmplusplus / charm4py

Parallel Programming with Python and Charm++
https://charm4py.readthedocs.io
Apache License 2.0

AssertionError: Run with more than 1 PE to use charm.pool #184

Open karankakwani opened 3 years ago

karankakwani commented 3 years ago

When I run the example from https://github.com/UIUC-PPL/charm4py/blob/master/examples/parallel-map/square.py on my system, I get the following error:

[screenshot: AssertionError: Run with more than 1 PE to use charm.pool]

ZwFink commented 3 years ago

Can you paste the command you used to produce this output?

karankakwani commented 3 years ago

I just ran this snippet in pycharm.

ZwFink commented 3 years ago

Aah, I see. When starting Charm4Py scripts, you actually want to run them through the `charmrun.start` program that becomes available when you install Charm4Py. Here is an example of the command I used to run the above square.py script using 4 cores on my machine: `python3 -m charmrun.start +p4 ./square.py`

The `+p4` argument specifies that 4 PEs should be used for execution. You can think of a PE as a process that runs on one core of your machine, so 4 PEs will use 4 cores.

In the example, when charm.pool is used, 1 PE is reserved and does not act as a worker, according to the documentation. Therefore, when using charm.pool on a machine with n cores, you may want to consider passing n+1 as the value to the `+p` argument shown above.
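For reference, a minimal charm.pool run might look roughly like this (a sketch; `square` is a stand-in workload, and the guard around the charm4py import is only there so the snippet also loads where charm4py isn't installed):

```python
# Minimal charm.pool sketch. Launch with e.g.:
#   python3 -m charmrun.start +p5 ./pool_example.py
# (+p5 = 4 workers + 1 reserved PE on a 4-core machine)
try:
    from charm4py import charm
    HAVE_CHARM = True
except ImportError:  # charm4py not installed here; snippet is illustrative
    HAVE_CHARM = False

def square(x):
    return x * x

def main(args):
    # charm.pool.map distributes the square() calls over the worker PEs
    # and returns the results in order
    result = charm.pool.map(square, range(10))
    print(result)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
    charm.exit()

if HAVE_CHARM:
    charm.start(main)
```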

As for running this in PyCharm, you might be able to set up the run configuration such that it will allow you to run Charm4Py programs directly from within the IDE: https://www.jetbrains.com/help/pycharm/creating-and-editing-run-debug-configurations.html

karankakwani commented 3 years ago

`python3 -m charmrun.start +p4 ./square.py` — is this the only way to run it in order to exploit the benefits of charm4py? How would this work in a case where a main program launches a child process, and that child process needs to run in parallel?

karankakwani commented 3 years ago

@ZwFink So I tried running my own program the way you mentioned above, and I'm getting this error:

[screenshots: charmrun timeout error]

ZwFink commented 3 years ago

> How to do this in a case where there is a main program that launches a child process and this child process needs to run in parallel?

Can you explain what you mean by this a little further?

Regarding the timeout error, this issue happens because charmrun uses ssh to start processes. For this to work, you need to enable passwordless ssh on your machine, so that running `ssh localhost` doesn't require a password. What operating system are you using?
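For example, on Linux/macOS with OpenSSH, passwordless ssh to localhost can typically be set up along these lines (a sketch assuming default OpenSSH settings):

```shell
# Generate a key pair if you don't already have one
# (accept the defaults; leave the passphrase empty for fully passwordless login)
ssh-keygen -t ed25519

# Authorize that key for logins to this machine
ssh-copy-id localhost

# Verify: this should now print "ok" without prompting for a password
ssh localhost echo ok
```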

ZwFink commented 3 years ago

I should mention that this is a gap in the Charm4Py documentation; I will make sure it gets added.

karankakwani commented 3 years ago

@ZwFink Thanks for your reply and clarification.

To explain my earlier question further:

I have a main script, say `main.py`. This `main.py` spawns a child process, say `childprocess.py`, and this child process needs to exploit charm4py to achieve faster execution.

E.g. main.py

from multiprocessing import Process

child_process = Process(target=(...), args=(...))
child_process.start()
# <some other code>
child_process.join()

How to use charm4py in this case?

ZwFink commented 3 years ago

Is this an existing codebase where it's not possible/feasible to make main.py a Charm4Py program and perform the processing of both child/parent processes therein? Doing this would be nice as the Charm++ scheduler can schedule all of the work the program does, which may improve performance. The fork/join model you propose above is trivial to express in Charm4Py.

Otherwise, in the above example, can `child_process` not simply be a call that invokes a Charm4Py program, such as `python3 -m charmrun.start +p4 ./square.py`?
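For instance, the parent could launch the Charm4Py program with the standard-library `subprocess` module (a sketch; `./square.py` and the PE count are placeholders):

```python
import subprocess
import sys

def charmrun_cmd(script, pes):
    """Build the charmrun.start command line for a given script and PE count."""
    return [sys.executable, "-m", "charmrun.start", f"+p{pes}", script]

def run_charm_program(script, pes=4):
    # Blocks until the Charm4Py program finishes, like child_process.join()
    return subprocess.run(charmrun_cmd(script, pes), check=True)

# Usage (would actually launch the program):
#   run_charm_program("./square.py", pes=4)
```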

karankakwani commented 3 years ago

"The fork/join model you propose above is trivial to express in Charm4Py."

Can you please share an example?

ZwFink commented 3 years ago

Sure! I transformed your example above into a Charm4Py program:


from charm4py import charm, Chare, Future, Array, Reducer, coro
import time

class ChildProcess(Chare):
    def __init__(self, arg, doneFuture):
        self.arg = arg
        self.doneFuture = doneFuture

    @coro
    def start(self):
        # our "work" here is just to perform a reduction and send it to the
        # future, but anything can be done
        self.reduce(self.doneFuture, self.thisIndex[0], Reducer.sum)

        # if the main process doesn't need the result, we may simply do:
        # self.reduce(self.doneFuture)

def main(args):
    childProcessFuture = Future()

    childProcessArg = 1
    # child_process = Process(target=(...), args=(...))
    # Here we specify that 20 chares will be created to perform the work.
    childProc = Array(ChildProcess, 20, args=[childProcessArg, childProcessFuture])

    # alternatively, the constructor of ChildProcess could perform the work itself, which saves some message passing
    # this represents the child process creation that does the work
    # child_process.start()
    childProc.start()

    # some other code, in this case just sleep
    # <some other code>
    time.sleep(3)

    # similar to 'child_process.join()' in the example you provided
    # child_process.join()
    childResult = childProcessFuture.get()

    print(f'Child result: {childResult}')
    charm.exit()

charm.start(main)

In the above example, the code executed by the childProc.start() call will happen asynchronously and in parallel with the code executed in <some other code>, which is what you were achieving above. The work will be distributed among the processors that you use for execution. The call to childProcessFuture.get() is blocking and will return as soon as the array childProc has finished processing.

After saving the above in `example.py`, the following command executes the code using 4 processors: `python3 -m charmrun.start +p4 ./example.py`. I can also provide an example using charm.pool if you think that would be good to see.
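For what it's worth, a charm.pool version of the same fork/join pattern might look roughly like this (a sketch; `work` is a stand-in for the child's task, and the import guard only keeps the snippet loadable where charm4py isn't installed):

```python
try:
    from charm4py import charm
    HAVE_CHARM = True
except ImportError:  # charm4py not installed here; snippet is illustrative
    HAVE_CHARM = False

import time

def work(x):
    return x * x

def main(args):
    # map_async returns immediately with a future ("fork")
    fut = charm.pool.map_async(work, range(20))
    time.sleep(3)       # <some other code> runs concurrently with the pool
    result = fut.get()  # blocks until all tasks finish ("join")
    print(sum(result))
    charm.exit()

if HAVE_CHARM:
    charm.start(main)
```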

ZwFink commented 3 years ago

@karankakwani Just following up on this. Have you been able to use something similar to the above to solve your problem?