clawpack / visclaw

Clawpack visualization tools
http://www.clawpack.org
BSD 3-Clause "New" or "Revised" License

Matplotlib Parallel Plotting Capabilities #170

Closed mandli closed 8 years ago

mandli commented 8 years ago

This PR should act as a prototype for using multiple processes for plotting in VisClaw. The basic methodology goes like this:

  1. The first call to plotclaw.py from the command line (or elsewhere) checks whether the plotdata object's plotdata.parallel is set to True. If it is, and the additional argument frames to plotclaw.plotclaw is None, the routine assumes that this is the first call to the plot function and takes on the role of dividing the frames among the plotdata.num_procs requested processes.
  2. A new subprocess is spawned for each requested process, with an additional argument to plotclaw.py listing the frames it should plot.
  3. The originating process waits around while the other processes plot the data.

This is admittedly very "hacky", but because pickling limits what the multiprocessing module can handle, this was the one way I could find to do it with minimal intrusion into VisClaw.
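
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the round-robin split, the function names, and the stand-in worker command (which only prints the frames it was given instead of calling plotclaw.py) are all assumptions made for the example.

```python
import subprocess
import sys


def split_frames(frames, num_procs):
    """Deal frames out round-robin, one list per worker (assumed scheme)."""
    return [frames[rank::num_procs] for rank in range(num_procs)]


def worker_command(frames):
    """Build the argument list for one worker.

    Hypothetical stand-in for the real plotclaw.py call: the child
    process only prints the frame numbers it received.
    """
    code = "import sys; print('plotting frames', sys.argv[1:])"
    return [sys.executable, "-c", code] + [str(n) for n in frames]


def plot_in_parallel(frames, num_procs):
    """Spawn one subprocess per non-empty chunk and wait for all of them."""
    chunks = [c for c in split_frames(frames, num_procs) if c]
    procs = [subprocess.Popen(worker_command(c)) for c in chunks]
    # The originating process does no plotting; it only waits.
    return [p.wait() for p in procs]


if __name__ == "__main__":
    return_codes = plot_in_parallel(list(range(11)), 4)
    print("worker return codes:", return_codes)
```

Passing the command as an argument list rather than a single formatted string means no shell is involved in parsing it.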

mandli commented 8 years ago

If you want to test this out you need to add the following lines to any setplot.py:

plotdata.parallel = True
plotdata.num_procs = 4

rjleveque commented 8 years ago

Thanks @mandli! I just tried this out and it seems to work great, except that the list of frames in _PlotIndex.html shows only frames 4, 8, 12, 16. The animation contains all frames.

mandli commented 8 years ago

Yeah, I have not yet figured out how to create the plot pages only once and with all the frames. The gauges are also being plotted by every process. We cannot assume anything about the frames requested (for example, that whichever process is assigned frame 0 gets to do the one-time plotting). I now think the best approach is not to pass a set of frames to plot but to pass the process number, so that we can assign certain tasks to process 0, for instance. This would require more mucking about in plotpages but is probably the way to go.
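
A rough sketch of the process-number idea, assuming each worker derives its own frames from its rank and only rank 0 takes the one-time jobs. The function name, the task labels, and the round-robin split are hypothetical and are not taken from plotpages.

```python
def tasks_for(rank, num_procs, frames):
    """List the work one process would do, given only its rank.

    Hypothetical helper: every rank plots its own share of the frames,
    and rank 0 additionally handles work that must happen exactly once.
    """
    tasks = [("frame", n) for n in frames[rank::num_procs]]
    if rank == 0:
        # One-time jobs: gauge plots and an index page covering ALL frames.
        tasks.append(("gauges", None))
        tasks.append(("html_index", tuple(frames)))
    return tasks


if __name__ == "__main__":
    for rank in range(4):
        print(rank, tasks_for(rank, 4, list(range(11))))
```

Because every process is told the full frame list and its rank, the index page can list all frames no matter which frames rank 0 happens to plot.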

mandli commented 8 years ago

I think this is getting close to being usable. Does anyone want to test this on their problem to make sure this does not break anything for others?

ketch commented 8 years ago

With this branch checked out, even serial plotting fails for me:

examples/acoustics_1d_homogeneous - [master●] » python acoustics_1d.py htmlplot=1
2016-01-28 11:33:26,375 INFO CLAW: Solution 0 computed for time t=0.000000
2016-01-28 11:33:26,380 INFO CLAW: Solution 1 computed for time t=0.100000
2016-01-28 11:33:26,384 INFO CLAW: Solution 2 computed for time t=0.200000
2016-01-28 11:33:26,389 INFO CLAW: Solution 3 computed for time t=0.300000
2016-01-28 11:33:26,394 INFO CLAW: Solution 4 computed for time t=0.400000
2016-01-28 11:33:26,399 INFO CLAW: Solution 5 computed for time t=0.500000
2016-01-28 11:33:26,403 INFO CLAW: Solution 6 computed for time t=0.600000
2016-01-28 11:33:26,408 INFO CLAW: Solution 7 computed for time t=0.700000
2016-01-28 11:33:26,413 INFO CLAW: Solution 8 computed for time t=0.800000
2016-01-28 11:33:26,418 INFO CLAW: Solution 9 computed for time t=0.900000
2016-01-28 11:33:26,423 INFO CLAW: Solution 10 computed for time t=1.000000
Executed setplot successfully
/bin/sh: function: No such file or directory
Will plot 11 frames numbered: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Will make 1 figure(s) for each frame, numbered:  [1]

-----------------------------------

Creating html pages for figures...

Directory '/Users/ketch/Research/Software/clawpack/pyclaw/examples/acoustics_1d_homogeneous/_plots'
    already exists, files may be overwritten
Now making png files for all figures...

-----------------------------------

Creating latex file...
Directory '/Users/ketch/Research/Software/clawpack/pyclaw/examples/acoustics_1d_homogeneous/_plots'
    already exists, files may be overwritten

Latex file created:
  /Users/ketch/Research/Software/clawpack/pyclaw/examples/acoustics_1d_homogeneous/_plots/plots.tex

Use pdflatex to create pdf file
Traceback (most recent call last):
  File "acoustics_1d.py", line 143, in <module>
    output = run_app_from_main(setup,setplot)
  File "/Users/ketch/Research/Software/clawpack/clawpack/pyclaw/util.py", line 129, in run_app_from_main
    pyclaw.plot.html_plot(outdir=outdir,setplot=setplot)
  File "/Users/ketch/Research/Software/clawpack/clawpack/pyclaw/plot.py", line 76, in html_plot
    iplot=False)
  File "/Users/ketch/Research/Software/clawpack/clawpack/pyclaw/plot.py", line 63, in plot
    setplot=setplot_func)
  File "/Users/ketch/Research/Software/clawpack/clawpack/visclaw/plotclaw.py", line 127, in plotclaw
    plotpages.plotclaw_driver(plotdata, verbose=False, format=format)
  File "/Users/ketch/Research/Software/clawpack/clawpack/visclaw/plotpages.py", line 2666, in wrapper
    return f(*args, **kwds)
  File "/Users/ketch/Research/Software/clawpack/clawpack/visclaw/plotpages.py", line 2977, in plotclaw_driver
    im = plt.imshow(Image.imread(filenames[0]))
IndexError: list index out of range

I'm not sure where the failing shell command is.

mandli commented 8 years ago

I also found this on a different machine. It must have something to do with the way we are constructing the frame list.

mandli commented 8 years ago

So I fixed the shell command problem by checking both the parallel attribute and that the number of processes requested is not zero. The index out of range problem is still there, however, and I have a sneaking suspicion it has to do with the private status variable. @rjleveque can you take a look and see what may be wrong here?

mandli commented 8 years ago

Never mind on the shell problem: my change removes it for serial plotting but introduces it in parallel. It is telling that the shell error appears 4 times when using num_procs = 4.

rjleveque commented 8 years ago

Here's the problem with the shell command: plotclaw.py assumes setplot is a file name, but in @ketch's PyClaw example it is a function defined in the script. The line

                plot_cmd = "%s %s %s %s" % (plotclaw_cmd,
                                            outdir,
                                            plotdir,
                                            setplot)

constructs something like

plot_cmd =  python plotclaw.py .....  <function <lambda> at 0x102cf2ed8> 5

which can't be passed to the shell.

The fix is to make sure plotdata.parallel = False in this case. I'll do a PR to @mandli's branch.

If we want to support parallel plotting when plotdata.setplot is a function, we'll have to do something different than the current use of subprocess.
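
The failure can be reproduced in isolation. When a function is formatted with %s, the result contains text such as `<function <lambda> at 0x...>`; the shell treats the leading `<` as input redirection and tries to read from a file named `function`, which matches the "/bin/sh: function: No such file or directory" message in the log above. The guard below is only an assumed illustration of the "fall back to serial" fix; its name and signature are hypothetical and it is not VisClaw's actual check.

```python
def can_use_subprocess(setplot, parallel, num_procs):
    """Return True only if setplot can be named on a command line.

    Hypothetical guard: a function object cannot be passed through the
    shell, so parallel plotting is allowed only for a file-name string.
    """
    return bool(parallel) and bool(num_procs) and isinstance(setplot, str)


# A setplot defined inside the driver script, as in the PyClaw example.
setplot_func = lambda plotdata: plotdata

# %s formatting inserts the function's repr text into the command.
bad_cmd = "%s %s" % ("python plotclaw.py", setplot_func)
print(bad_cmd)

print(can_use_subprocess("setplot.py", True, 4))   # file name: allowed
print(can_use_subprocess(setplot_func, True, 4))   # function: fall back to serial
```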

ketch commented 8 years ago

@mandli Have you considered using pathos.multiprocessing or IPython.parallel (both of which can use dill instead of pickle)? See http://matthewrocklin.com/blog/work/2013/12/05/Parallelism-and-Serialization/.

mandli commented 8 years ago

Yeah, I quickly ran into the pickle problem. I tried pathos.multiprocessing but could not get it to work. I had not considered IPython.parallel though, something to look into.
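
For anyone following along, here is one common form of the pickle limitation. It is not necessarily the exact failure hit in this PR. The standard pickle module stores functions by reference to an importable name, so a lambda (or any function without such a name) cannot be serialized and therefore cannot be sent to a multiprocessing worker. The helper name below is made up for the example.

```python
import pickle


def is_picklable(obj):
    """Report whether the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


# A lambda has no importable name for pickle to refer to.
lambda_setplot = lambda plotdata: plotdata

if __name__ == "__main__":
    print("file name string:", is_picklable("setplot.py"))
    print("lambda setplot:  ", is_picklable(lambda_setplot))
```

A file-name string always pickles, which is one reason handing workers a path to setplot.py sidesteps the issue.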

ketch commented 8 years ago

I'm fine with this being merged, but we should add an issue to the tracker for making it work generally.