e-merlin / eMERLIN_CASA_pipeline

This is the CASA eMERLIN pipeline for calibrating data from the e-MERLIN array. Please fork the repository before making any changes and read the Coding Practices page in the wiki. Please report problems with the pipeline in the issues tab.
GNU General Public License v3.0

Alternatives to casaplotms! #84

Closed jradcliffe5 closed 5 years ago

jradcliffe5 commented 6 years ago

Does anyone know of any alternatives to casaplotms? The pipeline's most serious bottleneck seems to be the plotting. This may be prohibitive, especially when reducing L-band data, which is about 300 GB per 24 h.

I'm sure that LOFAR must have some alternative that we could port.

jradcliffe5 commented 6 years ago

@jmoldon could we use this, at least on the version of the pipeline run by user support? https://github.com/haavee/jiveplot

jmoldon commented 6 years ago

Summary: plotting is now much faster, by roughly a factor of 5-10. It supports multiprocessing to plot all fields in parallel, but it is currently forced to 1 process because I have seen it fail.

@jradcliffe5 you may be interested in the latest version (v0.7.3 from d02c073acacf0bc108195fc54d840f18e2271ec5). I have rewritten all the steps related to plotting visibilities (plot_data, plot_corrected and uvplots). One of the main limitations was the number of loops (4 x field x baseline), so I have reduced that by using iteraxis='baseline'. I didn't like that because it sometimes deforms the x-axis labels for times, but it is much faster, of course.
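To get a feel for why collapsing the baseline loop matters, here is a small sketch that only counts plotting calls; it does not measure run time. The function names and the numbers of fields and antennas are illustrative assumptions, and the leading 4 is read here as "plots per field and baseline".

```python
def n_baselines(n_antennas):
    """Number of distinct antenna pairs (cross-correlations only)."""
    return n_antennas * (n_antennas - 1) // 2


def count_plot_calls(n_fields, n_antennas, plots_per_item=4, iterate_in_task=False):
    """Count separate plotting calls for one dataset.

    iterate_in_task=False: one call per plot type, field and baseline.
    iterate_in_task=True:  the plotting task iterates over baselines itself
                           (as with iteraxis='baseline'), so the baseline
                           loop disappears from the calling script.
    """
    calls = plots_per_item * n_fields
    if not iterate_in_task:
        calls *= n_baselines(n_antennas)
    return calls


if __name__ == "__main__":
    # Assumed example values: 5 fields, 7 antennas (21 baselines).
    print(count_plot_calls(5, 7))                        # 420
    print(count_plot_calls(5, 7, iterate_in_task=True))  # 20
```

The reduction in the number of calls is larger than the speed-up quoted in this thread, which is expected: each remaining call still has to read and plot the data for every baseline.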

For example, for a 24 h L-band dataset averaged to 2 s and 128 chan/spw, the time for plot_data and plot_corrected has been reduced from more than 8 hours to about 1 h. (The uvplots step does not improve because all baselines are plotted together.) Depending on the file, I see improvements of between a factor of 5 and 10 in time. If you can do some tests, could you check whether this new approach also reduces the times to something feasible for your data sets?

Also, I have pushed it even further by using multiprocessing. Because each plot was not really saturating the memory, disk or CPU, I tried plotting all fields in parallel. That brought the time down to about 30 minutes instead of 8 h! It also improves the uvplots step, although not as much.
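The per-field parallelism described above can be sketched roughly as follows. This is a minimal illustration and not the pipeline's actual code: plot_field is a hypothetical stand-in for one per-field plotting job, and all field names except 1407+284 are made up.

```python
import multiprocessing


def plot_field(field):
    """Hypothetical stand-in for one per-field plotting job.

    A real job would call the plotting task for this field and write an
    image; here we only return the file name it would produce.
    """
    return "{0}_plot.png".format(field)


def plot_all_fields(fields, num_proc=1):
    """Run plot_field once per field using num_proc worker processes."""
    pool = multiprocessing.Pool(num_proc)
    try:
        # map() blocks until every field is done and keeps the input order.
        return pool.map(plot_field, fields)
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    fields = ["1407+284", "target_A", "phasecal_B"]  # last two are made up
    print(plot_all_fields(fields, num_proc=2))
```

Keeping num_proc as an explicit argument makes it easy to fall back to a single process if the parallel run misbehaves.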

However, the multiprocessing seems to fail in some cases. For example, when I imaged a single dataset I had no problem. But when launching several pipelines sequentially, the plotting stopped without crashing, with this error:

QDBusError("org.freedesktop.DBus.Error.InvalidSignature", "Unexpected reply signature: got "b" (bool),
 expected "s" (QString)") 
Type= 18 
Timeout= 12000000 true  (B) 
QtTripped= true  version= 264197 

The error is a mystery; I don't know the cause. I suspect different things could trigger it: maybe it has to do with the different processes trying to use all the memory? But I don't see an increase in memory. Maybe it is a problem with reading an MS several times in parallel? But I didn't have problems with a big dataset on my computer. @varenius any idea on that?

In any case, the current version will only use 1 process at a time. If you want to make it faster, you can change lines 195 and 234 of eMERLIN_CASA_plots.py, or call pool = multiprocessing.Pool(num_proc) without specifying any num_proc. This is the part of the code:

https://github.com/e-merlin/CASA_eMERLIN_pipeline/blob/916b4076aff50bbb7db5f464a4d251f78932fb20/functions/eMERLIN_CASA_plots.py#L195
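For reference, a minimal sketch of what leaving num_proc unset means. Python's multiprocessing.Pool treats processes=None as "use the machine's CPU count" (the exact call it uses differs a little between Python versions). The helper name effective_workers is hypothetical and only approximates that rule; it is not part of the pipeline.

```python
import multiprocessing


def effective_workers(num_proc=None):
    """Approximate number of workers multiprocessing.Pool(num_proc) starts.

    None means "let Pool decide", which is roughly one worker per CPU.
    """
    if num_proc is None:
        return multiprocessing.cpu_count()
    if num_proc < 1:
        # Pool itself rejects values below 1 with a ValueError.
        raise ValueError("num_proc must be at least 1")
    return num_proc


if __name__ == "__main__":
    print(effective_workers(1))     # forced single process
    print(effective_workers(None))  # one worker per CPU on this machine
```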

Also, this is the new format for the visibility plots, not what I would prefer but still good: http://almanas.jb.man.ac.uk/jmoldon/tests/casa_pipeline/test7/weblog/plots_corrected_1407+284.html

varenius commented 6 years ago

This sounds amazing, well done!

Regarding the error: I don't have an answer. The only thing I can think of is that some Qt wrapper opened by the first plotting call is not thread-safe. I remember running into a similar bug a few weeks ago when trying to produce matplotlib plots with multiprocessing; I got tkinter-related errors. Since you have QString in the error, I guess plotms uses Qt as its GUI API, but the error may be of similar origin. Perhaps running this on a computer with a newer version of the Qt libraries could help? Assuming that this issue has been addressed, that is.

jmoldon commented 6 years ago

Good guess, maybe it is a Qt problem, and it may be related to #76 if something is still not closing correctly. The weird thing is that I have seen it when I send different casa+pipeline commands in bash on different data sets, but not necessarily when I run one casa+pipeline and, once it has finished, run another one, although both should be equivalent.

Also, I'm running CASA 4.7 (because of the problems with gaincal). I still don't know if it happens in later versions.

I should try different combinations of plots, versions, batch processing, etc. to see when this happens, and whether it is related to Qt not completely closing everything in time or to something else. If you have similar problems, let me know. Remember that you need num_proc to be None (the default) for that to happen.

jmoldon commented 5 years ago

More efficient plots are used. No plans to implement external tools.