e-merlin / eMERLIN_CASA_pipeline

This is the CASA e-MERLIN pipeline for calibrating data from the e-MERLIN array. Please fork the repository before making any changes and read the Coding Practices page in the wiki. Please report problems with the pipeline in the Issues tab.
GNU General Public License v3.0

Aoflagger performance checks #96

Closed: jradcliffe5 closed this issue 6 years ago

jradcliffe5 commented 6 years ago

Just a quick enquiry.

Javier, have you checked the performance of aoflagger with regard to splitting the flagging into subbands? Especially when you have enough RAM to load the whole dataset at once?

jmoldon commented 6 years ago

Yes, I ran several tests. These are the conclusions:

So the default in the pipeline is to divide the data into bands. The only case where I saw this being slower is with very small data sets (about 50 GB or less), for which the overhead of opening the file so many times (num_sources x bands) is greater than that of opening it only num_sources times. For those cases you can select what to do in the inputs file:

flag_aoflagger = 1 means process bands individually
flag_aoflagger = 2 means process all bands together

See the documentation.
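As a rough illustration of the overhead argument above, here is a minimal sketch that counts how many times the file gets opened in each mode. The function name `count_file_opens` and the band count of 4 are assumptions made for the example, not taken from the pipeline; only the two `flag_aoflagger` values and the num_sources x bands reasoning come from the comment.

```python
# Sketch only: counts file openings per flagging mode.
# The flag_aoflagger values (1 and 2) follow the comment above; the function
# name and the number of bands used below are illustrative assumptions.

def count_file_opens(num_sources: int, num_bands: int, flag_aoflagger: int) -> int:
    """Return how many times the data file is opened for a given mode."""
    if flag_aoflagger == 1:
        # Bands processed individually: one opening per source per band.
        return num_sources * num_bands
    if flag_aoflagger == 2:
        # All bands processed together: one opening per source.
        return num_sources
    raise ValueError("flag_aoflagger must be 1 or 2 in this sketch")


# 5 sources as in the example data set; 4 bands is an assumed value.
opens_per_band = count_file_opens(num_sources=5, num_bands=4, flag_aoflagger=1)
opens_together = count_file_opens(num_sources=5, num_bands=4, flag_aoflagger=2)
print(opens_per_band, opens_together)
```

With small files the fixed cost of each opening can dominate, which is why mode 2 may be the better choice there.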

This is an example of the system usage for a big file (225 GB FITS) with enough memory (256 GB). There are 5 sources: three small ones at the beginning (about the first 10 min), one medium phase calibrator (about 20 min) and the target source (the rest):

Iterations, one spw at a time (1h06m46s): [plot: individual_spw]

No iterations, all spw together (1h4m10s): [plot: all_spw]
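To put the two run times quoted above side by side, a small sketch like the following converts them to seconds and compares them. The helper `to_seconds` is a hypothetical name written for this example; the two durations are the only values taken from the thread.

```python
import re


def to_seconds(duration: str) -> int:
    """Convert a string such as '1h06m46s' to a number of seconds."""
    match = re.fullmatch(r"(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?", duration)
    if match is None or not any(match.groups()):
        raise ValueError(f"unrecognised duration: {duration!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds


per_spw = to_seconds("1h06m46s")  # one spw at a time
all_spw = to_seconds("1h4m10s")   # all spw together

difference = per_spw - all_spw
relative = difference / all_spw
print(f"difference: {difference} s ({relative:.1%} of the all-spw run)")
```

For this single run the two modes differ by 156 s, about 4%. That is one measurement on one data set rather than a benchmark.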

jradcliffe5 commented 6 years ago

OK, great. Thanks Javier! I'll close this issue then. Just to let you know, I'm going to edit the front page to add information on easily installing aoflagger/wsclean using the conda installs I have.