caracal-pipeline / caracal

Containerized Automated Radio Astronomy Calibration (CARACal) pipeline
GNU General Public License v2.0
27 stars 6 forks source link

(lots of) flagging steps #609

Closed paoloserra closed 4 years ago

paoloserra commented 5 years ago

If I execute the flagging worker on two input .MS files with only flag_autocorr, autoflag_rfi and flagging_summary enabled I get the list of containers below.

That is a lot more containers than those I was expecting. Could you please comment on what _randocab, _flagset_clear_static_flaggingi, _flagset_clear_automatic_flaggingi and _flagset_update_automatic_flaggingi do? I'd like to document this in readthedocs.

I've also noticed that _randocab has no i index that associates it to a specific input .MS file. Is that a problem?

meerkathi - 2019-10-02 09:36:34,200 CRITICAL - INFO:STIMELA-4:Adding cab 'pserra_cab/pycasacore' to recipe. The container will be named 'rando_cab'
meerkathi - 2019-10-02 09:36:34,309 CRITICAL - INFO:STIMELA-4:Adding cab 'pserra_cab/casa_flagdata' to recipe. The container will be named 'flag_autocorr_flagging_0'
meerkathi - 2019-10-02 09:36:34,388 CRITICAL - INFO:STIMELA-4:Adding cab 'pserra_cab/pycasacore' to recipe. The container will be named 'flagset_clear_static_flagging_0'
meerkathi - 2019-10-02 09:36:34,465 CRITICAL - INFO:STIMELA-4:Adding cab 'pserra_cab/pycasacore' to recipe. The container will be named 'flagset_clear_automatic_flagging_0'
meerkathi - 2019-10-02 09:36:35,024 CRITICAL - INFO:STIMELA-4:Adding cab 'pserra_cab/autoflagger' to recipe. The container will be named 'autoflag_flagging_0'
meerkathi - 2019-10-02 09:36:35,102 CRITICAL - INFO:STIMELA-4:Adding cab 'pserra_cab/pycasacore' to recipe. The container will be named 'flagset_update_automatic_flagging_0'
meerkathi - 2019-10-02 09:36:35,239 CRITICAL - INFO:STIMELA-4:Adding cab 'pserra_cab/casa_flagdata' to recipe. The container will be named 'flagging_summary_flagging_0_'
meerkathi - 2019-10-02 09:36:35,317 CRITICAL - INFO:STIMELA-4:Adding cab 'pserra_cab/pycasacore' to recipe. The container will be named 'rando_cab'
meerkathi - 2019-10-02 09:36:35,425 CRITICAL - INFO:STIMELA-4:Adding cab 'pserra_cab/casa_flagdata' to recipe. The container will be named 'flag_autocorr_flagging_1'
meerkathi - 2019-10-02 09:36:35,503 CRITICAL - INFO:STIMELA-4:Adding cab 'pserra_cab/pycasacore' to recipe. The container will be named 'flagset_clear_static_flagging_1'
meerkathi - 2019-10-02 09:36:35,580 CRITICAL - INFO:STIMELA-4:Adding cab 'pserra_cab/pycasacore' to recipe. The container will be named 'flagset_clear_automatic_flagging_1'
meerkathi - 2019-10-02 09:36:36,118 CRITICAL - INFO:STIMELA-4:Adding cab 'pserra_cab/autoflagger' to recipe. The container will be named 'autoflag_flagging_1'
meerkathi - 2019-10-02 09:36:36,219 CRITICAL - INFO:STIMELA-4:Adding cab 'pserra_cab/pycasacore' to recipe. The container will be named 'flagset_update_automatic_flagging_1'
meerkathi - 2019-10-02 09:36:36,328 CRITICAL - INFO:STIMELA-4:Adding cab 'pserra_cab/casa_flagdata' to recipe. The container will be named 'flagging_summary_flagging_1_'
KshitijT commented 5 years ago

flagset_clear_static_flagging_i clears shadow, spw, time, autocorr, quack flagging (these are well-defined, hence static, I think) for the i'th dataset. Automatic one does the same for aoflagger, I think.

paoloserra commented 5 years ago

So, for example, if I have a dataset where shadow flagging was already done and I only want to run autocorr and AOflagger I will lose the shadow flagging?

And why is the clearing of the autoflagging not optional (unless I've missed it)? What if I'm happy with some autoflagging done with awsome_strategy.rfis, and want to flag some more with even_more_awsome_strategy.rfis?

paoloserra commented 5 years ago

@KshitijT and do you know what rando_cab does?

KshitijT commented 5 years ago

So, for example, if I have a dataset where shadow flagging was already done and I only want to run autocorr and AOflagger I will lose the shadow flagging?

Yeah, looks like that.

And why is the clearing of the autoflagging not optional (unless I've missed it)? What if I'm happy with some autoflagging done with awsome_strategy.rfis, and want to flag some more with even_more_awsome_strategy.rfis?

It could be made optional - the logic would have been to do a clean flagging, otherwise at least in Aoflagger you might end up with non-optimal flagging if you take previous flagging in account. But I agree, it should be optional - but that's an extra button.

paoloserra commented 5 years ago

I'm also confused by the following. Before flag_autocorr, quack_flagging, flag_shadow, flag_spw, flag_time, flag_scan, flag_antennas, static_mask we have the following lines:

https://github.com/ska-sa/meerkathi/blob/e72aa002176c523d6cc8c6b7425810951cce3285/meerkathi/workers/flagging_worker.py#L121-L124

After those steps we have the following lines:

https://github.com/ska-sa/meerkathi/blob/e72aa002176c523d6cc8c6b7425810951cce3285/meerkathi/workers/flagging_worker.py#L279-L282

Isn't "substep" of the latter incorrect? Should it not be substep = 'flagset_update_static_{0:s}_{1:d}'.format(wname, i) ?

Now you've explained this, I'm also confused about why in my run the static flags are not cleared before running autocorr. I.e., there is not _flagset_clear_static_flagging0 container before the _flag_autocorr_flagging0 container.

paoloserra commented 5 years ago

OK, I think that _randocab is the one that clears the static flags ...

KshitijT commented 5 years ago

I think @SpheMakh is the best person to give the details of all this, since he implemented the flagsets and stuff?

paoloserra commented 5 years ago

OK, it's just a few typos here and there, I'm correcting them. But we should discuss the strategy (and possible additional options) for clearing and updating flags.

PeterKamphuis commented 5 years ago

I think those extra steps make sure that the casa flags are updated in the BITFLAG column. I'm not sure whether when clearing all flags get cleared or just the bitflag column and the later on an incremental casa flag set gets written to the bitflag column. I did run multiple flag antennas which seemed to stick from one run to the next.

paoloserra commented 5 years ago

Thanks @PeterKamphuis . I'd still like to hear a more detailed explanation in order to include it in the documentation at readthedocs. The only bit of info I could find was in https://github.com/ska-sa/meerkathi/pull/556 , but it's not very detailed. @SpheMakh ?

paoloserra commented 4 years ago

So, for example, if I have a dataset where shadow flagging was already done and I only want to run autocorr and AOflagger I will lose the shadow flagging?

Yeah, looks like that.

@SpheMakh I'm still troubled by this. In the short term, would you be OK with me adding a use_flag_manager option in both cross_cal and flagging worker? I would set it to True by default, so nothing would change for users who are fine with the current flag manager.

o-smirnov commented 4 years ago

@SpheMakh is ignoring us. Must be a good vacation.

In the short term, would you be OK with me adding a use_flag_manager option in both cross_cal and flagging worker?

Might be a good plan. I'm getting a little disillusioned by flagsets. Those bitwise flag columns are just too damn slow to read and write. I know it's my baby, but sometimes babies do need to be flushed out with the bathwater.

paoloserra commented 4 years ago

on my to do list for this week

SpheMakh commented 4 years ago

Fixed. I've switched to the CASA flag manager in flagging and cross_cal workers. Cubical handles its own flags, so I did not include the selfcal worker.

o-smirnov commented 4 years ago

@SpheMakh I'm not sure this is implemented right with flagmanager right now. For example, I've decided to try aoflagger instead of tricolour on my targets. I run the pipeline with -sw flagging__2. I see "flagset clear automatic flagging" being executed, doing:

2020-01-06 08:43:23 INFO    flagmanager:::: flagmanager(vis="/home/oms/msdir/1562500862_sdp_l0-A3556_corr.ms",mode="delete",versionname="flagging__2_automatic",oldname="",comment="",

While this will remove the flagversion file "flagging__2_automatic", it does nothing about the current state of FLAGs in the MS (whereas note the old bitflag logic: if you remove a flagset from the bitflags, the content of that flagset is removed from FLAG).

The correct behaviour should be to (a) remove the flagset, (b) restore FLAG to the way it was at the start of the worker. We're missing (b) right now, which means there's no way to truly revert!

paoloserra commented 4 years ago

This issue is now part of #760 , closing it.