ratt-ru / CubiCal

A fast radio interferometric calibration suite.
GNU General Public License v2.0

Applying calibration from averaged ms leads to 0's in corrected visibilities #336

Closed PeterKamphuis closed 4 years ago

PeterKamphuis commented 4 years ago

I have tried to apply a calibration table from an ms where the frequency channels are averaged to the un-averaged ms. However, this results in 0's in the corrected visibilities in areas where there seem to be no solutions due to flags in the averaged set (or simply no solutions at all). For more background see https://github.com/ska-sa/meerkathi/issues/627

I tested by having an averaged set of GMRT-GSB data in which 4 channels are binned together. This set is then additionally flagged and calibrated with cubical with the settings listed at the bottom of this post.

I have applied the calibration three ways: load-from with the same solution intervals (scaled by 4 in frequency), xfer-from with the same solution intervals, and xfer-from interpolating to time-int 1 and freq-int 1. The latter is the worst, with large 0 padding all around the data (see below). I have tried this in the meerkathi master, in my own branch, and in a standalone cubical installed from master today. All give the same result when not interpolating, but slightly different results when interpolating.

[images: corrected visibilities on baseline 0-1 for the averaged dataset, and for the non-averaged dataset with calibration applied via load-from, xfer-from without interpolation, and xfer-from with interpolation]

Parset to make the calibration table:

[data]
_Help = Visibility data options
ms = ../msdir/Quick_22Sep-2012AP_corravg.ms
column = DATA
time-chunk = 64
freq-chunk = 0
rebin-time = 1
rebin-freq = 1
chunk-by = SCAN_NUMBER
chunk-by-jump = 1
single-chunk = 
single-tile = -1

[sel]
_Help = Data selection options
field = 0
ddid = None
taql = 
chan = 
diag = True

[out]
_Help = Options for output products
dir = cubical
name = Create_Alone
overwrite = True
backup = 1
mode = sc
apply-solver-flags = True
column = CORRECTED_DATA
derotate = None
model-column = 
weight-column = 
reinit-column = False
subtract-model = 0
subtract-dirs = :
plots = 1
casa-gaintables = True

[model]
_Help = Calibration model options
list = MODEL_DATA
ddes = never
beam-pattern = None
beam-l-axis = None
beam-m-axis = None
feed-rotate = auto
pa-rotate = True

[montblanc]
_Help = Montblanc simulation options
device-type = CPU
dtype = float
mem-budget = 1024
verbosity = WARNING
threads = 0
pa-rotate = None

[weight]
_Help = Weighting options
column = WEIGHT
fill-offdiag = False
legacy-v1-2 = False

[flags]
_Help = General flagging options
apply = -cubical
auto-init = legacy
save = cubical
save-legacy = auto
reinit-bitflags = False
warn-thr = 0.3
see-no-evil = 0

[degridding]
_Help = Options for the degridder. Only in use when predicting from DicoModels using DDFacet
OverS = 11
Support = 7
Nw = 100
wmax = 0.0
Padding = 1.7
NDegridBand = 16
MaxFacetSize = 0.25
MinNFacetPerAxis = 1
NProcess = 8

[postmortem]
_Help = Options for "postmortem" flagging based on solution statistics
enable = False
tf-chisq-median = 1.2
tf-np-median = 0.5
time-density = 0.5
chan-density = 0.5
ddid-density = 0.5

[madmax]
_Help = Options for the "Mad Max" flagger
enable = 1
residuals = 0
estimate = corr
diag = True
offdiag = True
threshold = [0, 10]
global-threshold = [0, 12]
plot = 1
plot-frac-above = 0.01
plot-bl = 
flag-ant = 0
flag-ant-thr = 5

[sol]
_Help = Solution options which apply at the solver level
jones = G
precision = 32
delta-g = 1e-06
delta-chi = 1e-06
chi-int = 5
last-rites = True
stall-quorum = 0.99
term-iters = [5, 0]
min-bl = 210.0
max-bl = 0
subset = 

[bbc]
_Help = Options for baseline-based corrections (a.k.a. BBCs, a.k.a. interferometer gains).
load-from = 
compute-2x2 = False
apply-2x2 = False
save-to = bbc-gains-2-Quick_26Sep-2012AP_corravg.parmdb
per-chan = True
plot = True

[dist]
_Help = Parallelization and distribution options
ncpu = 6
nworker = 0
nthread = 0
max-chunks = 4
min-chunks = 0
pin = 0
pin-io = False
pin-main = io

[log]
_Help = Options related to logging
memory = True
stats = chi2:.3f
stats-warn = chi2:10
boring = True
append = False
verbose = 0
file-verbose = None

[debug]
_Help = Debugging options for the discerning masochist
pdb = False
panic-amplitude = 0.0
stop-before-solver = False
escalate-warnings = False

[misc]
_Help = Miscellaneous options
random-seed = None
parset-version = 0.1

[JONES-TEMPLATE]
_Help = Options for {LABEL}-Jones term
_NameTemplate = {LABEL}
_ExpandedFrom = --sol-jones
_OtherTemplates = _Help:label
label = {LABEL}
solvable = 1
type = complex-2x2
load-from = 
xfer-from = 
save-to = {out[name]}-{JONES}-field_{sel[field]}-ddid_{sel[ddid]}.parmdb
dd-term = False
fix-dirs = 
update-type = full
time-int = 1
freq-int = 1
max-prior-error = 0.1
max-post-error = 0.1
low-snr-warn = 75
high-gain-var-warn = 30
clip-low = 0.1
clip-high = 10.0
clip-after = 5
max-iter = 20
epsilon = 1e-06
delta-chi = 1e-06
conv-quorum = 0.99
ref-ant = None
prop-flags = default
estimate-pzd = False
diag-only = 0
offdiag-only = False
robust-cov = compute
robust-scale = 1
robust-npol = 2
robust-int = 1
robust-save-weights = 0

[g]
_Help = Options for G-Jones term
label = G
solvable = 1
type = phase-diag
load-from = 
xfer-from = 
save-to = Input_Avg_Data.parmdb
dd-term = 0
fix-dirs = 
update-type = phase-diag
time-int = 4
freq-int = 64
max-prior-error = 0.25
max-post-error = 0.25
low-snr-warn = 75
high-gain-var-warn = 30
clip-low = 0.1
clip-high = 2.5
clip-after = 5
max-iter = 20
epsilon = 1e-06
delta-chi = 1e-06
conv-quorum = 0.99
ref-ant = None
prop-flags = default
estimate-pzd = False
diag-only = 0
offdiag-only = False
robust-cov = compute
robust-scale = 1
robust-npol = 2
robust-int = 1
robust-save-weights = 0
_Templated = True

[de]
_Templated = 1
dd-term = 1
clip-low = 0.0
clip-high = 0
delta-chi = 1e-05
max-prior-error = 0.44
max-post-error = 0.44

Parset for applying with load-from:


[data]
_Help = Visibility data options
ms = ../msdir/Quick_22Sep-2012AP_corr.ms
column = DATA
time-chunk = 64
freq-chunk = 0
rebin-time = 1
rebin-freq = 1
chunk-by = SCAN_NUMBER
chunk-by-jump = 1
single-chunk = 
single-tile = -1

[sel]
_Help = Data selection options
field = 0
ddid = None
taql = 
chan = 
diag = True

[out]
_Help = Options for output products
dir = cubical
name = LoadFromOutput
overwrite = True
backup = 1
mode = ac
apply-solver-flags = True
column = CORRECTED_DATA
derotate = None
model-column = 
weight-column = 
reinit-column = False
subtract-model = 0
subtract-dirs = :
plots = 1
casa-gaintables = False

[model]
_Help = Calibration model options
list = 
ddes = never
beam-pattern = None
beam-l-axis = None
beam-m-axis = None
feed-rotate = auto
pa-rotate = True

[montblanc]
_Help = Montblanc simulation options
device-type = CPU
dtype = float
mem-budget = 1024
verbosity = WARNING
threads = 0
pa-rotate = None

[weight]
_Help = Weighting options
column = WEIGHT
fill-offdiag = False
legacy-v1-2 = False

[flags]
_Help = General flagging options
apply = -cubical
auto-init = legacy
save = cubical
save-legacy = auto
reinit-bitflags = False
warn-thr = 0.3
see-no-evil = 0

[degridding]
_Help = Options for the degridder. Only in use when predicting from DicoModels using DDFacet
OverS = 11
Support = 7
Nw = 100
wmax = 0.0
Padding = 1.7
NDegridBand = 16
MaxFacetSize = 0.25
MinNFacetPerAxis = 1
NProcess = 8

[postmortem]
_Help = Options for "postmortem" flagging based on solution statistics
enable = False
tf-chisq-median = 1.2
tf-np-median = 0.5
time-density = 0.5
chan-density = 0.5
ddid-density = 0.5

[madmax]
_Help = Options for the "Mad Max" flagger
enable = 1
residuals = 0
estimate = corr
diag = True
offdiag = False
threshold = [0, 10]
global-threshold = [0, 12]
plot = 0
plot-frac-above = 0.01
plot-bl = 
flag-ant = 0
flag-ant-thr = 5

[sol]
_Help = Solution options which apply at the solver level
jones = G
precision = 32
delta-g = 1e-06
delta-chi = 1e-06
chi-int = 5
last-rites = True
stall-quorum = 0.99
term-iters = [5, 0]
min-bl = 0.0
max-bl = 0
subset = 

[bbc]
_Help = Options for baseline-based corrections (a.k.a. BBCs, a.k.a. interferometer gains).
load-from = 
compute-2x2 = False
apply-2x2 = False
save-to = file
per-chan = True
plot = True

[dist]
_Help = Parallelization and distribution options
ncpu = 6
nworker = 0
nthread = 0
max-chunks = 0
min-chunks = 0
pin = 0
pin-io = False
pin-main = io

[log]
_Help = Options related to logging
memory = True
stats = chi2:.3f
stats-warn = chi2:10
boring = False
append = False
verbose = 0
file-verbose = None

[debug]
_Help = Debugging options for the discerning masochist
pdb = False
panic-amplitude = 0.0
stop-before-solver = False
escalate-warnings = False

[misc]
_Help = Miscellaneous options
random-seed = None
parset-version = 0.1

[JONES-TEMPLATE]
_Help = Options for {LABEL}-Jones term
_NameTemplate = {LABEL}
_ExpandedFrom = --sol-jones
_OtherTemplates = _Help:label
label = {LABEL}
solvable = 1
type = complex-2x2
load-from = 
xfer-from = 
save-to = {out[name]}-{JONES}-field_{sel[field]}-ddid_{sel[ddid]}.parmdb
dd-term = False
fix-dirs = 
update-type = full
time-int = 1
freq-int = 1
max-prior-error = 0.1
max-post-error = 0.1
low-snr-warn = 75
high-gain-var-warn = 30
clip-low = 0.1
clip-high = 10.0
clip-after = 5
max-iter = 20
epsilon = 1e-06
delta-chi = 1e-06
conv-quorum = 0.99
ref-ant = None
prop-flags = default
estimate-pzd = False
diag-only = 0
offdiag-only = False
robust-cov = compute
robust-scale = 1
robust-npol = 2
robust-int = 1
robust-save-weights = 0

[g]
_Help = Options for G-Jones term
label = G
solvable = 1
type = complex-2x2
load-from = Input_Avg_Data.parmdb
xfer-from = 
save-to = {out[name]}-{JONES}-field_{sel[field]}-ddid_{sel[ddid]}.parmdb
dd-term = 0
fix-dirs = 
update-type = phase-diag
time-int = 4
freq-int = 256
max-prior-error = 0.1
max-post-error = 0.1
low-snr-warn = 75
high-gain-var-warn = 30
clip-low = 0.1
clip-high = 10
clip-after = 5
max-iter = 20
epsilon = 1e-06
delta-chi = 1e-06
conv-quorum = 0.99
ref-ant = None
prop-flags = default
estimate-pzd = False
diag-only = 0
offdiag-only = False
robust-cov = compute
robust-scale = 1
robust-npol = 2
robust-int = 1
robust-save-weights = 0
_Templated = True

[de]
_Templated = 1
dd-term = 1
clip-low = 0.0
clip-high = 0
delta-chi = 1e-05
max-prior-error = 0.44
max-post-error = 0.44
JSKenyon commented 4 years ago

@o-smirnov This seems familiar - is it one of the things you fixed in your branch?

o-smirnov commented 4 years ago

Very possible. It rings quite a few bells. @PeterKamphuis, would it be easy for you to repeat the test with the issue-326-chisq-jsk branch?

PeterKamphuis commented 4 years ago

@o-smirnov Depends how easily I can install the branch but I think it should not be a problem as I installed from the GitHub distribution already. I'll let you know tomorrow.

o-smirnov commented 4 years ago

If you did pip install -e, then you only need to git checkout the branch.

Cheers, Oleg



PeterKamphuis commented 4 years ago

@o-smirnov Is that branch supposed to be python3 compatible? I got an error:

Initially, trying to apply:

INFO 11:37:10 - main [0.1/0.1 0.6/0.6 0.2Gb] Exiting with exception: TypeError(list indices must be integers or slices, not str)

Which I figured might be a mismatch between the calibration database and the version, or some such. Trying to remake the calibration resulted in this error:

INFO 11:40:27 - main [io] [0.2/0.3 0.9/0.9 0.3Gb] I/O handler for load 0 save None failed with exception: '>' not supported between instances of 'method' and 'int'

Both of which seem exactly the kind of thing python3 is more asinine about?

PeterKamphuis commented 4 years ago

Using Python 3.6.9 BTW

o-smirnov commented 4 years ago

It's supposed to be py3 compatible, but I suppose I could always have f*cked it up...

Can you rerun it serially for me please (--dist-ncpu 1)? It ought to give a more informative stack trace then.

PeterKamphuis commented 4 years ago

Applying is the same, but I'll post the full traceback:

INFO      12:00:15 - main               [0.1/0.1 0.6/0.6 0.2Gb] Exiting with exception: TypeError(list indices must be integers or slices, not str)
 Traceback (most recent call last):
  File "CubiCal/cubical/main.py", line 480, in main
    global_options=GD, jones_options=jones_opts)
  File "CubiCal/cubical/machines/abstract_machine.py", line 645, in create_factory
    return machine_cls.Factory(machine_cls, *args, **kw)
  File "CubiCal/cubical/machines/abstract_machine.py", line 770, in __init__
    self.init_solutions()
  File "CubiCal/cubical/machines/abstract_machine.py", line 790, in init_solutions
    self.machine_class.exportable_solutions())
  File "/home/peter/GitHub/CubiCal/cubical/machines/abstract_machine.py", line 838, in _init_solutions
    self._init_sols[label] = param_db.load(filename), prefix, interpolate
  File "CubiCal/cubical/param_db.py", line 50, in load
    db._load(filename)
  File "CubiCal/cubical/database/pickled_db.py", line 267, in _load
    parm._paste_slice(item)
  File "CubiCal/cubical/database/parameter.py", line 293, in _paste_slice
    grid_index = self.grid_index[axis]
TypeError: list indices must be integers or slices, not str

Creating, I get the same error as well:

INFO      12:03:07 - main               [0.3/0.3 0.9/0.9 0.3Gb] Exiting with exception: TypeError('>' not supported between instances of 'method' and 'int')
 Traceback (most recent call last):
  File "CubiCal/cubical/main.py", line 546, in main
    stats_dict = workers.run_process_loop(ms, tile_list, load_model, single_chunk, solver_type, solver_opts, debug_opts, out_opts)
  File "CubiCal/cubical/workers.py", line 216, in run_process_loop
    return _run_single_process_loop(ms, load_model, single_chunk, solver_type, solver_opts, debug_opts, out_opts)
  File "CubiCal/cubical/workers.py", line 347, in _run_single_process_loop
    tile.load(load_model=load_model)
  File "CubiCal/cubical/data_handler/ms_tile.py", line 928, in load
    angles = self.dh.parallactic_machine.rotation_angles(subset.time_col)
  File "CubiCal/cubical/machines/parallactic_machine.py", line 108, in rotation_angles
    if log.verbosity > 1:
TypeError: '>' not supported between instances of 'method' and 'int'

I checked that it is only using a single cpu now. Sorry for not posting the full traceback right away.
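
For the record, both tracebacks are vintage Python 3 strictness: the first indexes a list with a string (an axis name rather than a position), and the second compares a bound method to an int because the `()` call was forgotten. A minimal reproduction of both, with illustrative names (not CubiCal's actual classes):

```python
# 1. Indexing a list with a string always raises in Python 3; a dict
#    keyed by axis name is the usual remedy. Names are illustrative only.
grid_index = [0, 1, 2]                     # positional grid indices
try:
    grid_index["freq"]                     # axis name used as a list index
except TypeError as e:
    print(e)                               # list indices must be integers or slices, not str

axis = {name: i for i, name in enumerate(["time", "freq", "ant"])}
print(grid_index[axis["freq"]])            # -> 1

# 2. Comparing a method to an int: Python 2 allowed ordering of
#    arbitrary types; Python 3 raises a TypeError.
class Log:                                 # illustrative stand-in for the logger
    def verbosity(self):
        return 2

log = Log()
try:
    log.verbosity > 1                      # missing (): compares method vs int
except TypeError as e:
    print(e)                               # '>' not supported between instances of 'method' and 'int'

print(log.verbosity() > 1)                 # calling the method fixes it -> True
```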

PeterKamphuis commented 4 years ago

Ok, after changing log.verbosity to log.verbosity() here https://github.com/ratt-ru/CubiCal/blob/dfc1f393c05cf06d5c13e090b45cf096ffa84e26/cubical/machines/parallactic_machine.py#L108 the branch runs to create a calibration db which then can be applied. However, the chi2 stats are suddenly all around 2 whereas they are around 1 with the master branch.

In any case applying the new calibration in this branch results in exactly the same 0 padding in the corrected visibilities.

PeterKamphuis commented 4 years ago

Running everything on a single cpu now, just to make sure there is no issue there.

JSKenyon commented 4 years ago

@o-smirnov Any progress on this? I am not as well versed in the interpolation code. With the exception of the heavily zeroed case (I think this is likely solutions which were bad in the first time interval being applied to all time intervals), it looks like we are failing to raise flags for corrected data where we had failed solutions. It is debatable what should be placed in the corrected data if calibration has failed. @SpheMakh advocates simply writing out the uncorrected data. Currently I believe we take the simpler approach of just multiplying in the zero gains, producing these zeros in the output. I am not against this, but it does seem like we are failing to flag the data appropriately.

PeterKamphuis commented 4 years ago

@JSKenyon Correct me if I am wrong but when using xfer-from the bad solutions should be inferred/interpolated from the correct ones, right? Even at the start or end of the observations. So after interpolation there should be no zero-gains left I thought?

Additionally, as the frequency solutions are only split in two (64 channels in the averaged case, 256 in the apply), I am very surprised to see the padding behaviour occur when interpolating, as it does not follow the calibration interval. I don't see why a bad solution at the start would affect half of the calibration interval but not the other half. My first guess would be that it has more to do with the interpolation and the flagged edges.

I think when there are no solutions the data should just be flagged also when just applying the table. The logic being that if you had used Cubical to calibrate and not just apply such solutions would also be flagged.
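
The trade-off being debated can be sketched in numpy: multiplying zeroed (failed) gains into the data produces unflagged zeros in the output, whereas the alternative is to raise output flags wherever the gain is invalid. This is a toy illustration, not CubiCal's actual apply code:

```python
import numpy as np

vis = np.ones(6, dtype=complex)                       # toy visibilities
gains = np.array([1, 1, 0, 0, 1, 1], dtype=complex)   # zeroed = no solution

# Simpler behaviour: zero gains multiply straight into the output,
# leaving unflagged zeros in CORRECTED_DATA.
corrected = gains * vis
print(corrected)                                      # zeros at slots 2 and 3

# Alternative: flag the invalid slots, and (per the suggestion above)
# pass the uncorrected data through there instead.
out_flags = gains == 0
corrected[out_flags] = vis[out_flags]
print(out_flags)
```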

bennahugo commented 4 years ago

This may (and I stress may) have been the casacore reading bug which read 0's and nans from the data and model columns. Please try master and see if the problem is solved or not

bennahugo commented 4 years ago

(alternatively try stimela master)

PeterKamphuis commented 4 years ago

I will check tomorrow.

TariqBlecher commented 4 years ago

I am getting the same problem when I apply the solutions to the same dataset that I solved for.

Pre-Cubical data column: [image]
Post-Cubical data column: [image]

TariqBlecher commented 4 years ago

I suspect cubical is not writing flags properly. I am using the master branch. My parset is:

[data]
ms = 1july-v2.ms
column = DATA
time-chunk = 1000
freq-chunk = 262
chunk-by = SCAN_NUMBER
chunk-by-jump = 0

[model]
list = continuum_tagged.lsm.html@dE

[montblanc]
dtype = double
feed-type = circular
mem-budget = 4096

[sol]
jones = G,dE
term-iters = 30,20

[out]
column = CORRECTED_DATA
overwrite = True
mode = sr
subtract-dirs = :

[g]
time-int = 40
update-type = phase-diag

[de]
time-int = 1000
freq-int = 262
dd-term = True
update-type = diag

[dist]
ncpu = 6

PeterKamphuis commented 4 years ago

I do not see how it could be, but maybe we should check that it is not a GMRT-only data set issue, as I am also using GMRT data.

TariqBlecher commented 4 years ago

Yes, I agree that this could be the case. One thing I did notice was that cubical seemed to be seeing 32 stations, e.g.

D0T1F0 Stations 30, 31 (2/32) fully flagged due to low SNR.

However, there are only 30 stations in the array.

PeterKamphuis commented 4 years ago

@TariqBlecher Yes, GMRT data has two ghost telescopes. It is the same in casa. If I remember correctly they are for testing correlator noise and such. In any case I only use the inner core (14 antennas) for my test set and the same thing happens.

ratt-priv-ci commented 4 years ago

I want to ask: are the ANTENNA1 and ANTENNA2 columns correctly labelled? I know HERA does (or used to) not adhere to the Standard v2.0 specification (CASA Memo 229) and used to store antenna numbers in that column instead of foreign keys into the ::ANTENNA table. That we cannot and will not support - it is an observatory mistake. Also note that when you merge datasets with CASA, funny things happen. You may have more antennas if the ECEF positions were updated, or have antennas that were not common to both datasets (e.g. 16 + 14 will show 16 in the concat dataset if the positions are the same).


PeterKamphuis commented 4 years ago

I'm not sure what you mean? That memo says:

ANTENNA1 Int     First antenna
ANTENNA2 Int     Second antenna

And my dataset has that in those columns. But you are saying that they should not be used for the antenna numbers but as foreign keys into the antenna table?

bennahugo commented 4 years ago

They are index keys (integers), so e.g. index 0 corresponds to row 0 of the ::ANTENNA table, from which you can read the station name.

Specification 229 (https://casa.nrao.edu/Memos/229.html):

ANTENNAn
    Antenna number (≥ 0), and a direct index into the ANTENNA sub-table rownr. For n > 2, triple-product data are implied.
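
In other words, the integers in ANTENNA1/ANTENNA2 are row numbers into the ::ANTENNA subtable. A toy illustration of resolving station names via that indexing (with python-casacore one would read the real NAME, ANTENNA1, and ANTENNA2 columns; the arrays below are made up):

```python
# Toy stand-ins for MS columns; in practice read these from the MS.
antenna_names = ["m000", "m001", "m002", "m003"]   # ::ANTENNA NAME column
antenna1 = [0, 0, 1, 2]                            # main-table ANTENNA1
antenna2 = [1, 2, 2, 3]                            # main-table ANTENNA2

# Resolve each baseline's station names by direct row indexing:
for a1, a2 in zip(antenna1, antenna2):
    print(antenna_names[a1], "-", antenna_names[a2])
```
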
bennahugo commented 4 years ago

But if you confirm these are just ghost positions in the ANTENNA subtable then it explains your error - there is no data and therefore no SNR to solve.

bennahugo commented 4 years ago

NoDataNoProblem :)

TariqBlecher commented 4 years ago

So I checked, and there are ghost antennas in the antenna table (i.e. there are antennae in the antenna table which have no corresponding data rows). Should these antennas be deleted from the antenna table? Or is Cubical assigning data rows to the wrong antennas?
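
Whether a subtable row is a "ghost" can be checked directly: it is a ghost if its index never appears in ANTENNA1 or ANTENNA2. A toy numpy check (with python-casacore one would read the real columns; the arrays here are made up):

```python
import numpy as np

n_antenna_rows = 32                       # rows in the ::ANTENNA subtable
ant1 = np.array([0, 0, 1, 2])             # toy main-table ANTENNA1 column
ant2 = np.array([1, 2, 2, 3])             # toy main-table ANTENNA2 column

referenced = np.union1d(ant1, ant2)       # antenna indices that own data rows
ghosts = np.setdiff1d(np.arange(n_antenna_rows), referenced)
print(ghosts)                             # subtable rows with no data at all
```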

bennahugo commented 4 years ago

It shouldn't (antenna indices 30 and 31 are flagged), and these correspond to your 2 ghost antennas if I understand you correctly? Nothing to be worried about then - the warning is a red herring and the bug lies elsewhere. It is very likely the interpolation code.

PeterKamphuis commented 4 years ago

In any case, I have split out those antennas. And yes, ANTENNA1 and ANTENNA2 are integer columns in which the numbers are direct indices to the corresponding antenna rows in the antenna table. That is how it should be, if I understand correctly.

PeterKamphuis commented 4 years ago

Ok, I retested this with the current master and the issue remains. Updating to the current master rolled back the casacore version (python-casacore 3.2.0 --> python-casacore 3.0.0).

The issue remains exactly as indicated before, with the difference that xfer-from with interpolation and without interpolation (chunks corresponding to the solution) now give the same result, which is as indicated in the original post for xfer-from without interpolation. I.e., the large padding seen around the visibilities for xfer-from with interpolation is no longer present.

bennahugo commented 4 years ago

Thanks for checking Peter - I was hoping it was related to the 32-bit iterator indexer issue in casacore 3.0, for which the workaround in the ms_tile reader fixed serious convergence issues on large tiles. To clarify: it should be rolled back to 3.0 - even though it is buggy, it is the only version that natively compiles against KERN-3 on Ubuntu 16.04 LTS.

PeterKamphuis commented 4 years ago

Well, it does seem that something in the interpolation got fixed in the current master. It now looks like just the blocks that do not have a solution are set to zero, instead of being flagged as they would be when actually calibrating. I will see if I can confirm that. I'll also try to transfer the low resolution model to the high resolution and see if the same blocks are solved and if they are all flagged properly in that case; my impression is yes, because I have really only seen this when transferring the solutions.

It must be a bit more complicated than that though as there are some differences between using load-from and xfer-from.

PeterKamphuis commented 4 years ago

Ok so this is not a GMRT issue. I ran the carate.sh suite for caracal with minimal docker config on the rawdata.tar. After successfully doing so I took a single data set (1524929477-circinus_p1_corr.ms) and reapplied the first phase only calibration with xfer-from and interpolating to 1 and 1 solutions. I did this in a freshly installed standalone cubical-venv with the current master.

The result is that practically every baseline has 0 bands at the start and end in the corrected visibilities. I show an example below (this is baseline 0-22). The black stripes at the start and the end are unflagged 0's.

[image: baseline 0-22]

o-smirnov commented 4 years ago

OK, my naive attempts to reproduce this on my own MeerKAT MSs are failing. Clearly, I'm not driving it into the failing edge case.

@PeterKamphuis, could you please point me to a copy of the MS (GMRT, or a post-pipeline version of the circinus one) that I can reproduce this on? Preferably a simple recipe (MS + solutions DB + parset), so that all I need to do is run gocubical to reproduce the failure.

Otherwise, maybe if you (or @gigjozsa) could give me a step-by-step for running carate.sh and reproducing this problem, that'd also be good.

PeterKamphuis commented 4 years ago

@o-smirnov I just emailed you the location of a tarball; you should be able to unpack it and then in the directory simply run gocubical Apply_xfer_from.parset. The python in my virtualenv is 3.6.9 and I am running from a bash shell.

o-smirnov commented 4 years ago

OK, there are two problems here, and at least one is fixed.

First problem is zeroes in the output. It turns out that missing solutions (i.e. slots that the parameter table machinery was unable to interpolate) were flagged, but those flags weren't propagated into the output properly due to a logic error. This should now be fixed, at least for the case mentioned by @PeterKamphuis in https://github.com/ratt-ru/CubiCal/issues/336#issuecomment-601086600. Please check with branch issue-336.

@TariqBlecher, I hope this also fixes what you report in https://github.com/ratt-ru/CubiCal/issues/336#issuecomment-597575677. I hope you were using --load-from, in which case the failure makes sense, as the same logic error above would have caused flagged solutions (flagged for whatever reason, low-SNR, whatnot) to produce a 0 visibility rather than a flag in the output. Please check. And if you tell me you were using --xfer-from, I'm going to cry, because it's supposed to interpolate over small missing blocks like that.

Second problem is, why was it unable to interpolate? I'll open a separate issue for this to keep things neat.
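
A plausible mechanism for the edge bands, purely as an illustration (this is not CubiCal's interpolator): linear interpolation can fill gaps inside the solution grid, but points outside the grid need extrapolation, and an interpolator that refuses to extrapolate returns empty (flagged) values there:

```python
import numpy as np

def interp_inside_only(x, xp, fp):
    """Linear interpolation that refuses to extrapolate: points outside
    [xp[0], xp[-1]] come back as NaN, as a missing-solution marker."""
    out = np.interp(x, xp, fp)
    out[(x < xp[0]) | (x > xp[-1])] = np.nan
    return out

sol_freqs = np.array([2.0, 3.0, 4.0, 5.0])   # channels where solutions exist
sol_gains = np.array([1.0, 1.1, 1.2, 1.3])

target = np.arange(8.0)                       # wider target channel grid
gains = interp_inside_only(target, sol_freqs, sol_gains)
print(gains)      # NaN at channels 0-1 and 6-7: the edge bands
```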

TariqBlecher commented 4 years ago

I used neither --load-from nor --xfer-from. I was just solving and outputting subtracted residuals.

o-smirnov commented 4 years ago

OK, that does make me cry more.

I am getting the same problem when I apply the solutions to the same dataset for which I solved for.

I thought you meant apply mode here, which implies --load-from or --xfer-from.

Can you give an MS/parset illustrating the problem please?

PeterKamphuis commented 4 years ago

So I just tested this on the GMRT set as well, and there are no more zeros present with either --load-from, --xfer-from without interpolation, or --xfer-from with interpolation (see below) when using the issue-336 branch.

issue-336 load-from: [image: load-from-issue-336]

In the bottom left there is some small additional flagging in the highly flagged area. In xfer-from without interpolation this is nicely interpolated over. For the rest, xfer-from with no interpolation and load-from look the same.

When interpolating with xfer-from, the full bands around the visibilities remain. When creating the tables, the solutions are split in two in frequency (256 channels, 64 in the averaged dataset), and it seems that the outer half of the solution is not interpolated. I guess this should be treated in #357.

issue-336 branch xfer-from: [image: xfer-from-issue-336]

The new flags are somehow not stored properly though. As I understand it, a new cubical run with flagset: -cubical should remove all previously applied cubical flags. However, if I rerun the load-from after the xfer-from run with interpolation, the result is the following: [image: loadfrom_after_interpolate_issue-336]

I had to remake the calibration tables as both the master and issue-336 threw an error on the old tables. It is this remake that seems to have solved the problem of the gigantic blocks of zeros that were present in the initial test.

--load-from and --xfer-from without interpolation also do not show the bands of flagging at the start/end and sides, which also immediately explains why they are not present when directly applying the tables after solving.

@o-smirnov I hope that doesn't make you cry even more as it is a bit mixed message. Let me know if you would like the test set.

o-smirnov commented 4 years ago

The outer bands are quite clearly a failure to extrapolate, in either time or frequency. But this:

The new flags are somehow not stored properly though. As I understand it a new cubical run with flagset: -cubical should remove all previously applied cubical flags. However, if I rerun the load-from after the xfer-from run with interpolation the result is the following:

is not explained. So yes please, give me a "care package" with MS+DB+parset.