Open rcyndie opened 7 months ago
Please note I ended using a different MeerKAT ms (meerkat_1t.ms) than from the above. But I can also update the above correspondingly if needed.
The ms contains both clock and TEC effects. Their magnitudes may not be directly comparable but the following input phase plots give an idea of their severity. Please find attached the input clock and TEC plots (referred to as K and T hereafter respectively).
Figure 4a K phase plots
Figure 4b T phase plots
If we were to individually estimate the relevant peaks with the solvers delay_and_offset
and tec_and_offset
, then we obtain the following phase and power spectra:
Figure 5a K Phase plots generated using values from delay_and_offset
Figure 5b Corresponding power spectra
Figure 6a T Phase plots generated using values from tec_and_offset
Figure 6b Corresponding power spectra
Key: The upper and lower x axes represent the TEC and clock (delay) effects' values respectively. The K and T effects are distinguished using the black solid and the blue dotted lines respectively. The green "-." line shows the $\delta=0$ mark and should help directly compare the peak occurrences across different antennas. The arrows in the power spectra plots are of the same height as the peak of the power spectra and placed at the true value of the corresponding effect.
Comment A: From the power spectra plots, both solvers are more efficient in estimating said peaks when the effect itself is significantly pronounced.
Now, if we were to estimate the peaks using a mixed implementation of both solvers, or simply put, a tec_and_delay
solver, and first estimate the dominant peak and "solve'' (or only correct for now), and then estimate for the "then" dominant peak (or sub-dominant peak) and repeat the above, we obtain the following phase plots and power spectra:
Figure 7a First round of initial estimates >> Phase plots generated using the ``gain'' term from QC
Figure 7b Corresponding power spectra plots
Figure 8a Second round of initial estimates using corrected data from 7a >> Phase plots generated using the ``gain'' term from QC
Figure 8b Corresponding power spectra plots
Comment B: Confusing plots in Figures 7a and 7b. When the dominant peak is associated with the clock, the corresponding phase plot is around or at zero. Although I understand the phase plot is a mixed existence of both clock and TEC solved effects, isn't the phase plot supposed to level the playing field, ensuring that both effects result in "comparable" phase values?
Comment C: Even though we have only corrected but not solved for the effects, I was expecting an opposite dominant peak in Figure 8b to Figure 7b; but all the peaks are associated with the clock. Not sure, what to deduce from a negligible phase plot (Figure 8a) but still noticeably sharp peaks in the power spectra?
If we now considered solving (say for 1 round of 10 iterations), and repeating Figures 7a-8b, we can obtain the following: Figure 9a First round of initial estimates >> Phase plots generated using the ``gain'' term from QC
Figure 9b Corresponding power spectra
Figure 10a Second round of initial estimates >> Phase plots generated using the ``gain'' term from QC
Figure 10b Corresponding power spectra
Comment D: Figure 9a and Figure 10a significantly flagged << possibly because the ms has only one observation in time, not enough SNR? Maybe I should repeat the above with some nt observations.
For a better SNR per solution time interval, consider the following MeerKAT ms with 10 timestamps, 1 correlation and 2000 channels (meerkat_10t.ms), and the above plots are repeated with this ms. The following table summarises all the different figures.
Expt on | Figures | Description | delay_and_offset |
tec_and_offset |
tec_and_delay |
---|---|---|---|---|---|
meerkat_1t.ms | 4a | Phase for K True | |||
4b | Phase for T True | ||||
5a | Phase for estimated K | ✔ | |||
5b | Power spectra for estimated K | ✔ | |||
6a | Phase for estimated T | ✔ | |||
6b | Power spectra for estimated T | ✔ | |||
7a | Phase for estimated mixed T | ✔ | |||
7b | Power spectra for estimated mixed T | ✔ | |||
8a | Phase for estimated and solved mixed T | ✔ | |||
8b | Power spectra for estimated and solved mixed T | ✔ | |||
meerkat_10t.ms | 9a | Phase for true K | |||
9b | Phase for true T | ||||
10a | Phase for estimated mixed T | ✔ | |||
10b | Power spectra for estimated mixed T | ✔ | |||
11a | Phase for estimated mixed T | ✔ | |||
11b | Power spectra for estimated mixed T | ✔ | |||
12a | Phase for estimated and solved mixed T | ✔ | |||
12b | Power spectra for estimated and solved mixed T | ✔ | |||
13a | Phase for estimated and solved mixed T | ✔ | |||
13b | Power spectra for estimated and solved mixed T | ✔ |
Figure 9a Phase plots for true K
Figure 9b Phase plots for true T
Figure 10a First round of initial estimates (only estimated)
Figure 10b Corresponding power spectra
Figure 11a Second round of initial estimates (only estimated)
Figure 11b Corresponding power spectra
Figure 12a First round of initial estimates (estimated and solved)
Figure 12b Corresponding power spectra
Figure 13a Second round of initial estimates (estimated and solved)
Figure 13b Corresponding power spectra
Note the solve is done over a single solution interval to ensure a sufficient SNR.
I am using the same ms as in this comment. The DATA is corrupted by both effects K and T. Consider the input effects induced phase plots. Figure 14a Input phase K
Figure 14b Input phase T
Figure 14c Input phase of mixed K and T
Figure 15a Sanity check >> estimated phase plot K by delay_and_offset
on DATA_K (data only corrupted by K)
Figure 15b Power spectra
Figure 15c Estimated values vs true
Figure 16a Sanity check >> estimated phase plot T by tec_and_offset
on DATA_T (data only corrupted by T)
Figure 16b Power spectra
Figure 16c Estimated values vs true
Consider two rounds of estimation via a single (but mixed) delay_and_tec
term, where the dominant peak is estimated for first, and subsequently corrected and try a second round of initial estimates equivalent to the sub-dominant peak.
(delay plots in black, tec plots in blue)
Figure 17a First round of estimates >> dominant peak estimated
Figure 17b Power spectra plots containing the true values of K and T indicated by the arrows, with the lengths matching the power spectra peaks
Figure 17c Power spectra plots (antwise)
Figure 18a Second round of estimates >> sub-dominant peak estimated (problematic)
Figure 18b
Figure 19a Second round of estimates >> sub-dominant peak estimated (maybe less problematic)
Figure 19b
Comment: The previous selection criteria for the dominant peak only accounted for the heights of the peaks, and thus, the dominant peak of the first round remained as the dominant peak in the second round, though at or nearing zero. It might be just a quick fix; the selection criteria also accounts for the magnitude of the peak only in the event the value nears zero with an absolute tolerance of 1e-10. However, the issue is still not fully resolved as the parameters are not necessarily being translated to the gains in the correct way, for example, compare Fig.18a to 19a.
Well it looks like you may be within one phase wrap which may be good enough. I'm a little surprised by figure 18b though. Why would the power spectra all peak at zero? Did you manage to use the nufft for the delay as well?
Well it looks like you may be within one phase wrap which may be good enough. I'm a little surprised by figure 18b though. Why would the power spectra all peak at zero? Did you manage to use the nufft for the delay as well?
Yes, there does not seem to be any significant differences between using the nufft or the traditional fft for the delay estimates. Figure 20a Compare with 19a
I am not sure about Fig. 18b; I guess there is a chance that the selection criteria for the peaks requires further work. In the event Fig. 18b is correct, does it mean the non-zero offset from the mixed term (K and T and offset) translates into the phase plots Fig. 18a?
Ah ok, they are very similar. I though it may help to suppress TEC effects if they somehow got aliased into the delay spectra but maybe its not that surprising that it doesn't do much. We need to find out why the peaks in figure 19b are so close to zero. By how much are you oversampling? Maybe it's just your resolution in delay space that is too coarse. Looking at the residual phases, I only see < 1 phase wraps so they might be good enough already if we have a joint tec and delay solver
Fig. 18b is correct, does it mean the non-zero offset from the mixed term (K and T and offset) translates into the phase plots Fig. 18a?
The offset should just add a constant phase so I don't think so. Residual TEC or delay is not unexpected since the initial estimates are not perfect
Ah ok, they are very similar. I though it may help to suppress TEC effects if they somehow got aliased into the delay spectra but maybe its not that surprising that it doesn't do much. We need to find out why the peaks in figure 19b are so close to zero. By how much are you oversampling? Maybe it's just your resolution in delay space that is too coarse. Looking at the residual phases, I only see < 1 phase wraps so they might be good enough already if we have a joint tec and delay solver
Using the Nyquist frequency, the number of bins for the reconstruction should be 25120, and I exaggerated and used 1050000 for the delay. And the default value seemed to be 34000 from the delay_and_offset
solver. I think I just need to work on the selection criteria for the peaks.
[Branch: v0.2.4-dev]
Below are the results from running the combined gain type delay_and_tec
on the ms, where the
Figure 21a Sanity check << only finding initial estimates on DATA_K (Left: True phases of K (same as in Figure 14a); Right: Recovered phases)
Figure 21b Power spectra Ideally, all of the dominant peaks must be associated with K.
Figure 21c Solving for delay_and_tec
on DATA_K
(Left: True phases of K (same as in Figure 14a);
Right: Recovered phases)
Figure 22a Sanity check << finding initial estimates on DATA_T (Left: True phases of T (same as in Figure 14b); Right: Recovered phases)
Figure 22b Power spectra
Figure 22c Solving for delay_and_tec
on DATA_T
(Left: True phases of T (same as in Figure 14b);
Right: Recovered phases)
Figure 23a Finding initial estimates on DATA (Left: True mixed phases of K and T (same as in Figure 14c); Right: Recovered phases)
Figure 23b Power spectra
Figure 23c Solving for delay_and_tec
on DATA with 130 iterations
(Left: True mixed phases of K and T (same as in Figure 14c);
Right: Recovered phases with 130 iterations)
Since the gains in Figure 23c are severely flagged, the solving is attempted with a larger number of iterations. Figure 23d Compare with 23c (Left: True mixed phases of K and T (same as in Figure 14c); Right: Recovered phases with 1200 iterations)
General comment: Since the initial estimates are enabled for both clock and tec regardless of whichever is dominant, then for example, even when delay is correctly estimated across the antennas, the recovered phases are wrong and seem more inclined to be TEC phases (Figure 21a and 21b). My best guess right now is that since I am allowing both effects to be initialised (and the non-dominant one is not set to zero), even though TEC might be absent, its initialisation despite small (and close to zero), is still significantly larger compared to delay. Therefore, the solutions in Figure 21c are not reliable.
[Branch v0.2.4-dev1] Branching off from here, in this branch, there are two rounds of estimation, where after the first round, partially corrected data is formed using the then updated parameters (and gains). For each round, both initial estimates are enabled and after assigning the params based on the dominant peak, the params for the non-dominant term are set to zero.
Figure 24a Sanity check << only finding initial estimates on DATA_K (Left: True phases of K (same as in Figure 14a); Right: Recovered phases midway)
Figure 24b Sanity check << only finding initial estimates on DATA_K (compare with 21a) (Left: True phases of K (same as in Figure 14a); Right: Recovered phases after second round of estimation)
Figure 24c Power spectra (Left: Estimates from first round; Right: Estimates from second round)
Figure 25a Solving for delay_and_tec
on DATA_K
(As expected, the midway phases are the same as in Figure 24a)
(Left: True phases of K (same as in Figure 14a);
Right: Recovered phases after second round of estimation)
Comment: Figure 24c is unexpected since both power spectra give the same estimates. It is very possible I am not passing in the partially CORRECTED_DATA correctly (check out line and line).
[refer to code] This is also shown in the above comment. Below is an updated set of simulations with now two timestamps. (K << delay; T << TEC)
Figure 26 Product of true K and T
Figure 27a Recovered phases when running delay_and_tec
(roundwise estimations)
(LHS: Midway gains with first set of estimations; RHS: QC gains without solving)
Figure 27b Recovered phases when running delay_and_tec
(roundwise estimations)
(LHS: Midway gains with first set of estimations; RHS: QC gains with solving)
Figure 27c Power spectra plots (LHS: First round of estimates; RHS: Second round)
As seen previously, the power spectra from both rounds seem to be the same.
As Jonathan suggested, also consider the DATA plotted before and after the first set of initial estimates applied. Figure 28a (LHS: Before; RHS: Immediately after)
The RHS of Figure 28a looks less "rampy" (and a bit smoother behaviour in frequency) with the first round of initial estimates applied.
Also find below 3D projections for the before and after scenarios. Figure 28b 3D projections of Figure 28a - Antenna 7
Figure 28c 3D projections of Figure 28a - Antenna 10
While the z axis of the projection plots shows the phase difference (unwrapped) as the height of the surface plots, the colourbar represents the same but through the colourmap with a normalised range.
OK, I am very confused. I cannot tell the difference between the LHS and RHS in 27a. That is a bit surprising - it means that the second round of estimation does nothing.
Figure 27c still appears unchanged between the LHS and RHS. That should not be possible.
Figure 28a makes no sense to me - how to you have the data as a per-antenna quantity?
I wouldn't know where to begin interpreting the 3D plots.
For the data plots, I was thinking more along the lines of plt.imshow(np.angle(data))
before and after the application of the initial correction.
Edit: plt.imshow(np.angle(data), cmap="gist_rainbow", interpolation="none", aspect="auto")
I don't know the exact cmap you used - you can use the same one as above.
Yes, I am not sure how the DATA was loaded within the script, but when I saved it, the dimensions were (ntime*nant, nchan, ncorr), and then I tried splitting the array per antenna to have a better view.
plt.imshow(np.angle(data), cmap="gist_rainbow", interpolation="none", aspect="auto")
(works the same as "hsv" colourmap)
Figure 28d
Yes, the projection plots were an alternative for me to judge the phases, as Figure 28a was not entirely conclusive.
At the very least this is a good matplotlib workout ;-)
I think you should show phase difference plots w.r.t. the true input phases. I can't see any difference in the current plots.
There must be a bug in how you are plotting the power spectra after the first round of initial estimates. The only way they can remain unchanged is if the correction step id doing nothing. Let's have a look at the code in the meeting tomorrow
Yes, just pointed Jonathan to this set of simulations. Will try to update the difference plots soon.
As Jonathan pointed out, I was passing in the param_freq_map
instead of the gain_freq_map
in the compute_corrected_residual() function when forming partially corrected data, and since the two have different frequency resolutions, the power spectra turned out wrong in the second round. Check out the correct power spectra plots as well as the difference in phase plots below.
Before moving on to a DATA corrupted with both delay and TEC, consider one of the sanity checks with running a delay_and_tec
on DATA_K (corrupted with only delay).
Figure | Data | gain_type |
---|---|---|
Figure 29 | DATA_K | delay_and_tec |
Figure 30 | DATA_T | delay_and_tec |
Figure 31 | DATA | delay_and_tec |
Recall Figure 26a is the product of true K and T. Figure 26b True K
Figure 26c True T
Figure 29a Difference between true and recovered phases when running delay_and_tec
on DATA_K(roundwise estimations)
(LHS: Recovered midway gains with first set of estimations; RHS: QC gains with solving)
Figure 29b Power spectra for the first timestamp (LHS: First round of initial estimates; RHS: Second round)
Since the DATA_K is only corrupted by delay, the power spectra on the first round are mostly (if not completely) accurate, and as expected, the peaks in the second round of estimation corresponding to a zero lag.
Figure 30a Difference between true and recovered phases when running delay_and_tec
on DATA_T(roundwise estimations)
(LHS: Recovered midway gains with first set of estimations; RHS: QC gains with solving)
Figure 30b Power spectra for the first timestamp (LHS: First round of initial estimates; RHS: Second round)
Figure 31a Difference between true and recovered phases when running delay_and_tec
on DATA (roundwise estimations)
(LHS: Midway gains with first set of estimations; RHS: QC gains with solving)
Figure 31b Power spectra plots (LHS: First round of initial estimates; RHS: Second round)
All the colourbars are set to $[-\pi, \pi]$ in this comment, but I can also provide the zoomed in phases for the difference plots if needed.
We are making use of a MeerKAT ms with one timestamp, 1 correlation and 2000 channels. The idea is to start with the ms containing unity model visibilities, corrupt it with the desired effects accordingly and solve using the
delay_and_offset
andtec_and_offset
solvers (hereafter referred to as solverK and solverT respectively). The simulations are summarised as follows:Note: We are currently not doing any solver and focussing on the initial estimates and we are using the same K from Expt I and same T from Expt II in Expt III. And here some results from the experiments.
Figure 0a Phase plots from input K
Figure 0b Phase plots from input T
Figure 1a Phase plots from Expt I: solverK
Figure 1b Phase plots from Expt I: solverT
Figure 1c Power spectrum plots from Expt I: solverK
Figure 1d Power spectrum plots from Expt I: solverT
Figure 1e Comparing the estimated initial estimates against the true values: K
Figure 2a Power spectrum plots from Expt II: solverK
Figure 2b Power spectrum plots from Expt II: solverT
Figure 2c Comparing the estimated initial estimates against the true values: T
Figure 3a Power spectrum plots from Expt III: solverK
Figure 3b Power spectrum plots from Expt III: solverT
Figure 3c Comparing the estimated initial estimates against the true values: K and T
As far as expts I and II are concerned, the input clock and TEC errors can be recovered with the solverK and solverT respectively. However, when both effects are present in Expt III, it is difficult to determine the nature of the (two) effects present in the data based on Figure 3c alone. To try and resolve these peaks, we can try approaching thw problem using the following method:
delay_and_offset
andtec_and_offset
solvers.