rcyndie / QuartiCal

CubiCal, but with greater power.
MIT License
0 stars 0 forks source link

Attempting to separate clock and TEC errors #4

Open rcyndie opened 2 months ago

rcyndie commented 2 months ago

We are making use of a MeerKAT ms with one timestamp, 1 correlation and 2000 channels. The idea is to start with the ms containing unity model visibilities, corrupt it with the desired effects accordingly and solve using the delay_and_offset and tec_and_offset solvers (hereafter referred to as solverK and solverT respectively). The simulations are summarised as follows:

Expt Time variant Clock Errors (K) TEC Errors (T)
I
II
III

Note: We are currently not doing any solver and focussing on the initial estimates and we are using the same K from Expt I and same T from Expt II in Expt III. And here some results from the experiments.

Figure 0a Phase plots from input K image

Figure 0b Phase plots from input T image

Figure 1a Phase plots from Expt I: solverK image

Figure 1b Phase plots from Expt I: solverT image

Figure 1c Power spectrum plots from Expt I: solverK image

Figure 1d Power spectrum plots from Expt I: solverT image

Figure 1e Comparing the estimated initial estimates against the true values: K image

Figure 2a Power spectrum plots from Expt II: solverK image

Figure 2b Power spectrum plots from Expt II: solverT image

Figure 2c Comparing the estimated initial estimates against the true values: T image

Figure 3a Power spectrum plots from Expt III: solverK image

Figure 3b Power spectrum plots from Expt III: solverT image

Figure 3c Comparing the estimated initial estimates against the true values: K and T image

As far as expts I and II are concerned, the input clock and TEC errors can be recovered with the solverK and solverT respectively. However, when both effects are present in Expt III, it is difficult to determine the nature of the (two) effects present in the data based on Figure 3c alone. To try and resolve these peaks, we can try approaching thw problem using the following method:

  1. First estimate both peaks using the delay_and_offset and tec_and_offset solvers.
  2. Correct the data using solutions for the dominant peak (per antenna).
  3. Continue correcting the corrected data from 2 using solutions for the sub-dominant peak (per antenna).
rcyndie commented 2 weeks ago

Please note I ended using a different MeerKAT ms (meerkat_1t.ms) than from the above. But I can also update the above correspondingly if needed.

The ms contains both clock and TEC effects. Their magnitudes may not be directly comparable but the following input phase plots give an idea of their severity. Please find attached the input clock and TEC plots (referred to as K and T hereafter respectively).

Figure 4a K phase plots image

Figure 4b T phase plots image

If we were to individually estimate the relevant peaks with the solvers delay_and_offset and tec_and_offset, then we obtain the following phase and power spectra: Figure 5a K Phase plots generated using values from delay_and_offset image

Figure 5b Corresponding power spectra image

Figure 6a T Phase plots generated using values from tec_and_offset image

Figure 6b Corresponding power spectra image

Key: The upper and lower x axes represent the TEC and clock (delay) effects' values respectively. The K and T effects are distinguished using the black solid and the blue dotted lines respectively. The green "-." line shows the $\delta=0$ mark and should help directly compare the peak occurrences across different antennas. The arrows in the power spectra plots are of the same height as the peak of the power spectra and placed at the true value of the corresponding effect.

Comment A: From the power spectra plots, both solvers are more efficient in estimating said peaks when the effect itself is significantly pronounced.

Now, if we were to estimate the peaks using a mixed implementation of both solvers, or simply put, a tec_and_delay solver, and first estimate the dominant peak and "solve'' (or only correct for now), and then estimate for the "then" dominant peak (or sub-dominant peak) and repeat the above, we obtain the following phase plots and power spectra: Figure 7a First round of initial estimates >> Phase plots generated using the ``gain'' term from QC image

Figure 7b Corresponding power spectra plots image

Figure 8a Second round of initial estimates using corrected data from 7a >> Phase plots generated using the ``gain'' term from QC image

Figure 8b Corresponding power spectra plots image

Comment B: Confusing plots in Figures 7a and 7b. When the dominant peak is associated with the clock, the corresponding phase plot is around or at zero. Although I understand the phase plot is a mixed existence of both clock and TEC solved effects, isn't the phase plot supposed to level the playing field, ensuring that both effects result in "comparable" phase values?

Comment C: Even though we have only corrected but not solved for the effects, I was expecting an opposite dominant peak in Figure 8b to Figure 7b; but all the peaks are associated with the clock. Not sure, what to deduce from a negligible phase plot (Figure 8a) but still noticeably sharp peaks in the power spectra?

If we now considered solving (say for 1 round of 10 iterations), and repeating Figures 7a-8b, we can obtain the following: Figure 9a First round of initial estimates >> Phase plots generated using the ``gain'' term from QC image

Figure 9b Corresponding power spectra image

Figure 10a Second round of initial estimates >> Phase plots generated using the ``gain'' term from QC image

Figure 10b Corresponding power spectra image

Comment D: Figure 9a and Figure 10a significantly flagged << possibly because the ms has only one observation in time, not enough SNR? Maybe I should repeat the above with some nt observations.

rcyndie commented 1 week ago

For a better SNR per solution time interval, consider the following MeerKAT ms with 10 timestamps, 1 correlation and 2000 channels (meerkat_10t.ms), and the above plots are repeated with this ms. The following table summarises all the different figures.

Expt on Figures Description delay_and_offset tec_and_offset tec_and_delay
meerkat_1t.ms 4a   Phase for K True      
  4b Phase for T True      
  5a Phase for estimated K    
  5b Power spectra for estimated K    
  6a Phase for estimated T    
  6b Power spectra for estimated T    
  7a Phase for estimated mixed T    
  7b Power spectra for estimated mixed T    
  8a Phase for estimated and solved mixed T    
  8b Power spectra for estimated and solved mixed T    
meerkat_10t.ms 9a Phase for true K    
  9b Phase for true T    
  10a Phase for estimated mixed T    
  10b Power spectra for estimated mixed T    
  11a Phase for estimated mixed T    
  11b Power spectra for estimated mixed T    
  12a Phase for estimated and solved mixed T    
  12b Power spectra for estimated and solved mixed T    
  13a Phase for estimated and solved mixed T    
  13b Power spectra for estimated and solved mixed T    

Figure 9a Phase plots for true K image

Figure 9b Phase plots for true T image

Figure 10a First round of initial estimates (only estimated) image

Figure 10b Corresponding power spectra image

Figure 11a Second round of initial estimates (only estimated) image

Figure 11b Corresponding power spectra image

Figure 12a First round of initial estimates (estimated and solved) image

Figure 12b Corresponding power spectra image

Figure 13a Second round of initial estimates (estimated and solved) image

Figure 13b Corresponding power spectra image

Note the solve is done over a single solution interval to ensure a sufficient SNR.

rcyndie commented 4 days ago

I am using the same ms as in this comment. The DATA is corrupted by both effects K and T. Consider the input effects induced phase plots. Figure 14a Input phase K image

Figure 14b Input phase T image

Figure 14c Input phase of mixed K and T image

Figure 15a Sanity check >> estimated phase plot K by delay_and_offset on DATA_K (data only corrupted by K) image

Figure 15b Power spectra image

Figure 15c Estimated values vs true image

Figure 16a Sanity check >> estimated phase plot T by tec_and_offset on DATA_T (data only corrupted by T) image

Figure 16b Power spectra image

Figure 16c Estimated values vs true image

rcyndie commented 4 days ago

Consider two rounds of estimation via a single (but mixed) delay_and_tec term, where the dominant peak is estimated for first, and subsequently corrected and try a second round of initial estimates equivalent to the sub-dominant peak. (delay plots in black, tec plots in blue) Figure 17a First round of estimates >> dominant peak estimated image

Figure 17b Power spectra plots containing the true values of K and T indicated by the arrows, with the lengths matching the power spectra peaks image

Figure 17c Power spectra plots (antwise) image

Figure 18a Second round of estimates >> sub-dominant peak estimated (problematic) image

Figure 18b image

Figure 19a Second round of estimates >> sub-dominant peak estimated (maybe less problematic) image

Figure 19b image

Comment: The previous selection criteria for the dominant peak only accounted for the heights of the peaks, and thus, the dominant peak of the first round remained as the dominant peak in the second round, though at or nearing zero. It might be just a quick fix; the selection criteria also accounts for the magnitude of the peak only in the event the value nears zero with an absolute tolerance of 1e-10. However, the issue is still not fully resolved as the parameters are not necessarily being translated to the gains in the correct way, for example, compare Fig.18a to 19a.

landmanbester commented 4 days ago

Well it looks like you may be within one phase wrap which may be good enough. I'm a little surprised by figure 18b though. Why would the power spectra all peak at zero? Did you manage to use the nufft for the delay as well?

rcyndie commented 4 days ago

Well it looks like you may be within one phase wrap which may be good enough. I'm a little surprised by figure 18b though. Why would the power spectra all peak at zero? Did you manage to use the nufft for the delay as well?

Yes, there does not seem to be any significant differences between using the nufft or the traditional fft for the delay estimates. Figure 20a Compare with 19a image

I am not sure about Fig. 18b; I guess there is a chance that the selection criteria for the peaks requires further work. In the event Fig. 18b is correct, does it mean the non-zero offset from the mixed term (K and T and offset) translates into the phase plots Fig. 18a?

landmanbester commented 3 days ago

Ah ok, they are very similar. I though it may help to suppress TEC effects if they somehow got aliased into the delay spectra but maybe its not that surprising that it doesn't do much. We need to find out why the peaks in figure 19b are so close to zero. By how much are you oversampling? Maybe it's just your resolution in delay space that is too coarse. Looking at the residual phases, I only see < 1 phase wraps so they might be good enough already if we have a joint tec and delay solver

landmanbester commented 3 days ago

Fig. 18b is correct, does it mean the non-zero offset from the mixed term (K and T and offset) translates into the phase plots Fig. 18a?

The offset should just add a constant phase so I don't think so. Residual TEC or delay is not unexpected since the initial estimates are not perfect

rcyndie commented 3 days ago

Ah ok, they are very similar. I though it may help to suppress TEC effects if they somehow got aliased into the delay spectra but maybe its not that surprising that it doesn't do much. We need to find out why the peaks in figure 19b are so close to zero. By how much are you oversampling? Maybe it's just your resolution in delay space that is too coarse. Looking at the residual phases, I only see < 1 phase wraps so they might be good enough already if we have a joint tec and delay solver

Using the Nyquist frequency, the number of bins for the reconstruction should be 25120, and I exaggerated and used 1050000 for the delay. And the default value seemed to be 34000 from the delay_and_offset solver. I think I just need to work on the selection criteria for the peaks.