Closed isullivan closed 3 years ago
I confirm that the bug I mentioned in #243 does not happen on this branch.
Here is the difference power spectrum for master
at a2dec7
and the PR at 4c5f58
. The sense of subtraction is master
minus cal_fix_and_speedup
.
Does this include the "Revert to orig sign convention in exp and match" commit from 3 days ago?
Yes.
Okay, cool. So I assume you ran with some form of pre-Wenyang calibration, in which case the cable reflection was better accounted for in this PR. Would it be possible for you to run with auto-ratio
calibration for both as well (just to remove the contribution of the cable reflection)?
Also, can you post cal structures and/or plots on the differences of the convergence levels and overall "raw" gains (i.e. per-freq gains directly from the subroutine prior to fitting). Also, what was the time/mem results for each test?
Yes, pre-Wenyang cal. Will get to work on the other plots/metrics.
The timing/memory for 10 cores of an r.8xlarge instance on AWS is as follows
master: 63G RAM, 223 min PR: 63G RAM, 180 min
I think there is an underflow issue cropping up with the cal convergences which is making it seem like 3/4 of them are identically 0. It does not match the flags read in by a UVCal object, and the cal plots for the obs look too sane for on average 3/4 of the band per antenna being flagged. Will post plots representing this thinking on request.
I've tested this is in both simulation and data. They both tell a very similar story, so I'll show just simulation for simplicity.
Simulation specifics: Input visibilities have ~50,000 sources and a smooth beam, and the model has far less sources and the fft of the instrumental beam. The simulation is allowed to fit a calibration amp and phase per frequency (the classic overfitting scenario).
There are three types of tests here:
In order to test the original least squares fitting (1), I've just reverted the subroutine. Thus, I really am only comparing the fitting technique. The updated branch with original linear least squares fitting (2) is not quite the same as (1) because there is a smarter calculation of the threshold criteria.
First, a test on how many iterations till the threshold criteria is reached for all tiles and for one frequency channel. Black lines are each tile's convergence in that iteration. Green is the mean over all tiles. The first 4 iterations fit for just the phase, hence the discontinuity.
Second, a test on what the converged result was for both amp and phase. The changes are too small to see by eye, so I am comparing the differences. Black is each tile's difference, and green is the standard deviation over all tiles.
Third, the power spectra difference.
One final note: The convergence vs iteration test I showed above is just for one frequency. In fact, not all frequencies converge. The number of converged frequencies is different between each test. Again, this is simulation with no flagged frequencies.
Therefore, it seems as though the new convergence threshold calculation in (2) is a win, and that the adaptive gain fitting technique (3) is promising but may need tweaking. I will look into the failed frequencies to see if I can't pinpoint why.
This is fascinating! Thanks for looking into it so thoroughly @nicholebarry!
This is a great analysis @nicholebarry! A few comments: The initial blip around iteration 5 is when we switch from only fitting the phase to the full gain calculation. I'm very happy to see how the new threshold is working, that convergence plot looks great! Since these are simulations, I assume the large differences in the final amplitude and phase plots for (3) around 185 and 195 MHz are unphysical, right? I am confused about the power spectrum improvement in this case. Are these the frequencies that failed to converge, maybe? I suspect the oscillations in the convergence plot are prematurely triggering the end of calibration, which suggests that the maximum gain is allowed to be too aggressive. I also suspect that these oscillations may be the reason (3) has fewer converging frequencies, since there may be a jump up that exits calibration just above the threshold used to indicate a converging or diverging solution.
Three issues:
The base gain is defaulted to 1 in the calibration init procedure. The adaptive gain procedure then checks to see if base gain is set, and if not, it is set to 0.5. This defaulting in the adaptive gain procedure will never happen due to the prior default in the init. Rerunning tests with the base gain set to 0.5 increases speed, but also increases the number of diverged frequency channels. Here is an example frequency convergence using a base gain of 0.5. And of an example of frequency convergence using a base gain of 0.75 (in other words, slower but steadier convergence). For this frequency, it actually converged faster than the base gain of 0.5, but if you look closely, it has a slightly flatter slope.
Convergence checking begins where IF i GT phase_fit_iter THEN BEGIN
where i
is the iteration and phase_fit_iter
is the number of iterations of fitting just the phase. I suggest changing this to IF i GT phase_fit_iter+divergence_history THEN BEGIN
where divergence_history
is number of prior iterations to compare to in order to see if the current iteration is diverging. Since there is an expected discontinuity in the convergence once amplitudes are included, some of the divergence calculations are being skewed. This solves approx 2 to 4 diverged frequency channels which failed converging at 1e-2 when it should have reached 1e-7.
Adding an additional logic condition reduces a lot of the "diverged channels". In reality, hitting a threshold of 1e-7 (default) is extremely hard, as showcased is the original convergence plot (first plot above) which never truly converged. We are converging (for the most part) with the update because there is a hard condition (max deviation) and a soft condition (mean deviation) and if we start to diverge, we check to see if the soft condition has been met and call it good. However, we don't check to see if the previous iteration, the one before the start of divergence, met the soft condition if the current iteration has not. I've added logic to check that and back up the iteration if it has.
Incorporating these changes, here are the number of converged channels and mean iterations till convergence for:
Given this, I'm pushing these changes in addition to defaulting the base gain to 0.75.
Combining commits from three branches. From @bhazelton: One of them addresses Mike’s issue above (edge_source_fix), one fixes a cable reflection fitting bug (cable_reflection_fix) and one implements Ian’s new calibration speedup (calibration_speedup).