caracal-pipeline / caracal

Containerized Automated Radio Astronomy Calibration (CARACal) pipeline
GNU General Public License v2.0

interpolate, shminterpolate, intrelopate: interpol ate my phases #1305

Open o-smirnov opened 3 years ago

o-smirnov commented 3 years ago

For reasons lost in the mists of time, our current interp setting for KGB is nearest, nearest, linear, respectively. It took @landmanbester to discover this.

Since nobody could tell me the reason (apart from vague rumours of mirto intoxication), and linear (if not better) throughout seemed like a good idea, I changed it. This led to horrible results (i.e. an undeconvolvable target).

@bennahugo uses nearest, nearest, linear, so perhaps he has access to some ancient CASA wisdom we are not privy to (of course, he still swears by CASA 4.7. It's an unsurpassed vintage, like the 1948 Bordeaux). @IanHeywood uses nearest, linear, linear.

For a good masochistic laugh, I tried cubic throughout. Much to my surprise, this was better than linear (but not quite as good as nearest).

After a few experiments, I'm back full circle to the Hugo Tradition, as that seems to produce slightly superior results to the Heywood Compromise. I can only conclude that CASA KGB linear interpolation is borken. Help us @JSKenyon, you're our only hope.
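
To make the nearest-versus-linear distinction concrete, here is a minimal NumPy sketch of how the two modes transfer calibrator phase solutions onto target timestamps. It is a toy illustration, not CASA's implementation; the function name, scan times and the 20-degree drift are all assumptions chosen for the example.

```python
import numpy as np

def interpolate_phase(sol_times, sol_phases, data_times, mode="linear"):
    """Interpolate phase solutions (radians) onto data timestamps (toy model)."""
    sol_times = np.asarray(sol_times, dtype=float)
    data_times = np.asarray(data_times, dtype=float)
    # Work on unit complex gains so that phase wraps do not bias the result.
    gains = np.exp(1j * np.asarray(sol_phases, dtype=float))
    if mode == "nearest":
        # Index of the closest solution in time for every data timestamp.
        idx = np.abs(data_times[:, None] - sol_times[None, :]).argmin(axis=1)
        out = gains[idx]
    elif mode == "linear":
        # Interpolate the real and imaginary parts separately.
        out = (np.interp(data_times, sol_times, gains.real)
               + 1j * np.interp(data_times, sol_times, gains.imag))
    else:
        raise ValueError(f"unsupported mode: {mode}")
    return np.angle(out)

# Assumed example: two calibrator scans 10 minutes apart, 20 degrees of drift.
sol_t = [0.0, 600.0]
sol_ph = np.deg2rad([0.0, 20.0])
target_t = np.array([150.0, 290.0, 310.0, 450.0])

near = np.rad2deg(interpolate_phase(sol_t, sol_ph, target_t, "nearest"))
lin = np.rad2deg(interpolate_phase(sol_t, sol_ph, target_t, "linear"))
# "nearest" steps from 0 to 20 degrees at the midpoint (t = 300 s);
# "linear" rises smoothly between the two solutions.
```

Which of the two gives the better image depends on how the true gains behave between calibrator scans, which is exactly what the experiments above disagree on.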

KshitijT commented 3 years ago

(apart from vague rumours of mirto intoxication)

They are pretty firm rumours. :P

paoloserra commented 3 years ago

@francescaLoi and I tested KGB nearest,linear,linear (we solve for time-independent K and B, so that's fine). Before selfcal, the image is indeed slightly better than nearest,nearest,linear.

Let me post the actual images soon.

In the interest of future history books, I believe using nearest for G was our setting for applying crosscal to the calibrators themselves, which makes complete sense. I am fairly sure that the rule used to be linear for the target, and that this difference was erased when we switched to the read-only workflow and the new cross-cal logic. Those were major changes indeed, so it's possible that this went unnoticed.

o-smirnov commented 3 years ago

Before selfcal, the image is indeed slightly better than nearest,nearest,linear

And my field exhibits the opposite behaviour. Hence I think this needs to be made into a config option (even if the documentation will be a super-helpful "try both ways and see what works better" lol).

o-smirnov commented 3 years ago

I propose adding an optional section called transform: split_field: otfcal: interpolation, with entries for K, B and G. Defaults as discussed above.
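
A rough sketch of how such a section could be resolved into per-term settings, assuming the config is available as nested dictionaries. The helper name `resolve_interp` and the dictionary layout are hypothetical, not CARACal code; the defaults are the nearest, nearest, linear settings for K, G, B quoted at the top of the thread.

```python
# Hypothetical defaults: the current K, G, B settings quoted in this thread.
DEFAULT_INTERP = {"K": "nearest", "G": "nearest", "B": "linear"}

def resolve_interp(config, order=("K", "G", "B")):
    """Return one interpolation mode per calibration term, in apply order."""
    section = (config.get("transform", {})
                     .get("split_field", {})
                     .get("otfcal", {})
                     .get("interpolation", {}))
    # Reject entries for terms this sketch does not know about.
    unknown = set(section) - set(DEFAULT_INTERP)
    if unknown:
        raise KeyError(f"unknown calibration term(s): {sorted(unknown)}")
    # User entries override the defaults term by term.
    merged = {**DEFAULT_INTERP, **section}
    return [merged[term] for term in order]

# Assumed user config overriding only G.
user_config = {
    "transform": {"split_field": {"otfcal": {"interpolation": {"G": "linear"}}}}
}
interp_list = resolve_interp(user_config)    # ['nearest', 'linear', 'linear']
default_list = resolve_interp({})            # ['nearest', 'nearest', 'linear']
```

A list of this shape matches what CASA's applycal expects for its interp parameter (one entry per gain table), so it could be handed over in the same order as the tables.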

paoloserra commented 3 years ago

I agree

paoloserra commented 3 years ago

Here's @francescaLoi's G interp nearest (top) vs linear (bottom). This is before phase selfcal. There are some small differences in the clean masks, but I think the main improvement is due to the change in G interpolation.

[Two screenshots: Screenshot 2021-01-22 at 14 14 03 and Screenshot 2021-01-22 at 14 14 21]

landmanbester commented 3 years ago

(we solve for time-independent K and B, so that's fine)

@paoloserra excuse my ignorance but why do you say you solve for time independent K? Isn't delay a function of time?

bennahugo commented 3 years ago

Note that I do nn (nearest) in transfer only. I always, however, solve for rate changes as my second selfcal step. MeerKAT delays change by a few tens of ps over minute timescales, which, as far as I can tell, stems from a lack of proper Earth-motion tracking in the correlator fringe stopping and another unidentified error on the long spacings. This is enough to cause problems (~10 deg of phase variance) on the long spacings. I've not yet had a field which didn't improve from this type of rate slope calibration.

I do use CASA 4.7 to solve, as CASA 5.x flags delay solutions that are perfectly good when solved in 4.7 with the same his weights and flags.

So for most fields I use a 'p,dp,dd' in vermeerkat selfcal. Hope this helps.
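
For a feel of the numbers: a delay τ produces a phase difference of 360° · τ · Δν between two frequencies Δν apart. The snippet below evaluates that relation. The 30 ps delay and the two frequency offsets are illustrative assumptions (half of an 856 MHz band and half of the 107 MHz zoom band mentioned later in the thread), not measurements reported here.

```python
def delay_phase_deg(delay_s, freq_offset_hz):
    """Phase difference (degrees) a delay produces across a frequency offset."""
    return 360.0 * delay_s * freq_offset_hz

# Assumed illustrative values, not taken from any dataset in this thread.
delay = 30e-12                 # 30 ps residual delay
wide_half_band = 428e6         # half of an assumed 856 MHz band
narrow_half_band = 53.5e6      # half of a 107 MHz band

wide = delay_phase_deg(delay, wide_half_band)      # about 4.6 degrees
narrow = delay_phase_deg(delay, narrow_half_band)  # about 0.58 degrees
```

The phase error at the band edge scales linearly with both the delay and the bandwidth, which is why a narrow band is far more forgiving of residual delay errors.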

paoloserra commented 3 years ago

(we solve for time-independent K and B, so that's fine)

@paoloserra excuse my ignorance but why do you say you solve for time independent K? Isn't delay a function of time?

I'm sure I'm the ignorant one here, not you.

I think that if the array is properly set up before starting the observation then K should not be a strong function of time. I'm sure it is a function of time to some level, but when doing my spectral-line science I see no evidence that I need a time-dependent K.

For broad-band continuum, where dynamic range is an issue, we solve for time-dependent delays during self-calibration. However, I must say that doing that did not make a huge difference for our images.

EDIT: My last statement probably means we've got bigger problems than delay changes in our calibration.

o-smirnov commented 3 years ago

In my experience, delay selfcal always improves the broadband images.

but when doing my spectral-line science I see no evidence that I need a time-dependent K.

I wonder though, is that not a happy accident of operating roughly in the middle of the band? What does phase-only selfcal do in the presence of residual delay errors? I think it will minimize phase error at band centre...

bennahugo commented 3 years ago

I think it is more to do with selecting a nice small portion of bandwidth where a time variable K ~= time variable G.

You can see the difference when you start going to > 20 dB in dynamic range for compact bright cluster sources / BCGs, as opposed to µJy/px extended emission.

--

Benjamin Hugo

PhD. student, Centre for Radio Astronomy Techniques and Technologies Department of Physics and Electronics Rhodes University

Junior software developer Radio Astronomy Research Group South African Radio Astronomy Observatory Black River Business Park Observatory Cape Town

o-smirnov commented 3 years ago

Yeah well of course, if you're selfcaling a narrow band, delay is effectively indistinguishable from phase.

I'm thinking about what happens in the wideband case... if you have a phase ramp and you solve for only a phase offset, where is it going to align the phases to minimize chi-sq across the band? I reckon at the [weighted by amplitude] band centre.
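
This can be checked numerically with a toy model: a single baseline, a point source with a power-law spectrum, a pure delay ramp, and a least-squares fit of one phase offset. The band, delay, spectral index and function name below are assumptions made for the sketch. In the small-phase limit the fitted offset works out to the mean of the ramp weighted by weight × model amplitude squared, so the residual ramp crosses zero at the correspondingly weighted mean frequency.

```python
import numpy as np

def best_phase_offset(vis, model, weights):
    """Phase theta minimising sum(w * |vis - exp(1j*theta) * model|**2)."""
    return np.angle(np.sum(weights * vis * np.conj(model)))

# Assumed toy setup.
freqs = np.linspace(856e6, 1712e6, 1024)     # assumed band (Hz)
delay = 20e-12                               # assumed residual delay (s)
amp = (freqs / freqs[0]) ** -0.7             # assumed power-law source spectrum
model = amp.astype(complex)
ref = freqs.mean()
# Data = model times a phase ramp that is zero at mid-band.
vis = model * np.exp(2j * np.pi * delay * (freqs - ref))
weights = np.ones_like(freqs)

theta = best_phase_offset(vis, model, weights)
# Frequency at which the ramp is exactly cancelled by the fitted offset.
f_zero = ref + theta / (2 * np.pi * delay)
# Mean frequency weighted by weight * amplitude**2.
f_weighted = np.sum(weights * amp**2 * freqs) / np.sum(weights * amp**2)
```

In this setup `f_zero` and `f_weighted` agree closely, and both sit below mid-band because the assumed source is brighter at the low-frequency end.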

bennahugo commented 3 years ago

That is correct, I believe, yes.

IanHeywood commented 3 years ago

One argument against using nearest for G and B is that it will put sharp jumps in the target data at the midpoints in time between solutions. This is the sort of thing that can make an autoflagger take an unwarranted interest in perfectly good data.

o-smirnov commented 3 years ago

Good point. Also, if you then do selfcal with a time interval of >1, such a jump can end up in the middle of an interval, and thus can't be corrected.
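
A small sketch of both effects, assuming evenly spaced target integrations and selfcal solution intervals defined as fixed-size groups of timestamps. The function names, scan times and integration time are hypothetical. With nearest interpolation the applied gain switches at the midpoints between solutions, and any selfcal interval that straddles a midpoint sees two different gains.

```python
import numpy as np

def nearest_jump_times(sol_times):
    """Times at which nearest-neighbour interpolation switches solutions."""
    sol_times = np.sort(np.asarray(sol_times, dtype=float))
    return 0.5 * (sol_times[:-1] + sol_times[1:])

def intervals_with_jump(jump_times, data_times, n_per_interval):
    """Indices of solution intervals whose timestamps straddle a jump."""
    data_times = np.asarray(data_times, dtype=float)
    hit = []
    for start in range(0, len(data_times), n_per_interval):
        chunk = data_times[start:start + n_per_interval]
        # A jump strictly inside the chunk means two gains were applied in it.
        if np.any((jump_times > chunk[0]) & (jump_times < chunk[-1])):
            hit.append(start // n_per_interval)
    return hit

cal_times = [0.0, 600.0, 1200.0]               # assumed calibrator scan times (s)
target_times = np.arange(60.0, 1200.0, 8.0)    # assumed 8 s integrations
jumps = nearest_jump_times(cal_times)          # array([300., 900.])
bad = intervals_with_jump(jumps, target_times, n_per_interval=4)   # [7, 26]
single = intervals_with_jump(jumps, target_times, n_per_interval=1)  # []
```

With one timestamp per solution interval no interval can contain a jump, which matches the remark that the problem only appears for intervals longer than one integration.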

paoloserra commented 3 years ago

sharp jumps in the target data

That was my main worry indeed. In most datasets I do phase-only selfcal, so these silly amplitude jumps and errors are never going to be fixed. I believe the example images I posted above are an example of that.

paoloserra commented 3 years ago

I wonder though, is that not a happy accident of operating roughly in the middle of the band? What does phase-only selfcal do in the presence of residual delay errors? I think it will minimize phase error at band centre...

I'm actually working with the full zoom band, which is 32k channels over 107 MHz bandwidth. So I think in my case delay errors do not do much damage, not because I'm working in a small, central region of a wider band, but because my bandwidth is small to begin with.

When it comes to our broad-band work, as I said we selfcalibrate time-dependent K.