BDA: first impressions - Githubissues

o-smirnov commented 4 years ago

So I'm trying this on some real MeerKAT data. I've already done a sanity check with a calibrator (point source at centre), and it seemed to give sensible results. Time for a real field:

xova bda --decorrelation 0.98 ../msdir/1557766852_sdp_l0_1284.full_pol-CGCG044_046-corr1.ms --dc DATA --force -o avg-044-098.ms
xova bda --decorrelation 0.99 ../msdir/1557766852_sdp_l0_1284.full_pol-CGCG044_046-corr1.ms --dc DATA --force -o avg-044-099.ms

Only the 98% case tested so far. Data reduction is 455G -> 2.5G.

DDFacet got to work with it with minimal fuss (made some very minor fixes in the I/O layer). WSClean falls over on the ragged SPECTRAL_WINDOW table structure, gonna be in Andre's ear about that or maybe will just try to patch.

The dirties look comparable:

PSFs look virtually identical:

[x] Sadly, DDFacet HMP clean fell over at first go with an SVD error (on the BDA MS). Ran to completion with the non-BDA MS, so that's the first difference noted.
[ ] Will repeat this at 99% decorr BDA
[ ] Will repeat this with plain Hogbom clean

Well, it would be very suspicious indeed if everything worked on the first go! Overall, great job @sjperkins, it seems to be doing the right thing in principle -- now it's up to @smasoka to explore the subtle effects.

sjperkins commented 4 years ago

Thanks for doing the initial verification on this @o-smirnov.

The sources in the dirties seem more prominent to me:

[image: dirties]

Hopefully the deconvolution improves things with the workaround for #14.

o-smirnov commented 4 years ago

Too prominent perhaps!

The workaround helped. --decorrelation 0.99 --min-nchan 16 gives me a 4.3G MS, and deconvolution runs through.

Sadly, there's an interesting artefact. All the sources have had their hair catch fire:

[images: BDA (left), no BDA (right)]

The bright "afterburner" artefacts all seem to be radial, in a direction away from centre.

Time to break out the trusty "point source off centre" simulation!

bennahugo commented 4 years ago

It looks like there are large phase errors (negative holes) on the point-like sources as well. Does this indicate excessive smearing on the long spacings? Can you try running fixvis in CASA to recompute the UVW coordinates? I have my suspicions about the simple geometric averaging being applied in xova.

On Mon, Aug 3, 2020 at 11:07 AM Oleg Smirnov notifications@github.com wrote:

Too prominent perhaps!


--

Benjamin Hugo

PhD student, Centre for Radio Astronomy Techniques and Technologies, Department of Physics and Electronics, Rhodes University

Junior software developer, Radio Astronomy Research Group, South African Radio Astronomy Observatory, Black River Business Park, Observatory, Cape Town

o-smirnov commented 4 years ago

The holes on the right? There's no BDA in the image on the right.

bennahugo commented 4 years ago

OK, then I guess the data is poorly calibrated to start with. Could you still check the UVW coordinates? Large averaging intervals on the short spacings may still be causing havoc with the coordinates.

I do however think this calls for simulation to take calibration errors out of the equation.


sjperkins commented 4 years ago

Is that time or frequency smearing?

I wonder about the following code for recalculating the frequencies.

It:

1. Calculates the start and end of the band
2. Adds/subtracts half the new channel width from the above
3. Produces a np.linspace from the above two values

        start = chan_freqs[spw][0] - chan_widths[spw][0] / 2
        end = chan_freqs[spw][-1] + chan_widths[spw][-1] / 2

        bandwidth = chan_widths[spw].sum()  # Maybe TOTAL_BANDWIDTH?
        cw = np.full(nchan, bandwidth / nchan)
        cf = np.linspace(start - cw[0] / 2,
                         end + cw[-1] / 2,
                         nchan)

Note that the code doesn't handle negative channel widths/frequencies yet and will fall over if it detects them.
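
For reference, here is a minimal standalone sketch of the regridding described above. It is not the xova implementation: the helper name `regrid_channels` is hypothetical, and it assumes positive, contiguous channel widths. In this sketch the half channel width is added to the lower band edge and subtracted from the upper one, so that the new centres sit inside the band; the signs in the snippet above are worth checking against this.

```python
import numpy as np


def regrid_channels(chan_freq, chan_width, nchan):
    """Return (centres, widths) for nchan equal channels spanning the same band.

    Hypothetical helper; assumes positive, contiguous channel widths.
    """
    chan_freq = np.asarray(chan_freq, dtype=np.float64)
    chan_width = np.asarray(chan_width, dtype=np.float64)
    if np.any(chan_width <= 0):
        raise ValueError("negative or zero channel widths are not handled")

    # Band edges: outer edges of the first and last input channels
    start = chan_freq[0] - chan_width[0] / 2
    end = chan_freq[-1] + chan_width[-1] / 2

    # New, equal channel widths covering the summed bandwidth
    cw = np.full(nchan, chan_width.sum() / nchan)
    # New centres lie half a new channel inside each band edge
    cf = np.linspace(start + cw[0] / 2, end - cw[-1] / 2, nchan)
    return cf, cw


# Toy band: 8 channels of 1 MHz starting at 1 GHz
freqs = 1.0e9 + 1.0e6 * np.arange(8)
widths = np.full(8, 1.0e6)
cf4, cw4 = regrid_channels(freqs, widths, 4)
cf8, cw8 = regrid_channels(freqs, widths, 8)
```

A quick sanity check on such a helper is that asking for the original number of channels returns the original CHAN_FREQ, and that halving the channel count returns the mean of each adjacent pair.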

o-smirnov commented 4 years ago

It's quite possible the data is miscalibrated. I said "hey, shiny", and grabbed any old dataset. This was just a sanity check, remember.

I completely agree, now is the time to "break out the trusty 'point source off centre' simulation!" and not come back to real data until we understand what that does.

I'll do a quick check with fixvis, just because it's easy.

sjperkins commented 4 years ago


Just pen and papered this. It produces the right results if the frequencies and channel widths are laid out correctly, by which I mean:

- CHAN_WIDTH values are all equal
- CHAN_FREQ values lie exactly at the midpoint of their respective CHAN_WIDTH

I suppose there can be exotic channelisations out there but from past conversations they're as likely as a Yeti?
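
The two conditions above could be checked up front with something like the sketch below. The function name `is_regular_channelisation` and the tolerance are assumptions, and the midpoint condition is interpreted here as "channels are contiguous, so neighbouring centres are exactly one channel width apart".

```python
import numpy as np


def is_regular_channelisation(chan_freq, chan_width, rtol=1e-9):
    """True if all widths are equal and centres are one width apart.

    Hypothetical helper; expects 1-D CHAN_FREQ and CHAN_WIDTH arrays
    for a single spectral window.
    """
    chan_freq = np.asarray(chan_freq, dtype=np.float64)
    chan_width = np.asarray(chan_width, dtype=np.float64)

    # Condition 1: every channel has the same width
    equal_widths = np.allclose(chan_width, chan_width[0], rtol=rtol, atol=0.0)
    # Condition 2: centres of contiguous, equal channels differ by one width
    spacing_ok = np.allclose(np.diff(chan_freq), chan_width[0],
                             rtol=rtol, atol=0.0)
    return bool(equal_widths and spacing_ok)


regular = is_regular_channelisation(1.0e9 + 1.0e6 * np.arange(8),
                                    np.full(8, 1.0e6))
# A band whose last centre is shifted by a tenth of a channel
shifted = 1.0e9 + 1.0e6 * np.arange(8)
shifted[-1] += 1.0e5
irregular = is_regular_channelisation(shifted, np.full(8, 1.0e6))
```

Anything that fails this check would be one of the exotic channelisations, and the regridding should probably refuse to proceed rather than guess.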

o-smirnov commented 4 years ago

CHAN_WIDTH is effective channel width, strictly speaking. So if something like Hanning tapering was applied, it might not be equal to total bandwidth / nchan. I think I would just use TOTAL_BANDWIDTH...

o-smirnov commented 4 years ago

P.S. @bennahugo nice try, but fixvis didn't do much (if anything). @sjperkins: this is clearly frequency-related (radial artefacts almost always are). Simulations time!

IanHeywood commented 4 years ago

if something like Hanning tapering was applied

Note that if you're doing Hanning smoothing with CASA then it won't adjust the main table of the MS to reflect it. AFAIK the correct thing to do is to adjust CHAN_WIDTH and drop every other channel (the added bonus being a halving of the data volume). CASA claims that the channel weights are adjusted instead, but given its track record on that front I'd be wary.
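
As an illustration of the "smooth, then drop every other channel" approach, here is a minimal NumPy sketch. It is not CASA's implementation: the function name is hypothetical, flags and weights are ignored, the edge channels are simply discarded, and recording the new CHAN_WIDTH as twice the old one (the new channel spacing) is an assumption.

```python
import numpy as np


def hanning_smooth_and_decimate(vis, chan_freq, chan_width):
    """Hanning-smooth along the last (channel) axis, then keep every other channel.

    vis has shape (..., nchan). Hypothetical helper for illustration.
    """
    vis = np.asarray(vis, dtype=np.complex128)
    smoothed = vis.copy()
    # Three-point Hanning kernel (0.25, 0.5, 0.25) on the interior channels
    smoothed[..., 1:-1] = (0.25 * vis[..., :-2]
                           + 0.5 * vis[..., 1:-1]
                           + 0.25 * vis[..., 2:])
    # Keep odd-indexed interior channels, all of which are fully smoothed
    keep = slice(1, vis.shape[-1] - 1, 2)
    new_freq = np.asarray(chan_freq, dtype=np.float64)[keep]
    # Assumption: the new width is the new channel spacing (twice the old)
    new_width = 2.0 * np.asarray(chan_width, dtype=np.float64)[keep]
    return smoothed[..., keep], new_freq, new_width


# Toy data: an impulse in channel 3 of an 8-channel band
freqs = 1.0e9 + 1.0e6 * np.arange(8)
widths = np.full(8, 1.0e6)
vis = np.zeros((2, 8), dtype=np.complex128)
vis[:, 3] = 1.0
out_vis, out_freq, out_width = hanning_smooth_and_decimate(vis, freqs, widths)
```

With 8 input channels this keeps channels 1, 3 and 5, so the data volume is roughly halved and the SPECTRAL_WINDOW entries stay consistent with the main table.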

sjperkins commented 4 years ago

Quoting from MSV2.0 SPECTRAL_WINDOW, it looks like CHAN_WIDTH and EFFECTIVE_BW/RESOLUTION are respectively the nominal and effective bandwidths?

CHAN_FREQ Center frequencies for each channel in the data matrix. These can be frequency-dependent, to accommodate instruments such as acousto-optical spectrometers. Note that the channel frequencies may be in ascending or descending frequency order.

CHAN_WIDTH Nominal channel width of each spectral channel. Although these can be derived from CHAN_FREQ by differencing, it is more efficient to keep a separate reference to this information.

MEAS_FREQ_REF Frequency Measure reference for CHAN_FREQ. This allows a row-based reference for this column in order to optimize the choice of Measure reference when Doppler tracking is used. Modified only by the MS access code.

EFFECTIVE_BW The effective noise bandwidth of each spectral channel.

RESOLUTION The effective spectral resolution of each channel.

It sounds like practice may deviate from the definition in regard to CHAN_WIDTH!

o-smirnov commented 4 years ago

I'd still just use TOTAL_BANDWIDTH then. But it will give the same results as your code, so this is not the issue. Anyway, let's start a separate issue for channel/bandwidth definitions before we hijack this one completely.

sjperkins commented 4 years ago

OK, the bulk of the "figuring out the number of channelisations per bin" code is here:

https://github.com/ska-sa/codex-africanus/pull/173/files#diff-84a51c171d31f55dbfc51aa18aa15f3aR214-R254

What may be suspicious is working backwards from the fractional bandwidth to the maximum change in bandwidth before decorrelation in frequency occurs:

https://github.com/ska-sa/codex-africanus/pull/173/files#diff-84a51c171d31f55dbfc51aa18aa15f3aR68-R83

As fractional bandwidth was used in the PSF paper by @atemkeng, I went with the technical definition, but perhaps I could just multiply fractional_bandwidth by total_bandwidth?

Also, l0 in the paper was the position of the off-centre source and technically defined in terms of an l and m coordinate. I think this is related to lm_max in DDFacet code.

I've just put together a small change that converts l+m to lm_max: https://github.com/ska-sa/codex-africanus/pull/213/files. Haven't merged it; it's just for perusal's sake.

Those are the potential niggle areas that I can think of.
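
For comparison, the textbook version of "working backwards" from a decorrelation factor to a maximum channel width can be sketched as follows. This is not the codex-africanus code: it assumes a rectangular bandpass and a source offset `l` measured along the baseline (the worst case), and the function names and example numbers are purely illustrative.

```python
import numpy as np

C = 299792458.0  # speed of light in m/s


def smearing_attenuation(delta_nu, baseline, l):
    """Amplitude retained after averaging over delta_nu (Hz).

    baseline is in metres; l is the source offset (direction cosine)
    along the baseline. Assumes a rectangular bandpass.
    """
    # np.sinc(x) is sin(pi x) / (pi x)
    return np.sinc(delta_nu * baseline * l / C)


def max_channel_width(decorrelation, baseline, l, tol=1e-12):
    """Largest delta_nu (Hz) that keeps the attenuation >= decorrelation."""
    if not 0.0 < decorrelation < 1.0:
        raise ValueError("decorrelation must lie strictly between 0 and 1")
    # sinc falls monotonically from 1 to 0 on [0, 1], so bisection is safe
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if np.sinc(mid) >= decorrelation:
            lo = mid
        else:
            hi = mid
    return lo * C / (baseline * l)


# Illustrative numbers only: 0.5 degree offset, 8 km and 500 m baselines
l_off = np.sin(np.deg2rad(0.5))
dnu_long = max_channel_width(0.98, 8000.0, l_off)
dnu_short = max_channel_width(0.98, 500.0, l_off)
```

In this model the allowed channel width scales inversely with baseline length, so the 500 m baseline tolerates channels 16 times wider than the 8 km one at the same decorrelation. That scaling is a useful cross-check on whatever number of channels per bin the BDA code arrives at.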

ratt-ru / xova

BDA: first impressions #13
