Closed paoloserra closed 11 months ago
Are you running through cubical or meerkathi? I don't even understand why a --g-xfer-from works for you when it crashes for me.
Also, your --g-freq-int 0 means all frequencies are calibrated as a single gain but doesn't that collide with the default data-freq-chunk?
Are you running through cubical or meerkathi? I don't even understand why a --g-xfer-from works for you when it crashes for me.
Within meerkathi. Does it mean you've never been able to run transfer_apply_gains ?
Also, your --g-freq-int 0 means all frequencies are calibrated as a single gain but doesn't that collide with the default data-freq-chunk?
Yes, I've solved for freq-independent gains. What is the default data-freq-chunk? Maybe cubical can handle conflicting settings there?
BTW calibrating with a shorter --g-time-int reduced the padding along the time axis in the fine channel dataset. So I suspect that those zeros are the first and last solution interval, although I'm not sure why the visibilities should be set to zero there.
@PeterKamphuis
$ stimela cabs -i cubical
...
Name data-freq-chunk
Description Chunk data by this number of channels. See time-chunk for info. 0 means full frequency axis.
Type int
Default None
Within meerkathi. Does it mean you've never been able to run transfer_apply_gains ?
No I haven't but then I have 512 channels typically. With the latest stimela it is not working, that is actually what is currently holding up the merge of the branch Two_Scale_Clean with the overhaul of Jones matrices.
Yes, I've solved for freq-independent gains. What is the default data-freq-chunk? Maybe cubical can handle conflicting settings there?
Well is None == 0? Because the cubical docs say
--data-freq-chunk=string
| Data will be cut up into blocks containing this many channels. This limits the amount of data >processed at once. Smaller chunks allow for a smaller RAM footprint and greater parallelism, but >this sets an upper limit on the solution intervals that may be employed. Specify as an integer >number of channels, or a value with a unit (e.g. ‘128MHz’). 0 means use full frequency axis. >Default: 32.
In my branch I set the freq-chunk explicitly and I'm looking at very blocked corrected data column. But clearly from your images that issue is not in your version of cubical, I'm looking in to it but for now I'm just mostly confused.
@PeterKamphuis judging from the cubical log messages you are right, the default frequency chunk is 32.
In case that makes you feel better, my last run of transfer_apply_gain gave me an error. No clue why since I have not changed the way I'm running it... (all good, not sure what I had done wrong but I've rerun this multiple times and always successfully; so it would still be good to understand why xfer is not working for you @PeterKamphuis )
@PeterKamphuis I last pulled and built Stimela master (1.2.3) on 27 Sep. As far as I can see there's been no changes in Stimela's cubical since then.
@paoloserra Apparently I have an issue there then cause apparently I am running version 1.2.2 installed on 11 Sep. Maybe I am running the old version? Let me update everything.
@paoloserra @SpheMakh on github the latest release is 1.2.2 so 1.2.3 is not the master? Are you running python 2 or 3 BTW cause I'm running 3.6. Does that matter for stimela?
@paoloserra @SpheMakh on github the latest release is 1.2.2 so 1.2.3 is not the master?
1.2.3 is the master version (it will be the next release). But I did not update cubical in that version
@SpheMakh so I need to do pip install and then pull and build to get that right?
@SpheMakh so I need to do pip install and then pull and build to get that right?
Yes. But pip install from github
If I'm not mistaken the gain-transfer patch is not yet in any cubical release: https://github.com/ratt-ru/CubiCal/pull/316. If you can confirm that this works for you it can be merged into the next release.
@bennahugo yes but for @paoloserra it works so first figuring out why and whether this is a still existing problem. That patch does not work as it should as it does not seem to apply the gains.
Can you elaborate? I think what you are seeing above could be a mismatch in gain settings between the two rounds of calibration. Are you sure that your gain solutions (and chunks) are the same between the solve and apply steps?
@bennahugo In my case the xfer-from just does not work, i.e. stimela crashes as explained in the Issue https://github.com/ratt-ru/CubiCal/issues/315. I am sure that I have used the same settings. For @paoloserra this does not seem to happen and I am trying to figure out why. I do not know whether @paoloserra has explicitly set all chunks and intervals to be the same.
I'm looking into that now. I think the meerkathi way of setting up cubical can be improved. I'll soon make a PR with some proposed changes.
@paoloserra yes I have done that but waiting to solve this issue before merging that branch.
That's OK, but I may make some minimal changes anyway just to allow sensible settings. Will ask for your review.
So am I correct in thinking that time-chunk should be a multiple of time-int? And that time-int = 0 is the same as setting time-int = time-chunk?
(And the same for the frequency axis.)
That is my understanding as well.
Alright I will have a look at the transfer code in cubical - my newest slope + phase selfcal chaining also use it because you cannot solve for a parameterized solution and a normal gain in the same chain unfortunately. As for the time tile size - ideally it is a multiple of the interval yes, ultimately the tile bounds the solution interval. However be careful (and consistent) with your max-chunks and min-chunks settings. Cubical by default is greedy and will adjust the tiles based on the size of the process pool, so your chunk size could be different.
So am I correct in thinking that time-chunk should be a multiple of time-int? And that time-int = 0 is the same as setting time-int = time-chunk?
(And the same for the frequency axis.)
Yes. @o-smirnov can you confirm
I confirm.
Cheers, Oleg
Sent from my phone. Quality of spelling inversely proportional to finger size.
On Tue, 22 Oct 2019, 21:23 Sphesihle Makhathini, notifications@github.com wrote:
So am I correct in thinking that time-chunk should be a multiple of time-int? And that time-int = 0 is the same as setting time-int = time-chunk?
(And the same for the frequency axis.)
Yes. @o-smirnov https://github.com/o-smirnov can you confirm
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/ska-sa/meerkathi/issues/627?email_source=notifications&email_token=ABRLTP2OAHRKY7NM7EWOITTQP5HKLA5CNFSM4JBME3BKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEB64VFQ#issuecomment-545114774, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABRLTP6EYXJOZZZ7PNCUVK3QP5HKLANCNFSM4JBME3BA .
Following https://github.com/ska-sa/meerkathi/pull/636 I have self-calibrated a 95 fat-channels .MS with;
time-chunk ........................................ = 128
freq-chunk ........................................ = 0
time-int .......................................... = 16
freq-int .......................................... = 0
So I solved for frequency-independent gains every 2m8s, with 8 independent solutions per time chunk.
I have then successfully transferred and applied the gains to a 478 fine-channels .MS with
time-chunk ........................................ = 16
freq-chunk ........................................ = 0
time-int .......................................... = 1
freq-int .......................................... = 1
where I reduced time-chunk from 128 to 16 not to run out of memory.
I've noticed that for the cubical gain-transfer run, time-int and freq-int are not copied from the calibration settings, but are left unset in MeerKATHI and take a value of 1. See https://github.com/ska-sa/meerkathi/blob/eeb5a491fc0bdf607be2fef7bc74b89264a7ae4e/meerkathi/workers/self_cal_worker.py#L1084 and following lines. I would have thought that time-int = 16 and freq-int = 0 (same as in the calibration run) would have made more sense. Can anybody comment?
Anyway, I no longer see the zero-ed edges I reported above. So that's good news. Maybe my problem happened because of a mismatch as suggested by @bennahugo . In that case, I think MeerKATHI should be improved not to allow such mismatch.
@PeterKamphuis problem with xfer remains to be solved. Hopefully it goes away with Stimela 1.2.3.
I would have thought that time-int = 16 and freq-int = 0 (same as in the calibration run) would have made more sense. Can anybody comment?
In general should be fine (as long as there is no time averaging involved?). Do we need to set these two parameters when transferring the gains at all, though? @bennahugo , @SpheMakh , @o-smirnov could you please confirm?
as long as there is no time averaging involved
in my case I only averaged in freq, and then transferred the gains to the non-freq-averaged .MS
When running in apply (xfer-from) mode the time-int and freq-int will be the "resolution" at which the pre-computed gains are interpolated. I had good results with 1 and 1 for both these parameters when loading coarse channel gains onto fine channel MeerKAT-16 data, but that was a long time ago and I haven't tried it since. I would guess that ultimately the performance of this trick will depend on the SNR of your gain solutions.
I was also using ar
not ac
to subtract the continuum at the same time.
@paoloserra Well the error I get must be something in my settings as the master runs fine when I transfer_apply_gains.
@PeterKamphuis that's good news.
When running in apply (xfer-from) mode the time-int and freq-int will be the "resolution" at which the pre-computed gains are interpolated.
OK, so in case I don't want to interpolate but simply copy and apply the frequency-independent gains from a fat-channel to a fine-channel .MS using 1,1 is wrong.
@paoloserra If the interpolation is done correctly it should not make a difference right? The interpolation of a single number would be the same for all frequencies. Or am I missing something?
It will still interpolate in time. That might be OK, but I think we should have control over that.
Ah yes, but the time intervals are the same between the two sets so just setting that to the int of the calculated solution should not be an issue.
Not sure I understand what you mean. I've solved with time-int 16. If I transfer with time-int 1 (current hardcoded behaviour in MeerKATHI) I think cubical will interpolate in time. I see no reason to do this, and anyway the user should be in control.
It will still interpolate in time.
At the risk of stating the obvious, freq interpolation is I think also on by default, although I think with your settings it will not do anything as you've got a single value for the entire frequency band.
That might be OK, but I think we should have control over that.
I agree. If you're doing time/freq re-binning then these parameters also need exposing to the user, as in that mode it has to be a solve-and-save followed by a load-and-apply operation.
@paoloserra Yes I meant to set the time-int to something else than 1. So user controlled. I can work this in on Monday with a default setting of 1,1.
@PeterKamphuis please check with @KshitijT as I think he was also planning to make some changes there.
@PeterKamphuis please check with @KshitijT as I think he was also planning to make some changes there.
@PeterKamphuis , I got this, I anyway want to add a few more things in the selfcal options - unless you want to implement some other stuff as well.
Not sure I understand what you mean. I've solved with time-int 16. If I transfer with time-int 1 (current hardcoded behaviour in MeerKATHI) I think cubical will interpolate in time.
Correct.
If you want to apply exactly the same gains (i.e. "stepwise" in time/freq, same gain value over each original solution interval), then the chunks/intervals in the apply step must exactly match those at the solve step. If you're trying to transfer solutions from "fat channels" to "thin channels", you'll need to multiply freq-int
by whatever the "fattening factor" is to ensure that exact match.
In this scenario, I would also recommend using load-from
and not xfer-from
. The difference is that load-from
never interpolates, and complains if there is a mismatch in the grids. Which is I think what you want here -- if you get it wrong, you want to see an error message, not quiet behind-the-scenes interpolation.
@o-smirnov Are you saying that load-from is still preferred when the datasets (used for calibration vs applying to) have different resolutions? I thought that would not work?
And when we just apply does that still mean the chunks need to be larger than the intervals? That could lead to memory problems when we have large intervals on averaged datasets to create the calibration and that are subsequently applied to the full data set. Or not?
Are you saying that load-from is still preferred when the datasets (used for calibration vs applying to) have different resolutions? I thought that would not work?
It will work if the resolutions are exact multiples of each other, and you incorporate the multiples into your intervals accordingly (or use physical units -- i.e. seconds and MHz -- to specify intervals and chunks).
The exact logic is as follows:
Split the data into chunks as per time-chunk
and freq-chunk
(and chunk-by
)
Split each chunk into intervals as per time-int
and freq-int
Compute the time/freq midpoint of each interval
Lookup this time/freq point in the solution table:
for load-from
, load on exact match, else complain
for xfer-from
, try to interpolate
So as long as you end up with the identical set of intervals after steps 1 and 2, load-from
should work. As long as the chunks are multiples (or equal to) the original intervals, it should work. But if the chunks get smaller, then of course it stops working.
@KshitijT Well I was working on a restore function that also does the gains. In anycase if you already have it I'll wait for that and incorporate it.
@paoloserra @KshitijT @o-smirnov So I traced my error back to having the matrix-type set to phase-diag, which is what it was set to when creating the solutions but apparently that does not work when applying them.
@KshitijT as part of your changes can you please make the transfer_apply_gains settings mentioned in @o-smirnov 's posts available to users? One option would be to have something like;
transfer_apply_gains:
enable: true
transfer_to_label: corr
time-chunk: 128
freq_chunk: 0
interpolate:
enable: true
time-int: 16
freq-int: 0
where cubical runs with load-from
or xfer-from
depending on whether interpolate
is enabled. In case it is disabled, time-int and freq-int should be automatically set to be consistent with the values used when calculating the gains.
Yup, will do.
@o-smirnov I was wondering, since we do not transfer the cubical flags to the high resolution datasets should we not always use xfer-from when transferring the gains to avoid have blocks that are not self called?
Good point @PeterKamphuis. If you have fully flagged time/freq intervals for some antennas in the low-res dataset, there will be no gain solutions for those intervals, and --load-from
will load nothing (leaving you with a unity gain). --xfer-from
will at least try to interpolate something sensible from the surrounding valid blocks.
So conceptually, if the high-res flags are allowed to be a subset of the low-res flags, then there's no way to avoid interpolation entirely... not sure what the best way forward then is. Opinions?
I don't know if we should use on Cubical flags transfer to high-res data (since for example it would depend on low SNR, solution intervals etc).
The original idea behind transfer-gains was to estimate gains from (possibly broadband) continuum data and apply it to the "HI" data. In cases of strong HI sources this would imply flagging them out in the "continuum" data so that the HI emission is not absorbed in the gains and then safely apply the estimated gains to the hi-res data (with HI not flagged, and not necessarily broadband - of course this depends on knowing a-priori where the HI source(s) is) and go happily to make an HI image.
I wonder what will happen in such a case if the solution interval in frequency is small enough to be comparable to the HI source width - we'll end up with unity gains if we use --load-from, right? Then I would say --xfer-from is better.
Following the calibration run:
Running: gocubical --sol-jones G --g-clip-high 2.0 --weight-column WEIGHT --out-overwrite 1 --dist-ncpu 8 --data-time-chunk 128 --madmax-threshold 0,10 --g-time-int 120 --data-ms /home/pserra/msdir/jw00_mk64_4k-target_crosscalavg.ms --out-casa-gaintables 1 --model-list MODEL_DATA --data-column DATA --g-solvable 1 --g-clip-low 0.5 --model-ddes auto --dist-max-chunks 0 --madmax-enable 0 --dist-nworker 0 --bbc-save-to /home/pserra/output/continuum/selfcal_products/bbc-gains-2-jw00_mk64_4k-target_crosscalavg.parmdb --madmax-estimate corr --g-save-to /home/pserra/output/continuum/selfcal_products/g-gains-2-jw00_mk64_4k-target_crosscalavg.parmdb --out-plots 1 --madmax-plot 0 --out-name /home/pserra/output/continuum/selfcal_products/kh_jw00_mk64_4k-target_crosscalavg_2_cubical --montblanc-dtype float --g-type phase-diag --sol-term-iters 5,0 --g-freq-int 0 --out-mode sc
I transfer and apply the gains to the fine channel dataset with:
Running: gocubical --sol-jones G --montblanc-dtype float --dist-nworker 0 --bbc-save-to file --weight-column WEIGHT --out-mode ac --sol-term-iters 5,0 --out-plots 1 --out-overwrite 1 --data-ms /home/pserra/msdir/jw00_mk64_4k-target_crosscal.ms --g-xfer-from /home/pserra/output/continuum/selfcal_products/g-gains-2-jw00_mk64_4k-target_crosscalavg.parmdb --data-column DATA --data-time-chunk 128 --out-name /home/pserra/output/continuum/selfcal_products/jw00_mk64_4k-jw00_mk64_4k-target_crosscal.ms_2_cubical --dist-ncpu 8 --out-casa-gaintables 0 --model-ddes auto --dist-max-chunks 0
All this is using the standard MeerKATHI settings.
The resulting CORRECTED_DATA is padded with zeros both in time and frequency while DATA was not (see attached images). I don't know why. Any idea? Thanks!