caracal-pipeline / caracal

Containerized Automated Radio Astronomy Calibration (CARACal) pipeline
GNU General Public License v2.0

Concatenating MS: Table column 'unknown' error #1578

Open kendak333 opened 2 months ago

kendak333 commented 2 months ago

I've processed two epochs of data on a field separately using caracal, up to selfcal. I'm trying to concatenate the files so that I can test additional calibration on the combined data before imaging. However, when I try to do this using CASA's concat task, I get the following error:

RuntimeError: Table column SIGMA_SPECTRUM is unknown

Looking into the CASA documentation, it does seem to know that SIGMA_SPECTRUM is a valid column name. I got no help from them, so I'm wondering if it's a caracal-processing thing?

KshitijT commented 2 months ago

@kendak333 is this standalone casa outside caracal or is this the concat using the transform worker?

kendak333 commented 2 months ago

@KshitijT The error is from standalone casa (versions 5.6 and 6.5). When I use the caracal transform worker

transform:
  enable: true
  split_field:
    enable: false
  concat:
    enable: true
    col: corrected

I get a validation fail for the yaml:

transform: - Key 'col' was not defined. Path: '/transform/concat'

The readthedocs says 'col' is a string, and I've tried it both with and without quotes. I've tried 'all' as well.

kendak333 commented 2 months ago

Update: the yaml is only validated if I switch 'split_field' to true, but then it complains because it, quite rightly, can't find a gaincal (this is a target-only dataset, which I'm wanting to combine with a different epoch of the same target).

In doing some more sleuthing, it seems there was a change in MeerKAT MS design, from SIGMA_SPECTRUM to WEIGHT_SPECTRUM, which is possibly why standalone CASA is spitting out the error (the two epochs are two years apart). I'll probably have to fix this before I try concat-ing by any method.

KshitijT commented 2 months ago

I am not sure if CASA should complain as long as both MSs have the same columns? In the initial reductions of the datasets (I am assuming with CARACal) did you switch on the specweights field in the prep worker?
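One way to check whether both MSs really have the same columns is to list the main-table columns of each and compare them. The following is a minimal sketch, assuming python-casacore is installed; the helper names `ms_columns` and `compare_columns` and the MS paths in the usage comment are hypothetical and not part of CARACal or CASA.

```python
def ms_columns(ms_path):
    """Return the set of main-table column names of a Measurement Set."""
    # Imported lazily so compare_columns() works without python-casacore.
    from casacore.tables import table

    t = table(ms_path, ack=False)
    try:
        return set(t.colnames())
    finally:
        t.close()


def compare_columns(cols_a, cols_b):
    """Return (only_in_a, only_in_b) as sorted lists of column names."""
    a, b = set(cols_a), set(cols_b)
    return sorted(a - b), sorted(b - a)


# Usage (hypothetical paths):
#   only_a, only_b = compare_columns(ms_columns("epoch1.ms"),
#                                    ms_columns("epoch2.ms"))
#   print("only in epoch1:", only_a)
#   print("only in epoch2:", only_b)
```

If SIGMA_SPECTRUM appears in only one of the two lists, the MSs do not share the same set of columns.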

kendak333 commented 2 months ago

Yes - the extract from the prep worker below:

specweights:
  enable: True
  mode: uniform
  calculate:
    statsfile: use_package_meerkat_spec
    weightcols: WEIGHT, WEIGHT_SPECTRUM
    noisecols: SIGMA, SIGMA_SPECTRUM
    apply: True

I've just tried to concatenate two epochs observed during the same project (one day apart), and am getting the same "SIGMA_SPECTRUM unknown" error. I'm checking with a colleague who processed other epochs of the same set and managed to concat them without issue.

paoloserra commented 2 months ago

Hi @kendak333

If I understand correctly, you're interested in concatenation along the time axis, which CARACal does not support. CARACal concatenates along the spectral axis only. See the description at https://caracal.readthedocs.io/en/latest/manual/workers/transform/index.html#concat .

I have a question though. Do you really need to concatenate? In CARACal, you should be able to image N input MS jointly (as long as the target is the same), and thus get a better sky model, which you can use to calibrate the individual MS files. Why do you need to concatenate them?

kendak333 commented 2 months ago

@paoloserra Ok thanks - I did notice eventually that the caracal concat won't do time axis.

Regarding the need or not to concatenate, I've made a combined image using both epochs, and the artefacts are more prominent in that than the individual ones. I've had good success with other fields by combining epochs and running the selfcal worker on the combined ms, so was hoping to do the same here. I'll look into what you suggest though.

As an aside for others looking at this thread - the concat apparently isn't working (with just the CASA concat) because the time-averaging is different in the two epochs (2s vs 4s). I have a weird hurdle where, after averaging the 2s to 4s, one scan does have 4s averaging, and the other inexplicably has 3.99s, so it still won't combine! :)
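For anyone wanting to spot this kind of mismatch, the integration times can be inspected per scan from the INTERVAL and SCAN_NUMBER columns of the MS main table. This is a rough sketch, assuming python-casacore is available; the helper names and the rounding to three decimals are illustrative choices, not how CASA itself compares the values.

```python
from collections import defaultdict


def intervals_by_scan(scan_numbers, intervals, decimals=3):
    """Map each scan number to its sorted distinct integration times (s)."""
    per_scan = defaultdict(set)
    for scan, dt in zip(scan_numbers, intervals):
        # Round so that floating-point noise does not create spurious values.
        per_scan[int(scan)].add(round(float(dt), decimals))
    return {scan: sorted(vals) for scan, vals in sorted(per_scan.items())}


def ms_intervals_by_scan(ms_path):
    """Read SCAN_NUMBER and INTERVAL from an MS and summarise them per scan."""
    # Imported lazily so intervals_by_scan() works without python-casacore.
    from casacore.tables import table

    t = table(ms_path, ack=False)
    try:
        return intervals_by_scan(t.getcol("SCAN_NUMBER"), t.getcol("INTERVAL"))
    finally:
        t.close()


# Usage (hypothetical path):
#   print(ms_intervals_by_scan("epoch1_avg.ms"))
```

A scan that reports 3.99 while the others report 4.0 would show up directly in the returned dictionary.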

paoloserra commented 2 months ago

Yeah, I've had some pain with such minor metadata differences also wrt the channel width of FITS cubes -- not nice.

But back to the selfcal results, do you have any ideas about why joint imaging gives you worse artefacts? Was that in CARACal?

kendak333 commented 2 months ago

It might be the case that they're there in (one of) the single-epoch images but below/near the noise, and the improvement in noise from the combination makes them pop up more starkly, though I'm not sure that'd explain the level I'm seeing - see below. The combined image is from sending the corrected data from both MSs into wsclean, with no additional calibration applied.

[Image: combined_image_test]

Note that no DD calibration has been applied at any stage so far.
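For readers who want to reproduce this kind of joint imaging, wsclean accepts several MSs on one command line and images them together. The sketch below only builds such a command; the MS paths, the output name, and the helper `wsclean_joint_command` are hypothetical, and the actual imaging parameters used in this thread (image size, pixel scale, number of iterations, etc.) are not known, so they are left to `extra_args`.

```python
import shlex


def wsclean_joint_command(ms_paths, name, data_column="CORRECTED_DATA",
                          extra_args=()):
    """Build a wsclean argument list that images several MSs jointly."""
    if not ms_paths:
        raise ValueError("at least one MS is required")
    # wsclean expects the options first and the MS paths last.
    return ["wsclean", "-name", name, "-data-column", data_column,
            *extra_args, *ms_paths]


# Hypothetical inputs; pass the list to subprocess.run(cmd, check=True) to execute.
cmd = wsclean_joint_command(["epoch1.ms", "epoch2.ms"], name="combined_image_test")
print(shlex.join(cmd))
```

Setting `-data-column` explicitly makes it easy to confirm that CORRECTED_DATA, rather than DATA, is the column being imaged.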

paoloserra commented 2 months ago

I agree that the artefact in the left panel is too bright to be buried under the noise in the middle and right panels.

It almost looks like the left panel shows an image from DATA rather than CORRECTED. That's really weird.

kendak333 commented 2 months ago

Definitely imaging CORRECTED_DATA (went and double checked!) - we're seeing this kind of behaviour quite a bit when we image multiple epochs together, which is why we're trying to go the selfcal-after-combining route.

paoloserra commented 2 months ago

Really puzzling. But since this is important, we could definitely add a time concat option in CARACal if you are interested.