caracal-pipeline / caracal

Containerized Automated Radio Astronomy Calibration (CARACal) pipeline
GNU General Public License v2.0

Change in resolution with Wsclean when activating sofia masking #1468

Closed AnnalisaB closed 1 year ago

AnnalisaB commented 1 year ago

Hi, I'm doing selfcal on MeerKAT L-band data. I tried activating SoFiA automasking and noticed that image_0 and image_1 have different resolutions, although the imaging parameters are the same. I also noticed that image_0 is not using the multi-scale clean that is activated in the CARACal parset.

image_0 has BMAJ = 0.00422229004531048 deg
BMIN = 0.001909209278464 deg

while image_1 has a resolution of BMAJ = 0.0031890872723294 deg
BMIN = 0.0017589251768535 deg
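For easier comparison, the beam axes quoted above can be converted from degrees to arcseconds. This is only a minimal sketch of the arithmetic; the variable and function names are illustrative and not part of CARACal or wsclean.

```python
# Beam axes exactly as quoted above, in degrees.
beams_deg = {
    "image_0": (0.00422229004531048, 0.001909209278464),
    "image_1": (0.0031890872723294, 0.0017589251768535),
}

ARCSEC_PER_DEG = 3600.0


def to_arcsec(value_deg):
    """Convert an angle from degrees to arcseconds."""
    return value_deg * ARCSEC_PER_DEG


# (BMAJ, BMIN) per image, now in arcseconds.
beams_arcsec = {
    name: (to_arcsec(bmaj), to_arcsec(bmin))
    for name, (bmaj, bmin) in beams_deg.items()
}

for name, (bmaj, bmin) in beams_arcsec.items():
    print(f"{name}: BMAJ = {bmaj:.2f} arcsec, BMIN = {bmin:.2f} arcsec")
```

This gives roughly 15.20 x 6.87 arcsec for image_0 and 11.48 x 6.33 arcsec for image_1, so the major axis differs by about 3.7 arcsec, i.e. about three pixels at the 1.2 arcsec pixel scale used here.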

Does anyone know why this happens?

Here is the wsclean command used for image_0:

wsclean -name image_0 -j 64 -mem 100 -absmem 100.0 -weight briggs -0.5 -no-mfs-weighting -taper-gaussian 6 -size 8000 8000 -scale 1.2asec -channels-out 20 -nwlayers-factor 3 -pol I -data-column DATA -niter 1000000 -auto-threshold 0.5 -auto-mask 6 -gain 0.07 -mgain 0.8 -join-channels -fit-spectral-pol 3 -padding 1.3 /stimela_mount/msdir/1640320872-A2034-corr_avg_pcal.MS

and here is the one for image_1, i.e. after masking:

wsclean -name /image_1 -j 64 -mem 100 -absmem 100.0 -weight briggs -0.5 -taper-gaussian 6 -size 8000 8000 -scale 1.2asec -channels-out 20 -nwlayers-factor 3 -pol I -data-column CORRECTED_DATA -niter 10000 -auto-threshold 0.5 -gain 0.07 -mgain 0.8 -join-channels -multiscale -multiscale-scales 0,2,4,8,16 -fits-mask /stimela_mount/output/masking/image_0_clean_mask.fits -fit-spectral-pol 3 -padding 1.3 -save-source-list /stimela_mount/msdir/1640320872-A2034-corr_avg_pcal.MS

Thanks in advance, Annalisa
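The two commands above are long enough that differences are easy to miss by eye. Below is a minimal sketch of a flag-by-flag comparison. The helper names are hypothetical, and the parser makes two assumptions: the last token of each command is the measurement set, and a token starting with "-" is a flag unless it parses as a number (so the Briggs robustness -0.5 is treated as a value). The image_1 command is entered with -fit-spectral-pol written as a single flag.

```python
import shlex

CMD_IMAGE_0 = (
    "wsclean -name image_0 -j 64 -mem 100 -absmem 100.0 -weight briggs -0.5 "
    "-no-mfs-weighting -taper-gaussian 6 -size 8000 8000 -scale 1.2asec "
    "-channels-out 20 -nwlayers-factor 3 -pol I -data-column DATA "
    "-niter 1000000 -auto-threshold 0.5 -auto-mask 6 -gain 0.07 -mgain 0.8 "
    "-join-channels -fit-spectral-pol 3 -padding 1.3 "
    "/stimela_mount/msdir/1640320872-A2034-corr_avg_pcal.MS"
)
CMD_IMAGE_1 = (
    "wsclean -name /image_1 -j 64 -mem 100 -absmem 100.0 -weight briggs -0.5 "
    "-taper-gaussian 6 -size 8000 8000 -scale 1.2asec -channels-out 20 "
    "-nwlayers-factor 3 -pol I -data-column CORRECTED_DATA -niter 10000 "
    "-auto-threshold 0.5 -gain 0.07 -mgain 0.8 -join-channels -multiscale "
    "-multiscale-scales 0,2,4,8,16 "
    "-fits-mask /stimela_mount/output/masking/image_0_clean_mask.fits "
    "-fit-spectral-pol 3 -padding 1.3 -save-source-list "
    "/stimela_mount/msdir/1640320872-A2034-corr_avg_pcal.MS"
)


def is_number(token):
    """True for tokens such as '-0.5' that are values, not flags."""
    try:
        float(token)
        return True
    except ValueError:
        return False


def parse_flags(command):
    """Map each flag to its list of values.

    Assumption: the first token is the program name and the last token
    is the measurement set, so both are dropped.
    """
    tokens = shlex.split(command)[1:-1]
    flags = {}
    current = None
    for token in tokens:
        if token.startswith("-") and not is_number(token):
            current = token
            flags[current] = []
        elif current is not None:
            flags[current].append(token)
    return flags


def diff_flags(first, second):
    """Return (flags only in first, flags only in second, changed values)."""
    only_first = sorted(set(first) - set(second))
    only_second = sorted(set(second) - set(first))
    changed = {
        flag: (first[flag], second[flag])
        for flag in sorted(set(first) & set(second))
        if first[flag] != second[flag]
    }
    return only_first, only_second, changed


only_0, only_1, changed = diff_flags(
    parse_flags(CMD_IMAGE_0), parse_flags(CMD_IMAGE_1)
)
print("only in image_0:", only_0)
print("only in image_1:", only_1)
for flag, (value_0, value_1) in changed.items():
    print(f"{flag}: {value_0} -> {value_1}")
```

Run on these two commands, the comparison shows -no-mfs-weighting and -auto-mask only in image_0, the multi-scale, mask and source-list flags only in image_1, and different values for -name, -data-column and -niter.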

paoloserra commented 1 year ago

Hi Annalisa. The 0th iteration has a few hardcoded settings. The only point of this iteration is to create a reasonable image on which to run SoFiA to make a clean mask. For this reason, some of the user settings applied to subsequent iterations are ignored in the 0th one. Admittedly, we could include a few of them in the 0th iteration, too, like the multi-scale cleaning you mention. However, using multi-scale cleaning should not really affect the restoring beam size.

In the commands you've sent I see the 0th iteration is missing the -no-mfs-weighting option. Could that be the reason for the different resolution? Have you compared the dirty beams?

That said, I'm puzzled. In my CARACal runs -no-mfs-weighting is enabled in the 0th iteration, too, and the resolution of image_0 is the same as that of image_1. I wonder whether there is some wicked bug such that -no-mfs-weighting is ignored in the 0th iteration only when multi-scale cleaning is used.

Could you maybe share the full CARACal log?

KshitijT commented 1 year ago

Along with -no-mfs-weighting, could the difference in flagging between DATA and CORRECTED_DATA cause the beam size to vary?

paoloserra commented 1 year ago

Well spotted, you're right, and that's also a bug. Iteration-1 should use DATA.

I think this one, too, might be a bug specific to the case when multi-scale cleaning is activated. I've looked at my logs (without multi-scale clean) and I see that both 0th and 1st iterations use DATA.

AnnalisaB commented 1 year ago

Thanks for the fast replies

I'm taking the wsclean command from the header of the images. In the CARACal parset I have img_mfs_weighting: true, so that should be on in both iter_0 and iter_1.

Thanks @KshitijT, I hadn't noticed the difference in the data column either; I had forgotten that I already have a CORRECTED_DATA column from a previous selfcal.

Is there a way I can force iter 1 to use DATA? I don't see a way to specify the data column for wsclean in the CARACal manual.

paoloserra commented 1 year ago

> I'm taking the wsclean command from the header of the images. In the CARACal parset I have img_mfs_weighting: true, so that should be on in both iter_0 and iter_1.

OK. Again, I think this might be a bug that only happens when using multi-scale clean. We will look into this asap.

> Thanks @KshitijT, I hadn't noticed the difference in the data column either; I had forgotten that I already have a CORRECTED_DATA column from a previous selfcal.
>
> Is there a way I can force iter 1 to use DATA? I don't see a way to specify the data column for wsclean in the CARACal manual.

See the col parameter in https://caracal.readthedocs.io/en/latest/manual/workers/selfcal/index.html#image

You can set it to a list of values, which gives the column to use for iterations 1, 2, ..., N. Iteration 0 should use the value given for iteration 1. The default list [DATA, CORRECTED_DATA] should have worked for you.
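The intended lookup described above can be sketched as follows. This is not CARACal's code: the function name is hypothetical, and reusing the last entry when the list is shorter than the number of iterations is an assumption made only to keep the example complete.

```python
def column_for_iteration(col, iteration):
    """Pick the data column to image at a given selfcal iteration.

    col is a list whose entries correspond to iterations 1, 2, ..., N.
    Iteration 0 borrows the entry for iteration 1, as described above.
    """
    if not col:
        raise ValueError("col must contain at least one column name")
    index = max(iteration, 1) - 1
    # Assumption (not from the thread): reuse the last entry when the
    # list is shorter than the number of iterations.
    index = min(index, len(col) - 1)
    return col[index]


default_col = ["DATA", "CORRECTED_DATA"]
for iteration in range(3):
    print(iteration, column_for_iteration(default_col, iteration))
```

With the default list, iterations 0 and 1 both image DATA and iteration 2 images CORRECTED_DATA.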

AnnalisaB commented 1 year ago

Thanks again, Paolo!

> > I'm taking the wsclean command from the header of the images. In the CARACal parset I have img_mfs_weighting: true, so that should be on in both iter_0 and iter_1.
>
> OK. Again, I think this might be a bug that only happens when using multi-scale clean. We will look into this asap.

As soon as the selfcal run is done I can also start a run without multi-scale, if that helps pin down the possible bug.

> > Thanks @KshitijT, I hadn't noticed the difference in the data column either; I had forgotten that I already have a CORRECTED_DATA column from a previous selfcal. Is there a way I can force iter 1 to use DATA? I don't see a way to specify the data column for wsclean in the CARACal manual.
>
> See the col parameter in https://caracal.readthedocs.io/en/latest/manual/workers/selfcal/index.html#image
>
> You can set it to a list of values, which gives the column to use for iterations 1, 2, ..., N. Iteration 0 should use the value given for iteration 1. The default list [DATA, CORRECTED_DATA] should have worked for you.

Ah, totally my fault, I was looking under img_ options

Many thanks again, Paolo

paoloserra commented 1 year ago

> As soon as the selfcal run is done I can also start a run without multi-scale, if that helps pin down the possible bug.

That would be really helpful, thanks.

AnnalisaB commented 1 year ago

I've run the tests with multi-scale deactivated, with and without mfs-weighting.

Without multi-scale, the imaged column is DATA in both image_0 and image_1.

The change in resolution, though, was not due to imaging DATA vs CORRECTED_DATA but to the mfs-weighting option. image_0 is made with -no-mfs-weighting even though I have img_mfs_weighting: True in the selfcal worker, so this is one of the hardcoded parameters that Paolo mentioned above.

Perhaps it would be better to have the same settings in image_0 and image_1?

Thanks again for your help

paoloserra commented 1 year ago

> Perhaps it would be better to have the same settings in image_0 and image_1?

Agreed. I will do that.

Fil8 commented 1 year ago

The issue with mfs-weighting is that the default is False (which applies unless you specify img_mfs_weighting: True in the config file).

When you want mfs-weighting you have to set it to True, but as @AnnalisaB spotted, the following hardcoded line forced image_0 not to do mfs-weighting even when it was requested.

https://github.com/caracal-pipeline/caracal/blob/70d83eaac9bfde5303b66b9e7bb875bc5cdeb08f/caracal/workers/selfcal_worker.py#L455

I have deleted that line in an upcoming pull request.
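The effect of such a hardcoded setting can be illustrated with a small sketch. This is not the actual CARACal code: all function names are hypothetical, and the mapping from img_mfs_weighting to the -no-mfs-weighting flag is inferred only from the two wsclean commands posted earlier in this thread.

```python
def weighting_flags(img_mfs_weighting=False):
    """wsclean flags implied by the user's img_mfs_weighting setting.

    Inferred from the commands in this thread: False adds
    -no-mfs-weighting, True adds nothing.
    """
    return [] if img_mfs_weighting else ["-no-mfs-weighting"]


def flags_before_fix(iteration, img_mfs_weighting=False):
    """Hypothetical behaviour with the hardcoded setting for iteration 0."""
    if iteration == 0:
        # The user's setting is ignored for the mask-making image.
        return ["-no-mfs-weighting"]
    return weighting_flags(img_mfs_weighting)


def flags_after_fix(iteration, img_mfs_weighting=False):
    """Hypothetical behaviour once every iteration honours the setting."""
    return weighting_flags(img_mfs_weighting)


for iteration in (0, 1):
    print(
        iteration,
        flags_before_fix(iteration, img_mfs_weighting=True),
        flags_after_fix(iteration, img_mfs_weighting=True),
    )
```

With img_mfs_weighting: True, the first variant gives different weighting flags for iterations 0 and 1, which matches the mismatch seen between image_0 and image_1; the second gives the same flags for both.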

Fil8 commented 1 year ago

> You can set it to a list of values, which gives the column to use for iterations 1, 2, ..., N. Iteration 0 should use the value given for iteration 1. The default list [DATA, CORRECTED_DATA] should have worked for you.

Actually, image_0 will always look for DATA. I think this is logical because it is image_0.

https://github.com/caracal-pipeline/caracal/blob/4098df5635b6946319f6f3cd674cc25fd4ffd531/caracal/workers/selfcal_worker.py#L452

I don't think we should change this.

If you want to start from CORRECTED_DATA, I would suggest setting start_iter: 2, and you will start the selfcal from image 1.

paoloserra commented 1 year ago

I think image_0 should do whatever image_1 does, since there is no self-calibration in between them.

Fil8 commented 1 year ago

This is OK as long as the user remembers that, if the dataset has no CORRECTED_DATA column, wsclean will look for the DATA column. I will fix this in the same pull request.

paoloserra commented 1 year ago

There was also something funny with multiscale cleaning. I'm testing now.

Fil8 commented 1 year ago

That I did not check, let me know :)

paoloserra commented 1 year ago

> This is OK as long as the user remembers that, if the dataset has no CORRECTED_DATA column, wsclean will look for the DATA column.

I think that's fine, since it will happen for both image_0 and image_1.

paoloserra commented 1 year ago

PR #1488 should take care of all the things that were not working well for @AnnalisaB.

@Fil8, I've made a separate PR just in case this is easier to manage (I'm a fan of small PRs), but I leave it to you to decide how to proceed.

Thanks again for making a good start, which made me feel ashamed and look into the remaining issues :)

Fil8 commented 1 year ago

Good, good. I checked, and the PR is in place. I'm closing this issue.