caracal-pipeline / caracal

Containerized Automated Radio Astronomy Calibration (CARACal) pipeline
GNU General Public License v2.0
28 stars 6 forks source link

PyBDSM model vs clean components #31

Closed paoloserra closed 5 years ago

paoloserra commented 7 years ago

Following clean with a clean mask and source finding with PyBDSM we obtain a model which does not coincide with the set of cleaned sources (it may have more objects or less objects -- I've seen both).

At the first iteration, where I like to clean very conservatively, PyBDSM finds several sources that have not been cleaned. These uncleaned sources are detected, parameterised, used for selfcal and subtracted from the data.

I don't feel very comfortable with it.

I would like to restrict the PyBDSM model to cleaned sources. In my mind one should selfcalibrate with (and, if desired, subtract) only sources that have been cleaned.

Any suggestions on how to achieve this within our framework? I looked at the PyBDSM options but couldn't find anything obvious.

SpheMakh commented 7 years ago

PyBDSM takes a detection image. This image is used to determine which regions to fit when making the catalog. We could give the model image as the detection image.

paoloserra commented 7 years ago

There you go, I'll try that

SpheMakh commented 7 years ago

Not sure how using the model image will work for point sources though. In the case of a point of a source, you will have a single pixel, which could be a problem for PyBDSM. Maybe convolve the model image with the clean beam first?

gigjozsa commented 7 years ago

What is worrying about using an uncleaned source if you re-iterate later on? In the first iteration(s) we will pick up point sources mostly (only).

paoloserra commented 7 years ago

@SpheMakh Yes, doesn't work that well. It seems that wsclean and casa clean do not output the model convolved with the Gaussian PSF. Correct?

@gigjozsa Yes I could stop being so anal and simply start all over at the next iteration, i.e., set output-data = CORR_DATA in calibrator

SpheMakh commented 7 years ago

@paoloserra yes they do not produce that image.

KshitijT commented 7 years ago

@paoloserra , one way to ensure that we don't include uncleaned sources in the sky model is keeping the clean threshold (the auto-threshold parameter) and the thresh_pix parameter of pybdsm the same. That and enough number of clean iterations should ensure that our sky model is satisfactory.

paoloserra commented 7 years ago

@KshitijT I've tried something similar but there are two issues.

  1. For the first clean interations I like the clean mask to be defined as a function of image peak, and PyBDSM cannot do that.

  2. Even when making clean masks as a function of rms noise, the formal rms of the initial, dirty image used to make the clean mask is a lot higher than that of the first cleaned+restored image fed to PyBDSM.

I'm trying to do what Josh implied with his comment. That is:

After a few iterations selfcal should converge and it the details of what we do will not matter anymore.

gigjozsa commented 7 years ago

@paoloserra this is what I implied, indeed.

paoloserra commented 7 years ago

Sorry guys, trying this stuff but I'm getting the following error:

flag-ms.py: MS does not contain a BITFLAG column. Please run the addbitflagcol utility on this MS. Successful readonly open of default-locked table /home/kat/msdir/1491463063_cal.ms: 26 columns, 28560 rows running: flag-ms.py --timeslot-multiplier 1 --data-column CORRECTED_DATA --unflag FLAG0 --chunk-size 20000 /home/kat/msdir/1491463063_cal.ms

Related to my recent install of Stimela release-029?

KshitijT commented 7 years ago

@paoloserra , try without the step "unflag_selfcalflags"; that was put there in case the MS being used had flags leftover from previous selfcal rounds, but it is not needed at that particular stage and should be removed from the pipeline.

paoloserra commented 7 years ago

I get a similar error when backing up the initial flags.

flag-ms.py: MS does not contain a BITFLAG column. Please run the addbitflagcol utility on this MS. Successful readonly open of default-locked table /home/kat/msdir/1491463063_cal.ms: 26 columns, 28560 rows running: flag-ms.py --timeslot-multiplier 1 --data-column CORRECTED_DATA --flagged-any legacy+L --flag legacy --chunk-size 20000 /home/kat/msdir/1491463063_cal.ms

Note that I didn't use to get this error before.

SpheMakh commented 7 years ago

The MS is missing a BITFLAG column, running MSPREP should fix this. Actually, none of these FLAGMS steps are needed. I'll commit fixes to the 2GC script tomorrow morning.

paoloserra commented 7 years ago

@SpheMakh , this error didn't use to occur until a few hours ago. I'm very confused. Is the issue with Stimela or with our script? Anyway, for the moment I've commented out unflag_selfcalflags and backup_initial_flags and am able to continue my tests with Stimela release-029.

SpheMakh commented 7 years ago

I'm not sure why it ran before, maybe FLAGMS wasn't throwing the correct exit status. Its hard to say, but its doing the correct thing now. I'm still going through the pipeline, and fixing some inconsistencies. Its taker longer than I thought it would.

paoloserra commented 7 years ago

No rush.

I'm also working on the 2GC part. I'd like to test a slightly different (probably slower but safer) approach. I will wait for you to push your changes.

SpheMakh commented 7 years ago

I have an even more diabolical plan for the 2GC selfcal:

  1. Image data with wsclean. Use channelsout=3 (and make cube), enable fit-spectral-pol. At this step, you want to clean only the bright sources, but clean them deeply. So set auto-mask=10, and ato-threshold=0.5)
  2. Run pybdsm on the cube, and extract a sky model that also has spectral indices (set the thresholds high, to get only the brightest sources)
  3. Calibrate data with sky model from above step
  4. Image corrected residual data from above. Same settings as 1, but clean much deeper. (auto-mask=3, auto-threshold=0.5). Save clean model in MODEL_DATA
  5. Calibrate again using the model from step 2 and clean model in MODEL_DATA.

Motivation: Subtracting point-like sources from data sometimes leaves holes in the image, especially when the compact emission is within extended emission (see images below ). This makes it difficult to characterise the emission in further source finding steps. One could just image the the CORR_DATA instead, but then the sidelobes of the brighter sources will dominate the fainter emission.

pybdsm_sub pybdsm_sub2

I think a parametric model for the bright point-like sources and using clean components for the rest is neat way to go. This of course depends on having a good masking technique, which I think wsclean already has.

paoloserra commented 7 years ago

@SpheMakh , Using PyBDSM source parameters for bright sources and WSClean components for faint sources is a good idea. You would have to show me how to use both together in step 5, but in principle it should work.

However, I still do not like the proposal of imaging the corrected residual at such an early stage. During these first iterations, the rms measured by WSClean is likely different from that measured by PyBDSM, especially when not cleaning too aggressively. Therefore, the set of WSClean-ed sources is not equal to the set of PyBDSM-detected sources, and we'd be subtracting uncleaned sources from the data.

I continue to think that the first few imaging+selfcal iterations should be set up as a function of image peak, not image noise, and that during the first few iterations we should always restart from scratch after selfcal, using the corrected data and not the corrected residuals.

If we have a 200 sigma source in the image (not unlikely), do we really want to make a first clean down to 5% of that source? When starting with a crap gain calibration (which I hope we will at some point, otherwise it means we are waisting time on excessively frequent calibrator scans) we would risk to include spurious sources in the model.

So I still think that, in the first iterations, we should make a clean mask as a function of image peak, then clean, and somehow convince PyBDSM to only detect WSClean-ed sources.

Speaking of which, would it be possible to add in Stimela's WSClean an extra switch that enables writing convmodel.fits = image.fits - residual.fits to disk? Then we could use convmodel.fits as detection_image for PyBDSM.

(I know we could add the calculation of convmodel.fits within our script between a WSClean run and the subsequent PyBDSM run. But then we would have to split the recipe in two parts, one before and one after this calculation -- and this every time we make this calculation. I think that's not ideal.)

SpheMakh commented 7 years ago

I continue to think that the first few imaging+selfcal iterations should be set up as a function of image peak, not image noise, and that during the first few iterations we should always restart from scratch after selfcal, using the corrected data and not the corrected residuals.

If we have a 200 sigma source in the image (not unlikely), do we really want to make a first clean down to 5% of that source? When starting with a crap gain calibration (which I hope we will at some point, otherwise it means we are waisting time on excessively frequent calibrator scans) we would risk to include spurious sources in the model.


Agreed

SpheMakh commented 7 years ago

@SpheMakh , Using PyBDSM source parameters for bright sources and WSClean components for faint sources is a good idea. You would have to show me how to use both together in step 5, but in principle it should work.


To include the clean components (assuming they are saved in the MODEL_DATA column), you just set "add-vis-model" to True in CALIBRATOR

paoloserra commented 7 years ago

excellent, thanks!

SpheMakh commented 7 years ago

Speaking of which, would it be possible to add in Stimela's WSClean an extra switch that enables writing convmodel.fits = image.fits - residual.fits to disk? Then we could use convmodel.fits as detection_image for PyBDSM.


I think its better to use the FITSTOOL cab to get the difference. Here is an example,

pipeline.add('cab/fitstool', 'diff', {
"image" : ['stimela-example_robust-2-{:s}.fits:output'.format(a) for a in ['image', 'residual']], "diff" : True, "output": "convolved-model.fits" }, input=INPUT, output=OUTPUT, label='diff:: Diff')

SpheMakh commented 7 years ago

On why the BITFLAG column was missing: The PREPMS step wasn't being added to the recipe. It should be added below this line

gigjozsa commented 7 years ago

Concerning peak detection values. If the absolute is not possible, it might be possible to use a silly trick: calculate the rms and divide the maximum by it and use this as a threshold?

paoloserra commented 7 years ago

@SpheMakh , I'm following your advice on how to create a convolved-model FITS. However, fitstool crashes as follows:

INFO:STIMELA:Instantiating container [diff_00-139773772129200149910427101]. The container ID is printed below. running: docker create --user 1000 -v /scratch2/paolo/meerkathi/meerkathi-venv/local/lib/python2.7/site-packages/stimela:/utils:ro -v /scratch2/paolo/meerkathi/pipeline/stimela_parameter_files:/configs:ro -v /etc/group:/etc/group:ro -v /etc/passwd:/etc/passwd:ro -v /etc/shadow:/etc/shadow:ro -v /etc/sudoers.d:/etc/sudoers.d:ro -v /scratch2/paolo/meerkathi/pipeline/msdir:/home/kat/msdir:rw -v /scratch2/paolo/meerkathi/pipeline/input:/input:ro -v /scratch2/paolo/meerkathi/pipeline/output:/home/kat/output:rw -e CONFIG=/configs/diff_00-139773772129200149910427101.json -e MSDIR=/home/kat/msdir -e INPUT=/input -e OUTPUT=/home/kat/output -e LOGFILE=/home/kat/output/log-diff_00.txt --name diff_00-139773772129200149910427101 --shm-size 1gb kat_cab/fitstool 4a0a8f673c369d06f964be9a9d8a37286c2c4a757406750a7a9413fbb142aa1a running: docker start -a diff_00-139773772129200149910427101 Traceback (most recent call last): File "/usr/local/bin/fitstool.py", line 31, in main() File "/usr/local/lib/python2.7/dist-packages/Owlcat/FitsTool.py", line 265, in main images = [ pyfits.open(img) for img in imagenames ]; File "/usr/lib/python2.7/dist-packages/pyfits/hdu/hdulist.py", line 124, in fitsopen return HDUList.fromfile(name, mode, memmap, save_backup, kwargs) File "/usr/lib/python2.7/dist-packages/pyfits/hdu/hdulist.py", line 266, in fromfile save_backup=save_backup, kwargs) File "/usr/lib/python2.7/dist-packages/pyfits/hdu/hdulist.py", line 786, in _readfrom ffo = _File(fileobj, mode=mode, memmap=memmap) File "/usr/lib/python2.7/dist-packages/pyfits/file.py", line 132, in init self._open_filename(fileobj, mode, clobber) File "/usr/lib/python2.7/dist-packages/pyfits/file.py", line 487, in _open_filename self._file = fileobj_open(self.name, PYFITS_MODES[mode]) File "/usr/lib/python2.7/dist-packages/pyfits/util.py", line 754, in fileobj_open return open(filename, mode) IOError: [Errno 2] No such file or directory: '/home/kat/output/meerkathi-comb-03jul_cont_00_01-convmodel.fits' Traceback (most recent call last): File "/code/run.py", line 59, in utils.xrun("fitstool.py", args+[inimage, outimage or ""]) File "/utils/utils/init.py", line 72, in xrun running: fitstool.py --diff /home/kat/output/meerkathi-comb-03jul_cont_00_01-image.fits /home/kat/output/meerkathi-comb-03jul_cont_00_01-residual.fits /home/kat/output/meerkathi-comb-03jul_cont_00_01-convmodel.fits raise SystemError('%s: returns errr code %d'%(command, process.returncode)) SystemError: fitstool.py: returns errr code 1 2017-07-03 19:51:34,129 - STIMELA - INFO - Recipe execution failed while running job diff_00

Note that I was running fitstool with the following dictionary as argument:

{'diff': True, 'image': ['meerkathi-comb-03jul_cont_00_01-image.fits:output', 'meerkathi-comb-03jul_cont_00_01-residual.fits:output'], 'output': 'meerkathi-comb-03jul_cont_00_01-convmodel.fits'}

but somehow the requested output FITS is interpreted as an input FITS, which of course does not exist, hence the error.

SpheMakh commented 7 years ago

I think this might be a similar problem to before. I made a new release of stimela, which includes many of the fixes that came up during these reductions. Lets switch to it:

pip install --upgrade --force-reinstall stimela
stimela pull
stimela clean -ac # clear out old CAB images
stimela build --no-cache

Then re-run the pipeline

paoloserra commented 7 years ago

OK that got fixed.

Next problem:

I now have two ways to make a detection image (option 1: convolved model, option 2: clean mask). Regardless of which one is better, I can use the detection image to force PyBDSM to find and model cleaned sources only. As we discussed, this is what we'd like to do for bright sources.

Unfortunately, PyBDSM seems to always want to calculate the rms of the detection image (don't ask me why) and finds rms = zero (possibly on subregions) because the detection image is >90% zeros.

I've tried to avoid this problem by setting the rms box size to the size of the image, switching off the calculation of the rms image (i.e., constant rms across the image), or setting the rms value manually. Unfortunately, nothing seemed to work. The error is attached below.

The only solution I've found is to replace all zeros in the detection image with nan's. However, I cannot find a fitstool option which would allow me to do this replacement within the Stimela recipe. Any suggestions?

ERROR

--> Determining islands from detection image --> Opened '/home/kat/output/meerkathi-comb-04jul_cont_00_01-convmodel.fits' Image size .............................. : (1024, 1024) pixels Number of channels ...................... : 1 Number of Stokes parameters ............. : 1 Beam shape (major, minor, pos angle) .... : (2.44345e-02, 1.42127e-02, 174.5) degrees Frequency of image ...................... : 1409.822 MHz Number of blank pixels .................. : 0 (0.0%) ERROR: A region with an unphysical rms value has been found. Please check the input image. Traceback (most recent call last): File "/code/run.py", line 53, in img = bdsm.process_image(image, img_opts) File "/usr/lib/python2.7/dist-packages/lofar/bdsm/init.py", line 243, in process_image img.process(kwargs) File "/usr/lib/python2.7/dist-packages/lofar/bdsm/image.py", line 126, in process success = interface.process(self, **kwargs) File "/usr/lib/python2.7/dist-packages/lofar/bdsm/interface.py", line 53, in process _run_op_list(img, op_chain) File "/usr/lib/python2.7/dist-packages/lofar/bdsm/init.py", line 149, in _run_op_list op(img) File "/usr/lib/python2.7/dist-packages/lofar/bdsm/islands.py", line 74, in call success = _run_op_list(det_img, det_chain) File "/usr/lib/python2.7/dist-packages/lofar/bdsm/init.py", line 149, in _run_op_list op(img) File "/usr/lib/python2.7/dist-packages/lofar/bdsm/preprocess.py", line 66, in call mean, rms, cmean, crms, cnt = bstat(image, mask, kappa) File "/usr/lib/python2.7/dist-packages/lofar/bdsm/functions.py", line 2222, in bstat raise RuntimeError("A region with an unphysical rms value has been found. " RuntimeError: A region with an unphysical rms value has been found. Please check the input image. 2017-07-04 11:47:34,448 - STIMELA - INFO - Recipe execution failed while running job sky_model_00

SpheMakh commented 7 years ago

set 'blank_limit' to a very low value in PYBDSM (say 1e-9) and the error should go away.

paoloserra commented 7 years ago

Great, thanks!

Any thoughts about whether to use the clean mask or the convolved model as detection_image? As far as I understand this should be irrelevant as source parameters are measured from the input image anyway.

KshitijT commented 7 years ago

If the main consideration is to not include any un-cleaned emission in the source model, then we should stick to the convolved model than the clean mask?

paoloserra commented 7 years ago

Sure, with one complication though. The convolved model has lots of non-zero pixels extending way outside the clean mask (just as a consequence of convolution). Therefore, if we use the convolved model we also have to choose where to clip it in order to form a detection_image. (See Sphe's comment on blank_limit above.)

The mask wouldn't have this issue because it's just 0's and 1's.

I've seen that clipping the convolved_model at some very low value, e.g., 1e-6 Jy/beam, can include too many pixels in the detection_image (and, occasionally, artefacts near a bright cleaned source). So I'd like to explore the use of clean masks as input to PyBDSM instead.

I've tried this already. The detection_image needs to be the sum of all clean masks used in that particular image+selfcal iteration. As far as I can see, fitstool doesn't allow one to add images. I can take the mean though, that should work too.

Unfortunately, when I do that I get the PyBDSM error below. Consider that, in this particular case, the mask size is 1300x1300 and, therefore, there are 174 non-blank pixels left in the detection_image. It should have been OK. However, it seems that either npix or npixbeam is = 0.

Any clue?

--> Blanking pixels with values below 1.0e-03 Jy/beam Total number of blanked pixels .......... : 1689826 (100.0%) Traceback (most recent call last): File "/code/run.py", line 53, in img = bdsm.process_image(image, img_opts) File "/usr/lib/python2.7/dist-packages/lofar/bdsm/init.py", line 243, in process_image img.process(kwargs) File "/usr/lib/python2.7/dist-packages/lofar/bdsm/image.py", line 126, in process success = interface.process(self, *kwargs) File "/usr/lib/python2.7/dist-packages/lofar/bdsm/interface.py", line 53, in process _run_op_list(img, op_chain) File "/usr/lib/python2.7/dist-packages/lofar/bdsm/init.py", line 149, in _run_op_list op(img) File "/usr/lib/python2.7/dist-packages/lofar/bdsm/islands.py", line 74, in call success = _run_op_list(det_img, det_chain) File "/usr/lib/python2.7/dist-packages/lofar/bdsm/init.py", line 149, in _run_op_list op(img) File "/usr/lib/python2.7/dist-packages/lofar/bdsm/preprocess.py", line 66, in call mean, rms, cmean, crms, cnt = bstat(image, mask, kappa) File "/usr/lib/python2.7/dist-packages/lofar/bdsm/functions.py", line 2199, in bstat kappa = numpy.sqrt(2.0)erfcinv(1.0 / (2.0*npix/npixbeam)) ZeroDivisionError: float division by zero

SpheMakh commented 7 years ago

I've seen the same problem when using the convolved model as a detection image. But in terms finding regions to look for sources, it should be equivalent to the mask. The mask seems the better option

SpheMakh commented 7 years ago

With regards to the PYBDSM error: Is the blank_limit applied to the detection image, or the input image?

paoloserra commented 7 years ago

PyBDSM error is on the detection_image:

--> Determining islands from detection image --> Opened '/home/kat/output/meerkathi-comb-05jul_cont_00_01-mask.fits' Image size .............................. : (1300, 1300) pixels Number of channels ...................... : 1 Number of Stokes parameters ............. : 1 Beam shape (major, minor, pos angle) .... : (0.00000e+00, 0.00000e+00, 0.0) degrees Frequency of image ...................... : 1409.822 MHz Number of blank pixels .................. : 0 (0.0%) --> Blanking pixels with values below 1.0e-03 Jy/beam Total number of blanked pixels .......... : 1689826 (100.0%) Traceback (most recent call last): File "/code/run.py", line 53, in img = bdsm.process_image(image, **img_opts)

etc etc

SpheMakh commented 7 years ago

If the detection image is the mean of the of the binary masks from the self-cal loop, then if any of the masks making up this mean mask worked in a previous step, the mean mask should work here. Because any unmasked pixel from any of component mask should not have a value < 1/N_masks, which (assuming a small N_mask) shouldn't be bellow the blank_limit of 1e-3.

SpheMakh commented 7 years ago

The beam shape is zero!

Beam shape (major, minor, pos angle) .... : (0.00000e+00, 0.00000e+00, 0.0)
paoloserra commented 7 years ago

That works, but we're now to the next issue.

--> Determining islands from detection image --> Opened '/home/kat/output/meerkathi-comb-06jul_cont_00_01-mask.fits' Image size .............................. : (1300, 1300) pixels Number of channels ...................... : 1 Number of Stokes parameters ............. : 1 Beam shape (major, minor, pos angle) .... : (2.45375e-02, 1.43596e-02, 175.0) degrees Frequency of image ...................... : 1409.822 MHz Number of blank pixels .................. : 0 (0.0%) --> Blanking pixels with values below 9.0e-01 Jy/beam Total number of blanked pixels .......... : 1689826 (100.0%) ERROR: A region with an unphysical rms value has been found. Please check the input image.

If this sounds familiar it's because I'd already reported a similar error above, https://github.com/SpheMakh/meerkathi/issues/31#issuecomment-312840668 . Here the almost 100% 0-valued pixels are being masked. I suspect the issue is that all remaining pixels have value = 1, and therefore the rms is 0. Annoying.

Could somebody explain to me why PyBDSM calculates the rms of the detection_image in the first place?

KshitijT commented 7 years ago

@paoloserra , possibly because Pybdsm still needs to identify islands of emission in the detection image. For this run, is rms value constant or is it required to be calculated?

SpheMakh commented 7 years ago

ahh, you are right @KshitijT. Better to give an RMS value instead of a blank limit

SpheMakh commented 7 years ago

But then calculating the RMS value becomes another issue :(

paoloserra commented 7 years ago

The issue is not that rms is constant, it is that rms is 0.

But yes, I've thought of simply setting the rms to some non-zero value. Unfortunately, those settings seem to be applied to the input image (including the blanking; do we want that?) but not to the detection_image. See below.

It seems to me that the solution would be to have a detection_image which is blank outside the clean mask, and = input image inside the clean mask. Can we do that with fitstool? Maybe divide the input image by the clean mask would do? (= NaN where we divide by zero, = image where we divide by 1)

--> Opened '/home/kat/output/meerkathi-comb-06jul_cont_00_01-image.fits' Image size .............................. : (1300, 1300) pixels Number of channels ...................... : 1 Number of Stokes parameters ............. : 1 Beam shape (major, minor, pos angle) .... : (2.45375e-02, 1.43596e-02, 175.0) degrees Frequency of image ...................... : 1409.822 MHz Number of blank pixels .................. : 0 (0.0%) --> Blanking pixels with values below 9.0e-01 Jy/beam Total number of blanked pixels .......... : 1689955 (100.0%) Flux from sum of (non-blank) pixels ..... : 0.198 Jy --> Calculating background rms and mean images /usr/lib/python2.7/dist-packages/lofar/bdsm/rmsimage.py:124: RuntimeWarning: invalid value encountered in greater_equal act_pixels = (image-cmean)/threshold >= crms /usr/lib/python2.7/dist-packages/lofar/bdsm/rmsimage.py:431: RuntimeWarning: invalid value encountered in less if N.any(rms < 0.0): Derived rms_box (box size, step size) ... : (131, 44) pixels --> Using constant background rms --> Variation in mean image not significant --> Using constant background mean Value of background rms ................. : 1.00000 Jy/beam Value of background mean ................ : 1.00593 Jy/beam --> Expected 5-sigma-clipped false detection rate < fdr_ratio --> Using sigma-clipping ('hard') thresholding Minimum number of pixels per island ..... : 68

--> Determining islands from detection image --> Opened '/home/kat/output/meerkathi-comb-06jul_cont_00_01-mask.fits' Image size .............................. : (1300, 1300) pixels Number of channels ...................... : 1 Number of Stokes parameters ............. : 1 Beam shape (major, minor, pos angle) .... : (2.45375e-02, 1.43596e-02, 175.0) degrees Frequency of image ...................... : 1409.822 MHz Number of blank pixels .................. : 0 (0.0%) --> Blanking pixels with values below 9.0e-01 Jy/beam Total number of blanked pixels .......... : 1689826 (100.0%) ERROR: A region with an unphysical rms value has been found. Please check the input image.

SpheMakh commented 7 years ago

Yes, I can add a divide and multiply options in FITSTOOL. Then we can just divide the input image by the mask, and avoid using the detection image and blank limit.

paoloserra commented 7 years ago

I was actually planning to use detection_image = input_image / clean_mask . Source parameters still need to be measured from the unmasked input image otherwise the model will be wrong.

paoloserra commented 7 years ago

@paoloserra , possibly because Pybdsm still needs to identify islands of emission in the detection image.

But then wouldn't it be better to use the input image rms? The rms from the detection_image will hardly ever have any relation to the actual image noise

KshitijT commented 7 years ago

But then wouldn't it be better to use the input image rms? The rms from the detection_image will hardly ever have any relation to the actual image noise

From what I understand the idea behind detection images is that they are images of the sky where we may reliably identify regions where there are sources present - we may actually want to do source detection in images where the rms might be worse. An example would be trying to find sources in a spectral cube by using the MFS image as the detection image (with better snr). We could make a compromise here by using the rms from the input image as the constant rms value for the detection image?

paoloserra commented 7 years ago

I understand. I thought PyBDSM would assume that all non-blank pixels in detection_image had been detected. I think I took the nake detection_image too literally.

Note, however, that we don't know the rms value of the input image when we start running PyBDSM -- unless we break the recipe in two parts, one before rms calculation and one after that (the latter would include PyBDSM).

paoloserra commented 5 years ago

I think we are now generally happy with how the pipeline cleans and selfcalibrates. Let's close it now and reopen if necessary.