Model to component rendering and recovery

landmanbester commented 7 months ago

Just writing this up for posterity. It is possible to degrade the model when fiddling with optimization parameters or if something goes wrong and the deconvolution algorithm starts diverging. For this reason pfb keeps track of the model corresponding to the MFS residual with the minimum RMS and saves it as MODEL_BEST in the dds. Thus you should always be able to reset to the point the reduction which had the minimum RMS (I am open to other criteria that we may want to track). To do so, render the model to components using the model2comps worker eg.

$ pfb model2comps -o out/test -ldir out/logs --model-name MODEL_BEST --nbasisf 8

This will take the current best model and save it in a dataset which in this case would be called out/test_I_main_model_best.mds based in the default naming conventions (and I am just realizing that the default naming convention is a bit inflexible for this task and will provide an option to give an explicit output name). When --nbasisf is set to the number of bands the interpolation will be exact (i.e. no smoothing). The resulting model can be transferred during the grid step by setting

--transfer-model-from out/test_I_main_model_best.mds

It is also the expected model format of the fastim worker and there is an experimental branch of QuartiCal which can degrid this model on the fly. @IanHeywood hoping this will allow you to recover from the catastrophic divergence you experienced. We could conceive of doing this sort of backup more routinely (at every major iteration maybe?)

IanHeywood commented 7 months ago

This is brilliant. I was thinking of model2comps dumps at various stages in 'pipeline' mode so we can wind back if things go wrong, but I think this covers it. Can you foresee a scenario where something would go wrong and it would result in a very low RMS, thus storing a bogus model?

IanHeywood commented 7 months ago

The resulting model can be transferred during the grid step

Since the next thing on my to-do list is to provide increased --l2reweight-dof during the gridding stage and re-run spotless, do you recommend I also do this when restoring the model?

landmanbester commented 7 months ago

This is brilliant. I was thinking of model2comps dumps at various stages in 'pipeline' mode so we can wind back if things go wrong, but I think this covers it. Can you foresee a scenario where something would go wrong and it would result in a very low RMS, thus storing a bogus model?

If you set your thresholds (eg. rmsfactor or the stopping threshold for clean) too low this is exactly what would happen. This is why there is the --min-val option for the model2comps worker which will discard components below a certain value in the MFS model. This is not that well tested but its pretty simple so will hopefully just work

landmanbester commented 7 months ago

Since the next thing on my to-do list is to provide increased --l2reweight-dof during the gridding stage and re-run spotless, do you recommend I also do this when restoring the model?

This should make for prettier images but it may also result in flux suppression. Note that sensible values for --l2reweight-dof should be within 2 to 10 where 2 results in more aggressive down-weighting of outliers compared to a 10. I have successfully used a value of 5 in the past but this might require some experimentation. Note there is no harm in running the grid worker multiple times with different values.

Another reason for rendering the model to components before making restored images is to smooth the model a bit by making --nbasisf smaller than the number of bands. This way you can also interpolate your cubes to a higher resolution without redoing any deconvolution. In theory model subtraction should be better. The max resolution that you will be able to interpolate to in this way is set by the --channels-per-image parameter of the init worker. Apologies if this is a bit much to take in all at once

landmanbester commented 7 months ago

Just found a bug here viz. the fit will fall over with a LinAlgError if there are completely flagged bands and --nbasisf is set to nband. For now I'm just throwing an error and advising to decrease the number of basis functions leaving the user to figure out how many completely flagged bands there are. I should probably detect this and do something smarter

IanHeywood commented 7 months ago

I'm probably exposing my lack of understanding of something quite fundamental here, but I'm a bit confused about --nbasisf. The help string says this parameter is a polynomial order, but I'm struggling to see how lowering it will impart more spectral smoothness. I have 8 sub-bands and the default is 4, so if I lowered it to 2 wouldn't that result in a less faithful representation of the model?

If I've understood correctly then because I have set -cpi 64 in the init worker then the best I can do is 1024 / 64 = 16, i.e. a factor 2 improvement in the spectral resolution of the model over the -nband 8 sub-bands that I have been using so far. Is that correct?

As an aside, I'm assuming --nbasist is the time equivalent but the help string there also mentions frequency (and doesn't list the default).

landmanbester commented 7 months ago

I'm probably exposing my lack of understanding of something quite fundamental here, but I'm a bit confused about --nbasisf. The help string says this parameter is a polynomial order, but I'm struggling to see how lowering it will impart more spectral smoothness. I have 8 sub-bands and the default is 4, so if I lowered it to 2 wouldn't that result in a less faithful representation of the model?

I meant you will have to reduce it to less than nband (which results in perfect interpolation) if you want some spectral smoothing. If one of the bands are completely flagged then you will get a Singular Matrix error if you set --nbasisf to the number of bands and in that case you would need to reduce it. I've found that 4 is usually a good number.

f I've understood correctly then because I have set -cpi 64 in the init worker then the best I can do is 1024 / 64 = 16, i.e. a factor 2 improvement in the spectral resolution of the model over the -nband 8 sub-bands that I have been using so far. Is that correct?

Exactly. You can rerun the init worker with a smaller cpi without affecting anything else if you need to go finer. Overwriting is safe since it produces a different dataset than the grid worker.

As an aside, I'm assuming --nbasist is the time equivalent but the help string there also mentions frequency (and doesn't list the default).

Nice catch, I'll update the docstring. It defaults to the number of output times. The thinking here is that you may want to smooth/interpolate in time (eg. when imaging and subtracting the sun). This is a highly experimental and untested feature

IanHeywood commented 7 months ago

I don't think this is going to plan. 😅

INFO      15:58:15 - SPOTLESS           | Number of pixels per beam estimated as 20.52375148591553                                                  |
INFO      15:59:19 - SPOTLESS           | Iter 0: peak residual = 4.353e+36, rms = 1.190e+36                                                        |
INFO      15:59:19 - SPOTLESS           | Solving for model                                                                                         |
INFO      20:34:56 - PD                 | Max iters reached. eps = 1.048e-02                                                                        |
INFO      20:35:07 - SPOTLESS           | Getting residual                                                                                          |
INFO      20:43:02 - SPOTLESS           | Iter 1: peak residual = 1.500e+37, rms = 2.785e+36, eps = 1.000e+00                                       |
INFO      20:43:02 - SPOTLESS           | Computing L1 weights                                                                                      |
INFO      20:45:31 - SPOTLESS           | Updating results                                                                                          |
INFO      20:49:24 - SPOTLESS           | Solving for model                                                                                         |
INFO      00:46:31 - PD                 | Max iters reached. eps = 2.318e-02                                                                        |
INFO      00:46:43 - SPOTLESS           | Getting residual                                                                                          |
INFO      00:54:09 - SPOTLESS           | Iter 2: peak residual = 2.683e+38, rms = 5.043e+37, eps = 9.786e-01                                       |
INFO      00:54:09 - SPOTLESS           | Computing L1 weights                                                                                      |
INFO      00:56:39 - SPOTLESS           | Updating results                                                                                          |
INFO      00:58:12 - SPOTLESS           | Solving for model                                                                                         |
INFO      04:53:05 - PD                 | Max iters reached. eps = 2.738e-02                                                                        |
/mnt/home/ianh/venv/pfb/lib/python3.10/site-packages/numpy/core/_asarray.py:126: RuntimeWarning: overflow encountered in cast                       |
  arr = array(a, dtype=dtype, order=order, copy=False, subok=subok)                                                                                 |
INFO      04:53:17 - SPOTLESS           | Getting residual                                                                                          |
/mnt/home/ianh/venv/pfb/lib/python3.10/site-packages/numpy/core/_asarray.py:126: RuntimeWarning: overflow encountered in cast                       |
  arr = array(a, dtype=dtype, order=order, copy=False, subok=subok)                                                                                 |
INFO      05:00:20 - SPOTLESS           | Iter 3: peak residual = 4.454e+40, rms = 1.056e+40, eps = 9.980e-01                                       |
INFO      05:00:20 - SPOTLESS           | Computing L1 weights                                                                                      |
INFO      05:03:57 - SPOTLESS           | Updating results                                                                                          |
INFO      05:07:47 - SPOTLESS           | Solving for model

Data has blown up from the off following restoration of best model and resumption of spotless with the following.

pfb model2comps -o pfb_SGRA -ldir pfb_logs --model-name MODEL_BEST 

pfb grid -o pfb_SGRA -ntd 8 -nvt 4 -fov 3 --l2reweight-dof 5 -srf 2.5 -rob -1.0 -ldir pfb_logs --nband 8 --transfer-model-from pfb_SGRA_I_main_model_best.mds

pfb spotless -o pfb_SGRA -nb 8 -ntd 8 -nvt 4 -p I --rmsfactor 0.55 -niter 5 --l1reweight-from 1 -ldir pfb_logs

I'm guessing the l2reweight is the culprit.

landmanbester commented 7 months ago

I'm guessing the l2reweight is the culprit.

I doubt that. It's weird that it starts out with a peak residual of 4e36. The MFS model should get written out by the grid worker. Can please check if that looks sensible? I think you may also need to pass the --residual flag to the grid worker. I'm going to change that to be the default if a model is passed in

ratt-ru / pfb-imaging

Model to component rendering and recovery #95