Allow modeling of interferometric images

cschreib commented 7 years ago

Interferometric images (in particular I am thinking of images produced by ALMA or JVLA) have a peculiar point spread function, with a peak of unity and sum of zero. imfit always re-normalizes the input PSF prior to the fit, dividing each pixel by the sum of all pixels of the PSF, which obviously causes the fit to fail when an interferometric PSF is provided in input. In this situation the PSF should not be re-normalized, instead it should be used as provided.

Could you add an option to support this? Right now I have simply commented out the re-normalization in convolver.cpp, which allows the fit to proceed.

NB: Another particularity of these images is that the noise is purely Gaussian, but it is also strongly correlated between neighboring pixels. In practice this means the uncertainties tend to be underestimated if this correlation is ignored. I don't know what is the best way to solve this from the point of view of the likelihood, but this is another issue.

perwin commented 7 years ago

Interesting! Not something I knew...

Sure, adding an option for that should be no problem, and I can schedule that for the next release (1.5, probably in a month or two). I can also let you know when the Github repo has an update allowing that, if you'd be interesting in testing it... (Maybe in about a week, since it seems like it should be simple to implement and the pressure will be less later this week, after the HST phase 2 proposals are due.)

perwin commented 7 years ago

As for the correlated noise -- yeah, that's a tricky one. In principle you can have it in optical/IR images once you start interpolating or regridding them, so it's not just a problem for interferometric images. Unfortunately, I don't know of any easy way to address it...

cschreib commented 7 years ago

Thanks for the reply!

Regarding correlated noise. I have looked into it some time ago but never ended up implementing anything. Apparently it could be quite simple to implement in a likelihood function (or chi2), at least for the case of Gaussian noise. See this PDF (bottom of page 6 onwards). For uncorrelated Gaussian noise we have:

logL ~ sum_{i} (p_i - m_i)^2/e_i^2

where p_i is the pixel value, e_i is the associated uncertainty and m_i is the value predicted by the model. When noise is correlated, the more general form is:

logL ~ sum_{i,j} (p_i - m_i) C_{i,j} (p_j - m_j)

where C_{i,j} is the inverse of the covariance matrix S. This matrix can be written as S = E K E where E_{i,i} = e_i and zero if i != j, and K_{i,j} is the correlation matrix. If the correlation can be described as the convolution with a kernel (which I think is very often the case), then K_{i,j} = k(x_i - x_j, y_i - y_j), where k is the convolution kernel and x_i and y_i are the image coordinates of the pixel i.

For uncorrelated noise, K_{i,i} = 1 and zero if i != j, so C_{i,i} = 1/e_i^2 and zero for i != j, and you recover the initial formulation of the likelihood for uncorrelated noise.

Given that S = E K E, then C = S^{-1} = E^{-1} K^{-1} E^{-1}, so we can rewrite the likelihood as:

logL ~ sum_{i,j} [(p_i - m_i)/e_i] R_{i,j} [(p_j - m_j)/e_j]

where R = K^{-1}.

The problem I envision is that inverting K may not be trivial. In the general case where one is fitting a NxM image with a known correlation kernel, this would require inverting a (NxM)^2 matrix!

In practice there may be a way out since: a) K takes it's values in the convolution kernel, so the value of the matrix between two points i and j only depends on x_i - x_j and y_i - y_j (i.e., the covariance matrix is invariant through translations), and b) the correlation only applies to groups of neighboring pixels, so one can assume (or derive from the kernel) that the correlation is zero beyond a threshold distance. These two properties should make it easier to derive the values of R_{i,j}, but so far I haven't found how...

perwin commented 7 years ago

I figured it would involve some kind of covariance matrix, but it's nice to see it explicitly spelled out like that.

I suspect the issue might be even worse for a lot of cases, given that I doubt the correlation kernel is explicitly available for things like the output of SWarp. (I've seen some general discussion of correlated noise in terms of the Drizzle package, but I confess that I don't follow most of it, and it seems to be oriented towards the idea of aperture measurements...)

Still, I'd be happy to hear about it if you make progress...

perwin commented 7 years ago

As for the coding issue of turning off PSF normalization in Imfit: I've got a preliminary implementation in the current version of the code, and I'd very much like to know if it does what you want (and if you have any suggestions for improvement). You can retrieve the current code from Github and compile it yourself; alternately, I could compile a version (Linux? macOS?) and put it on a temporary folder on my web site for you to download.

(The basic idea is that there is now a "--no-normalize" flag which will turn off PSF normalization, both for the main PSF image and for any oversampled PSFs that might also be supplied.)

cschreib commented 7 years ago

I'll give a try to the new option tomorrow; don't worry generating a binary, I can compile the latest github version.

If the correlation kernel is not known, it can be estimated from the data itself if a sufficient portion of the image contains just sky (non-optimized pseudo-code):

img = load_image("image.fits") // the image
msk = load_image("mask.fits") // a mask (0: source, 1: sky)

mean = 0
mean_cnt = 0
for (x1=0 to img.nx-1 and for y1=0 to img.ny-1) {
    if (msk[x1,y1] == 0) continue

    mean += img[x1,y1]
    mean_cnt += 1
}

mean /= mean_cnt

nk = 10
kernel = image(2*nk+1, 2*nk+1)
kernel_cnt = image(2*nk+1, 2*nk+1)
for (x1=0 to img.nx-1 and for y1=0 to img.ny-1) {
    if (msk[x1,y1] == 0) continue

    for x2=0 to img.nx-1 and for y2=0 to img.ny-1 {
        if (msk[x2,y2] == 0) continue
        if (x1 - x2 < -nk or x1 - x2 > nk or y1 - y2 < -nk or y1 - y2 > kn) continue
        kernel[x1-x2+nk,y1-y2+nk] += (img[x1,y1]-mean)*(img[x2,y2]-mean)
        kernel_cnt[x1-x2+nk,y1-y2+nk] += 1
    }
}

kernel /= kernel_cnt

This effectively measures the covariance between pixels. To get the kernel, you simply divide by the variance of the image.

perwin commented 7 years ago

Interesting... I guess the idea is that you assume this applies to the source-dominated areas just as much to the sky-dominated areas?

(It occurs to me that "inverting the kernel" would in principle be something done just once for a given image, and so the computational overhead wouldn't really affect the fitting, or MCMC, processes.)

perwin commented 7 years ago

Any progress on testing the revised code with your PSFs?

perwin commented 7 years ago

The code is mostly ready for a new (version 1.5) release, so I was hoping to find out if the PSF-normalization option worked for you...

cschreib commented 7 years ago

Sorry, I got dragged into other business... I've just compiled the master branch and tested your --no-normalize option: it works perfectly. The results are identical to what I obtained before I tweaked the source code to disable the normalization manually. Thanks!

NB: This option can be useful also in other contexts than interferometry. For example I have seen a number of cases where a PSF was not normalized to unit integral because it was truncated after some radius. To get correct flux measurements, the integral should only be equal to the fraction of the flux of the PSF enclosed within this radius, hence the PSF should not be automatically renormalized.

perwin commented 7 years ago

Great! I'll go ahead and close this issue (though if you have any more thoughts on how to implement correlated errors, feel free to open another issue devoted to that).

perwin / imfit

Allow modeling of interferometric images #3