Spectrum initialization incorrect for strongly masked data

pmelchior commented 3 years ago

As reported by #245, non-sensical spectra arise from fitting data with strong masking, i.e. when sources are initialized in regions where one or more bands are largely unavailable. The counter-intuitive behavior is that the amplitude is very high in the masked bands, and essentially zero in the unmasked bands.

Two aspects contribute to this problem:

The initialization with the set_spectra_to_match method solves a linear problem that becomes ill-defined it a morphology vector has no (or almost no) overlap with the region of non-zero weights. If convolutions are in the forward path, this statement applies to the convolved morphology. In both cases, this yields a numerically unstable matrix inverse whenever the bulk of the source is in the masked region of a given band. The solver finds the spectrum amplitude for all components in each band, which means that a failure of one component in a band ruins the amplitude estimate for most/all components in that band (but other bands are unaffected). This is likely the cause for the counter-intuitive behavior. aa4ccd2 fixes this problem by only inverting the matrix for the sub-block of components that have sufficient non-zero product of morphology*weight.
The gradients used during the fit are quite volatile, which can lead to unexpected trajectories in parameter space. Here's an example of a ExtendedSource fit to data of a white source where the lower half of the reddest of 5 bands has weights=0:

https://user-images.githubusercontent.com/1463403/124305843-a13ee200-db33-11eb-9b37-9ca9067b6169.mp4

After the initialization (which underestimates the amplitude in the reddest band and thus comes out in cyan), there's a strong red overcompensation. That's a really bad fit, so the model gets suppressed globally. When it returns it avoids the 5-band region in the top of the image. Because it has lower flux there, it cranks up the red amplitude to get the right intensity in that band (the bottom region will not object because it doesn't care about the red amplitude). In this noise-free data the problem eventually gets corrected but it takes a large number of iterations because there's no strong incentive in terms of the log-likelihood of the fit. This problem can only really be avoided by introducing more information to the model in terms of parameter constraints/priors. For instance, fitting the same data with a PointSource (or probably any other parameterized model) has no trouble getting the correct flat (=white) spectrum. Priors in the spectrum shape would likely help as well.

The fits in #245 appear to use point-source models, so the fix for problem 1 above should be all that is needed. I write down problem 2 because it's pretty counter-intuitive, but not a bug in the code itself. It's a problem for the optimizer to find the correct values when any change in the non-parametric morphology yields a different mix of masked pixels in various bands, and any change of the spectrum will lead to changes in the morphology...

For completeness, the other two commit fix corner cases that I have triggered during testing the stuff above.

pmelchior commented 3 years ago

The solver runs independently in each band.

fred3m commented 3 years ago

Yup, you're right. Not sure what I was thinking. 🤷

pmelchior / scarlet

Spectrum initialization incorrect for strongly masked data #247