pmelchior / scarlet

hyperspectral galaxy modeling and deblending
MIT License

Select bounding box for each peak #14

Closed: pmelchior closed this issue 6 years ago

pmelchior commented 7 years ago

Currently the model of each component covers all pixels in the image. That's overkill for the evaluation of the likelihood gradients. Instead one could work with a subset of pixels, loosely based on previously detected footprints.

One can think of a few ways of doing that:

1) Each component exists only in its box, so constraints only have to check that box (we'd have to insist that the models are zero outside of the bounding box, otherwise each component can do whatever it pleases outside of its own box). For all procedures that require the actual model, e.g. likelihood gradients, the components need to be translated accordingly. That means that S is internally always represented by a vector (not a matrix), which we already partially implement in the dot products of block-diagonal constraint matrices, and the generation of the model is more involved. The downside is that we need to build the constraint matrices separately for each component (or work with fixed box-size increments to reduce that number). It seems to me that this will increase initialization time but reduce runtime.

2) Use a sparse matrix form for S that allows us to quickly update a set of predefined elements (for every component: those inside of its box; the rest is zero) as long as we don't change the structure of the matrix. The models would automatically have the right format, and the changes to the code would be minimal. There is probably a slightly larger runtime cost. This approach seems fairly straightforward to me, but I can't seem to find a way to do this within scipy.sparse.
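For what it's worth, scipy.sparse does allow option 2 as long as the sparsity pattern is frozen: a CSR matrix stores its nonzeros contiguously in `.data`, so the predefined elements can be overwritten in place without touching the structure. A minimal sketch (toy boxes and values, not scarlet code):

```python
# Sketch: fix the sparsity pattern of S once, then update only stored values.
import numpy as np
import scipy.sparse as sp

# Toy S: 2 components x 25 pixels (5x5 image), nonzero only inside each box.
n_pix = 25
boxes = {0: [0, 1, 5, 6], 1: [18, 19, 23, 24]}  # flattened pixel indices
rows, cols, vals = [], [], []
for k, pix in boxes.items():
    for p in pix:
        rows.append(k); cols.append(p); vals.append(1.0)

S = sp.csr_matrix((vals, (rows, cols)), shape=(2, n_pix))

# As long as the structure never changes, the nonzeros sit row by row in
# S.data, so an update inside the boxes is a plain array assignment
# (e.g. the result of a gradient step, here just a placeholder ramp):
S.data[:] = np.arange(S.nnz, dtype=float)
```

The only caveat is that one must never assign to an element outside the predefined pattern, since that would trigger a (slow) structure change.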

In both cases, we'd automatically prevent the degeneracy between two remote objects having the same color.

fred3m commented 7 years ago

I have an idea about how to implement number 1, outlined in this ipython notebook, which uses a real HSC image to illustrate how this might work. It doesn't actually solve for the solution, but shows how one might go about choosing the bounding boxes for each object.

The basic idea is that the SED in neighboring pixels changes very little, and in fact differs very little from the SED at the peak except in blended regions. So we can mask out the image except for the region that encloses a peak, and then allow that region to grow by some reasonable factor to ensure that all of the flux is enclosed. This works well for the complicated blend I tested it on, but more testing needs to be done to validate the method. For example, some of the objects are given significantly larger regions because nearby sources have similar colors, but the important thing is that (at least in this example) none of the objects appear to be missing any flux. It would be interesting to see how this method works on spiral galaxies that have multiple colors.
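A minimal sketch of that idea, with all names, thresholds, and the growth factor being illustrative choices rather than anything from the notebook: flag pixels whose normalized SED is close (in cosine similarity) to the SED at the peak, then take the bounding box of that region, grown by a safety factor.

```python
import numpy as np

def bounding_box(cube, peak, sim_thresh=0.95, grow=1.5):
    """Box around `peak` (y, x) in a (bands, H, W) cube, based on SED similarity."""
    bands, H, W = cube.shape
    cy, cx = peak
    ref = cube[:, cy, cx].astype(float)
    ref = ref / np.linalg.norm(ref)
    # cosine similarity of every pixel's SED to the peak SED
    norms = np.linalg.norm(cube, axis=0)
    with np.errstate(invalid="ignore", divide="ignore"):
        cos = np.tensordot(ref, cube, axes=(0, 0)) / norms
    mask = np.nan_to_num(cos) > sim_thresh
    ys, xs = np.nonzero(mask)
    # half-widths from the most distant similar pixel, grown by a safety factor
    hy = max(np.abs(ys - cy).max(), 1) * grow
    hx = max(np.abs(xs - cx).max(), 1) * grow
    return (max(int(cy - hy), 0), min(int(cy + hy) + 1, H),
            max(int(cx - hx), 0), min(int(cx + hx) + 1, W))
```

This captures the "similar SED, then grow" logic; it does not reproduce the notebook's handling of blended regions or flux checks.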

pmelchior commented 7 years ago

That looks quite good already. I don't fully understand the logic that decides the size/shape of the bounding box. However, this is not really what this ticket is about. Even if we had bounding boxes, we wouldn't have a mechanism to use them.

pmelchior commented 7 years ago

Here's a different thought that should solve all of the problems we've had with separable objects: run bSDMM with an extended set of variables [S1, S2, ..., SK, A], where S_i is the spatial part of the i-th object. This way the objects can be treated independently (in whatever boxes we want to have) and convergence can be checked for each of them separately. This last part should be very useful for diagnostic purposes.
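The per-variable convergence check could look like the following sketch (the call into proxmin.bsdmm itself is omitted here since its signature isn't shown in this thread; `e_rel` is an assumed tolerance parameter):

```python
import numpy as np

def converged(X_old, X_new, e_rel=1e-3):
    """Relative convergence check for a single optimization variable."""
    return np.linalg.norm(X_new - X_old) <= e_rel * np.linalg.norm(X_new)

# With variables [S1, ..., SK, A], each one gets its own flag:
# flags = [converged(old, new) for old, new in zip(vars_old, vars_new)]
```

Having one flag per S_i is what makes the diagnostics possible: a source that fails to converge can be identified individually instead of stalling one global criterion.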

This requires a few changes: 1) the deblender calls proxmin.bsdmm directly; 2) a new class for building the model and calculating likelihood gradients; 3) the initialization of constraints for S needs to be split per component.

The second item will be the most important. There are several redundant computations in the likelihood gradients. By pulling those together into a class, we can store the results of previous calculations and control when updates to internal variables (such as the fully assembled S matrix, or the A matrix) are done. It would also be the interface for the user to inspect and render the deblender outputs.
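A hedged sketch of that caching pattern, using the proposed class name Blend but otherwise illustrative method names (and dropping the convolution operator P for brevity):

```python
import numpy as np

class Blend:
    """Holds A and S, rebuilds the model only when one of them changes."""

    def __init__(self, A, S):
        self._A, self._S = A, S
        self._model = None  # cached A @ S

    def set_A(self, A):
        self._A, self._model = A, None  # invalidate cache

    def set_S(self, S):
        self._S, self._model = S, None  # invalidate cache

    def model(self):
        if self._model is None:  # rebuild only when stale
            self._model = self._A @ self._S
        return self._model

    def grad_A(self, Y):
        # gradient of 0.5 * ||Y - A S||^2 with respect to A,
        # reusing the cached model instead of recomputing it
        return -(Y - self.model()) @ self._S.T
```

The point of the design is that every consumer of the model (gradients, Lipschitz estimates, rendering for the user) goes through `model()`, so redundant rebuilds disappear automatically.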

So, the action of the deblend method would be to take a list/tree of peaks (with constraints, i.e. a PeakCatalog) and turn it into a list/tree of children with flags, plus a mixing matrix. I think we can alter the currently half-useful Deblend class to incorporate the necessary functions. I would, however, rename the class to Blend because it is the model of the blended scene, and that is what a call to deblend should return.

Lastly, we already have a class Step_AS that acts in a similar way by storing A and S to get quicker estimates of the Lipschitz constant. We should fold that class into Blend.

pmelchior commented 7 years ago

Another piece of logic for item 2), and how we deal with the boxes:

The question is: which of the elements do we need to store/update for the A update, `A <- A - 1/L_A (Y - APS)(PS)^T`? Note that the convolved sources appear in the last term. Formally, every occurrence of S needs to be replaced with the updated S', so we'd have to compute the model twice per iteration. I think that the differences in gradient direction (with and without the update of S) will be rather modest, at least after a few iterations. Given that the convolution operation to build the entire model is expensive, this suggests that we keep either PS or the residual R fixed for the updates of S and A.

The simplest thing would thus be to use the old S matrix for the A update. This is a deviation from the coordinate descent method, but we'd only have to build the model APS once per iteration.
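As a sketch of that simplification (dropping the convolution P for brevity, and treating `L_A` as a given step-size/Lipschitz estimate): the gradient step for A reuses the model built with the old S, so the expensive model assembly happens once per iteration and can be shared with the S update.

```python
import numpy as np

def update_A(A, S_old, Y, L_A):
    """One gradient step for A, reusing the model built with the old S."""
    model = A @ S_old                  # built once; also reusable by the S update
    grad = -(Y - model) @ S_old.T      # gradient of 0.5 * ||Y - A S||^2 w.r.t. A
    return A - grad / L_A
```

If the data are already perfectly fit (Y = AS), the gradient vanishes and A is left unchanged, which is the sanity check one would expect from a descent step.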