jeff-regier / Celeste.jl

Scalable inference for a generative model of astronomical images
MIT License
183 stars 28 forks source link

Active pixels revamp #724

Closed kbarbary closed 6 years ago

kbarbary commented 6 years ago

Currently in Celeste there are three different, but unequal, indicators of the "region of influence" of an object (Here "region of influence" means the pixels that the object's light contributes to enough for us to care):

On top of this, the second measure (actual active pixel) is used in simulation and optimization, but the default method of determining active pixels is potentially biased and often does not select enough pixels. This introduces the need for a minimum radius of active pixels, further complicating things (see #688).

This PR simplifies the situation, essentially making the first measure (the size of the SkyPatch box) the One True Indicator. With this,

The creation of patches has been moved to an earlier stage of processing, alongside segmentation (object detection). This allows for an object's detected pixels to be used to determine the patch size, which is natural because we want the patch to contain the region of pixels above a noise threshold. Because the region is chosen before the creation of a catalog, SkyPatches are now constructed based on a pixel region rather than a catalog entry (e.g., SkyPatch(img, (1:10, 1:20)) rather than SkyPatch(img, ce).

The logic for choosing the box size goes as follows:

You can see this in code here. Currently, settings like the dilation factor and minimum box size are hard coded, but I plan to make them configurable in a future PR.

The resulting boxes look like this (same image as in #688): screenshot from 2017-11-30 21-54-05

Another example: screenshot from 2017-11-30 21-55-50

Finally, note that rectangles in pixel space are likely sub-optimal shapes for this purpose (the elliptical galaxy in the picture above being a good example). Our objects are often elliptical, so ellipses would be better. Fortunately, we can in the future change SkyPatch to represent an ellipse in pixel space rather than a rectangle without changing any of this logic or architecture. Details such as determining whether patches overlap or determining the patch from detected pixels would change, but everything else would stay the same.

Closes #688

kbarbary commented 6 years ago

It looks like upgrading DataFrames from 0.10 to 0.11 breaks Celeste on Travis (I have 0.10 locally but Travis has 0.11). The cause is that DataFrames 0.10 requires DataArrays 0.5 0.7 which provides the DataVector type. DataFrames 0.11 no longer requires DataArrays, and the DataVector type is no longer exported by DataFrames 0.11.

We should probably keep up-to-date by requiring DataFrames 0.11 and updating Celeste to use whatever the new column type is rather than DataVector. I will try to do that in this PR.

See https://discourse.julialang.org/t/dataframes-0-11-released/7296

jeff-regier commented 6 years ago

Fantastic! I really like that you identified and consolidated all three indicators of the region of influence. I've been concerned about disparities between the neighbors and the active pixels.

Getting rid of active pixels makes sense---a rectangle seems flexible enough to me. I suppose we could improve the runtime by ~50% if we switched to ellipses rather than rectangles, but that doesn't seem like a high priority for now.

kbarbary commented 6 years ago

that doesn't seem like a high priority for now.

:+1: Agreed.

Note that you'll have to upgrade to DataFrames 0.11 locally now. I had to remove Gadfly in order to get Pkg to upgrade DataFrames.