jeff-regier / Celeste.jl

Scalable inference for a generative model of astronomical images
MIT License

RawPSF with bicubic interpolation for stars #708

Closed jeff-regier closed 6 years ago

jeff-regier commented 6 years ago

This PR brings the percent of stars misclassified as galaxies (missed_stars) down from ~31% to ~8% -- right in line with Photo. Apparently our Mixture of Gaussians PSF really was very bad. It's still in use for galaxies, but I can get rid of that in later work.

From master:

│ Row │ N   │ first     │ second    │ diff       │ diff_sd    │ field                         │
├─────┼─────┼───────────┼───────────┼────────────┼────────────┼───────────────────────────────┤
│ 1   │ 481 │ 0.0       │ 0.0       │ 0.0        │ 0.0        │ is_saturated                  │
│ 2   │ 128 │ 0.078125  │ 0.3125    │ -0.234375  │ 0.0405416  │ missed_stars                  │
│ 3   │ 353 │ 0.0396601 │ 0.0396601 │ 0.0        │ 0.00965877 │ missed_galaxies               │
│ 4   │ 481 │ 0.267953  │ 0.26437   │ 0.00358313 │ 0.00623836 │ position                      │
│ 5   │ 481 │ 0.179437  │ 0.177197  │ 0.00224041 │ 0.0100144  │ reference_band_flux_mag       │
│ 6   │ 481 │ 1.11384   │ 1.72663   │ -0.612792  │ 0.338765   │ reference_band_flux_nmgy      │
│ 7   │ 105 │ 16.9819   │ 14.979    │ 2.00284    │ 1.44225    │ angle_deg                     │
│ 8   │ 211 │ 0.260781  │ 0.187767  │ 0.0730141  │ 0.0210781  │ de_vaucouleurs_mixture_weight │
│ 9   │ 211 │ 0.199889  │ 0.144968  │ 0.0549212  │ 0.0103371  │ minor_major_axis_ratio        │
│ 10  │ 211 │ 1.29034   │ 0.614865  │ 0.675474   │ 0.331621   │ half_light_radius_px          │
│ 11  │ 370 │ 1.03077   │ 0.56432   │ 0.466447   │ 0.0499466  │ color_log_ratio_ug            │
│ 12  │ 472 │ 0.338688  │ 0.17356   │ 0.165128   │ 0.0203928  │ color_log_ratio_gr            │
│ 13  │ 480 │ 0.201699  │ 0.118033  │ 0.0836664  │ 0.00985499 │ color_log_ratio_ri            │
│ 14  │ 476 │ 0.387063  │ 0.180376  │ 0.206687   │ 0.0221786  │ color_log_ratio_iz            │

From jcr/interpolations (this PR):

14×6 DataFrames.DataFrame
│ Row │ N   │ first     │ second    │ diff        │ diff_sd    │ field                         │
├─────┼─────┼───────────┼───────────┼─────────────┼────────────┼───────────────────────────────┤
│ 1   │ 482 │ 0.0       │ 0.0       │ 0.0         │ 0.0        │ is_saturated                  │
│ 2   │ 129 │ 0.0775194 │ 0.0775194 │ 0.0         │ 0.0213178  │ missed_stars                  │
│ 3   │ 353 │ 0.0396601 │ 0.0509915 │ -0.0113314  │ 0.010402   │ missed_galaxies               │
│ 4   │ 482 │ 0.267632  │ 0.265964  │ 0.00166788  │ 0.00634124 │ position                      │
│ 5   │ 482 │ 0.179185  │ 0.187796  │ -0.00861089 │ 0.013311   │ reference_band_flux_mag       │
│ 6   │ 482 │ 1.13222   │ 2.14602   │ -1.01381    │ 0.483521   │ reference_band_flux_nmgy      │
│ 7   │ 105 │ 16.9819   │ 15.5799   │ 1.40192     │ 1.31291    │ angle_deg                     │
│ 8   │ 211 │ 0.260781  │ 0.187022  │ 0.0737584   │ 0.0215299  │ de_vaucouleurs_mixture_weight │
│ 9   │ 211 │ 0.199889  │ 0.145787  │ 0.0541026   │ 0.0103492  │ minor_major_axis_ratio        │
│ 10  │ 211 │ 1.29034   │ 0.612578  │ 0.677761    │ 0.331698   │ half_light_radius_px          │
│ 11  │ 371 │ 1.0283    │ 0.575659  │ 0.452643    │ 0.0498698  │ color_log_ratio_ug            │
│ 12  │ 473 │ 0.338041  │ 0.175434  │ 0.162607    │ 0.0204756  │ color_log_ratio_gr            │
│ 13  │ 481 │ 0.201293  │ 0.126392  │ 0.0749017   │ 0.0124566  │ color_log_ratio_ri            │
│ 14  │ 477 │ 0.386288  │ 0.189743  │ 0.196545    │ 0.0235967  │ color_log_ratio_iz            │
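
As a quick sanity check on the headline number, the missed_stars rows can be converted back to star counts. The sketch below is in Python rather than Julia, uses only values copied from the two tables, and assumes (the tables don't say so explicitly) that each rate is a simple fraction of the N stars in that row.

```python
# Values copied from the missed_stars rows of the two tables above.
rows = {
    "master": {"N": 128, "first": 0.078125, "second": 0.3125},
    "jcr/interpolations": {"N": 129, "first": 0.0775194, "second": 0.0775194},
}

def implied_count(n, rate):
    # Assumption: the rate is (number of missed stars) / N.
    return round(n * rate)

counts = {
    branch: (implied_count(r["N"], r["first"]), implied_count(r["N"], r["second"]))
    for branch, r in rows.items()
}

for branch, (first, second) in counts.items():
    print(f"{branch}: first={first}, second={second} of {rows[branch]['N']} stars")
```

Under that assumption, the "second" column drops from 40 of 128 stars on master to 10 of 129 on this branch, the same count as the "first" column.
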
kbarbary commented 6 years ago

The PSFEx manual might be helpful here: http://psfex.readthedocs.io/en/latest/Working.html They use a Lanczos4 interpolation kernel.

jeff-regier commented 6 years ago

Unfortunately, I think Lanczos4 isn't twice differentiable, a condition our optimization procedure requires. At |x| = 4, the edge of the kernel's support, it's continuous and differentiable, but there's a kink in the derivative there.

By oversampling the PSF enough, once we're learning the PSF ourselves, we should be able to make the difference between bicubic and Lanczos interpolation negligible.
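
The differentiability claim can be checked numerically. The sketch below (Python, standard library only) assumes the usual Lanczos definition L(x) = sinc(x)·sinc(x/a) for |x| < a and 0 otherwise, with a = 4, and compares one-sided finite-difference estimates of the second derivative on either side of x = 4. The helper names are illustrative and are not taken from Celeste or PSFEx.

```python
import math

def sinc(x):
    # Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1.
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

def lanczos(x, a=4):
    # Assumed standard Lanczos kernel with support (-a, a).
    return sinc(x) * sinc(x / a) if abs(x) < a else 0.0

def one_sided_second_derivative(f, x, h, direction):
    # First-order one-sided estimate of f''(x);
    # direction is -1 (approach from the left) or +1 (from the right).
    return (f(x) - 2.0 * f(x + direction * h) + f(x + 2.0 * direction * h)) / h**2

h = 1e-3
left = one_sided_second_derivative(lanczos, 4.0, h, -1)
right = one_sided_second_derivative(lanczos, 4.0, h, +1)
print(f"L''(4-) ~ {left:.4f}, L''(4+) ~ {right:.4f}")
```

Analytically, the left limit of the second derivative is 2·sinc'(4)·sinc'(1)/4 = -1/8, while the right limit is 0. So the first derivative is continuous at the edge of the support but has a kink there, and the kernel is not twice differentiable at that point.
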

codecov[bot] commented 6 years ago

Codecov Report

Merging #708 into master will increase coverage by 0.11%. The diff coverage is 86.04%.

@@            Coverage Diff             @@
##           master     #708      +/-   ##
==========================================
+ Coverage   80.74%   80.86%   +0.11%     
==========================================
  Files          35       35              
  Lines        3662     3673      +11     
==========================================
+ Hits         2957     2970      +13     
+ Misses        705      703       -2
Impacted Files                           Coverage Δ
src/model/psf_model.jl                   100% <ø> (ø)
src/model/light_source_model.jl          53.57% <ø> (ø)
src/DeterministicVI.jl                   92.3% <ø> (ø)
src/model/fsm_util.jl                    100% <100%> (ø)
src/model/imaged_sources.jl              100% <100%> (ø)
src/AccuracyBenchmark.jl                 81.55% <100%> (+0.55%)
src/deterministic_vi/elbo_objective.jl   99.43% <100%> (ø)
src/model/log_prob.jl                    90.57% <57.14%> (+1.44%)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update cd90450...011f354.

kbarbary commented 6 years ago

I only scanned this PR, but am I right that this uses the SDSS PSF model (RawPSF) evaluated at the center of the image everywhere in the image for stars?

If so, can we do that for galaxies as well? It would simplify segregating the SDSS-specific code (which I'm working on now).

jeff-regier commented 6 years ago

That's basically right: this PR uses the SDSS PSF model (RawPSF) now. It's evaluated for every SkyPatch, though, not for every image. A unique SkyPatch is created for every pair of a light source s and an image n that shows that light source.
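
To make the pairing concrete, here is a minimal Python sketch of "one patch per (light source, image) pair". Everything in it is hypothetical: the `Image` and `Patch` classes, the pixel-bounds visibility test, and the `build_patches` function are illustrative stand-ins, not Celeste's actual SkyPatch type or API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Image:
    # Hypothetical image described only by its pixel extent.
    width: int
    height: int

    def shows(self, pos):
        # Assumption: a source is "shown" if its pixel position is in bounds.
        x, y = pos
        return 0 <= x < self.width and 0 <= y < self.height

@dataclass(frozen=True)
class Patch:
    # Hypothetical stand-in for a per-(source, image) patch.
    source: int    # index s of the light source
    image: int     # index n of the image
    center: tuple  # pixel position of source s in image n

def build_patches(source_positions, images):
    # source_positions[s][n] is the pixel position of source s in image n.
    return {
        (s, n): Patch(s, n, pos_by_image[n])
        for s, pos_by_image in enumerate(source_positions)
        for n, img in enumerate(images)
        if img.shows(pos_by_image[n])
    }

images = [Image(100, 100), Image(50, 50)]
source_positions = [
    [(10.0, 20.0), (10.0, 20.0)],  # source 0 falls inside both images
    [(80.0, 80.0), (80.0, 80.0)],  # source 1 falls outside the 50x50 image
]
patches = build_patches(source_positions, images)
print(sorted(patches))
```

Each source gets its own patch in every image that contains it, so any per-patch quantity, such as a PSF, can differ between images for the same source.
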

Yes, I'd very much like to do something analogous for galaxies, so we can at last delete Transform.jl and PSF.jl.

kbarbary commented 6 years ago

Ah, right. Well, that doesn't help with removing the RawPSF from Image, which is what I'd really like to do. Sorry for distracting from this issue. Let's discuss more on Slack or in a different issue.

jeff-regier commented 6 years ago

@andymiller I updated your log_prob.jl code to use the new star model. It'll be interesting to see how MCMC performs with the changes. The uncertainty scores for VI aren't looking very good even with this PR, so there's still plenty of room for MCMC to improve on VI.