kbarbary commented 6 years ago

This replaces infer_init(), which used an SDSS catalog, with infer_init_new(), which uses SEP to find sources for the initialization catalog.

In the accuracy benchmarks, run_celeste_on_field.jl now takes no arguments and uses source detection by default. The option to initialize from a catalog is still there, with --initialization-catalog <filename>.

master

(first = primary, second = celeste prediction)

│ Row │ N   │ first     │ second    │ diff        │ diff_sd    │ field           │
├─────┼─────┼───────────┼───────────┼─────────────┼────────────┼─────────────────┤
│ 1   │ 128 │ 0.078125  │ 0.09375   │ -0.015625   │ 0.0238138  │ missed_stars    │
│ 2   │ 343 │ 0.0379009 │ 0.0291545 │ 0.00874636  │ 0.00864345 │ missed_galaxies │
│ 3   │ 471 │ 0.264242  │ 0.308173  │ -0.0439314  │ 0.0057081  │ position        │
│ 4   │ 380 │ 0.177765  │ 0.182055  │ -0.00429001 │ 0.0148959  │ flux_r_mag      │
│ 5   │ 380 │ 0.642483  │ 0.824231  │ -0.181748   │ 0.0801018  │ flux_r_nmgy     │
│ 6   │ 99  │ 16.2583   │ 15.685    │ 0.573306    │ 1.44051    │ gal_angle_deg   │
│ 7   │ 203 │ 0.255117  │ 0.221463  │ 0.0336534   │ 0.0222404  │ gal_frac_dev    │
│ 8   │ 203 │ 0.202949  │ 0.165704  │ 0.0372453   │ 0.0108118  │ gal_axis_ratio  │
│ 9   │ 203 │ 1.31983   │ 0.897505  │ 0.422324    │ 0.338132   │ gal_radius_px   │
│ 10  │ 362 │ 1.02056   │ 0.578394  │ 0.442169    │ 0.0510424  │ color_ug        │
│ 11  │ 462 │ 0.332076  │ 0.174087  │ 0.157989    │ 0.0191702  │ color_gr        │
│ 12  │ 470 │ 0.20101   │ 0.120696  │ 0.0803139   │ 0.0103564  │ color_ri        │
│ 13  │ 466 │ 0.379088  │ 0.183435  │ 0.195653    │ 0.0221637  │ color_iz        │

This branch

(first = primary, second = celeste prediction)

(note that I only ran about half the sources due to the long run time of the benchmarks at the moment)

│ Row │ N   │ first     │ second    │ diff        │ diff_sd    │ field           │
├─────┼─────┼───────────┼───────────┼─────────────┼────────────┼─────────────────┤
│ 1   │ 130 │ 0.0769231 │ 0.0846154 │ -0.00769231 │ 0.0223499  │ missed_stars    │
│ 2   │ 348 │ 0.0373563 │ 0.0287356 │ 0.00862069  │ 0.0101801  │ missed_galaxies │
│ 3   │ 478 │ 0.265789  │ 0.270879  │ -0.00509    │ 0.00660327 │ position        │
│ 4   │ 381 │ 0.179465  │ 0.149103  │ 0.0303624   │ 0.0116654  │ flux_r_mag      │
│ 5   │ 381 │ 0.658841  │ 0.610068  │ 0.0487726   │ 0.0395769  │ flux_r_nmgy     │
│ 6   │ 103 │ 16.9512   │ 14.3415   │ 2.60965     │ 1.42288    │ gal_angle_deg   │
│ 7   │ 207 │ 0.261412  │ 0.175869  │ 0.0855428   │ 0.0223267  │ gal_frac_dev    │
│ 8   │ 207 │ 0.197599  │ 0.146603  │ 0.0509962   │ 0.00986572 │ gal_axis_ratio  │
│ 9   │ 207 │ 1.28067   │ 0.650054  │ 0.630618    │ 0.336502   │ gal_radius_px   │
│ 10  │ 366 │ 1.03162   │ 0.592062  │ 0.439559    │ 0.0521393  │ color_ug        │
│ 11  │ 469 │ 0.338734  │ 0.174291  │ 0.164443    │ 0.0205884  │ color_gr        │
│ 12  │ 477 │ 0.192402  │ 0.114165  │ 0.078237    │ 0.00909477 │ color_ri        │
│ 13  │ 473 │ 0.376071  │ 0.177662  │ 0.198409    │ 0.0216543  │ color_iz        │
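
One rough way to read the diff and diff_sd columns in the two tables above is as the ratio diff / diff_sd. The sketch below does this for the position row only. It assumes diff = first - second (which matches the tabulated values) and that diff_sd is the standard deviation of that difference; the second assumption is not stated anywhere in this thread, and `diff_in_sd_units` is a hypothetical helper.

```python
# Rough size of "diff" relative to "diff_sd" for the position row.
# Values are copied from the two Stripe 82 tables above.
# Assumption: diff_sd is the standard deviation of diff (not stated in the thread).

rows = {
    "master": {"first": 0.264242, "second": 0.308173, "diff_sd": 0.0057081},
    "this branch": {"first": 0.265789, "second": 0.270879, "diff_sd": 0.00660327},
}


def diff_in_sd_units(row):
    """Return (first - second) / diff_sd for one table row."""
    return (row["first"] - row["second"]) / row["diff_sd"]


ratios = {name: diff_in_sd_units(row) for name, row in rows.items()}
for name, ratio in ratios.items():
    print(f"{name}: position diff is {ratio:+.2f} diff_sd")
```

On these numbers the position gap between Primary and Celeste is several diff_sd on master and under one diff_sd on this branch.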

Future work: SEP definitely picks up lots of bright and saturated stars that are already masked in the SDSS catalog. We'll want some heuristic to mask these. They should not affect benchmarks though, as only sources in both the coadd and prediction catalogs are included in benchmarks.

Closes #157.

jeff-regier commented 6 years ago

Are you using multiple threads for running the accuracy benchmarks? It still finished for me in < 10 minutes, with 4 threads. It would be helpful to see the scores for all the sources on this branch, partly to make sure the N column remains about the same. It's also hard to compare if master has scores for all sources and this branch has just half.

kbarbary commented 6 years ago

Yeah makes sense about the N column. How do I select number of threads? Environment variable? Looking at CPU utilization, it seems like I'm using 2 out of 4 cores. When I ran on master, it took multiple hours. I'll try again though.

jeff-regier commented 6 years ago

Yep, environment variable:

export JULIA_NUM_THREADS=4

jeff-regier commented 6 years ago

And also set

export OMP_NUM_THREADS=1

since you want all the parallelism to be from the Julia threads, not from OMP threads.

jeff-regier commented 6 years ago

Would you test the accuracy benchmarks with synthetic imagery too, comparing master to this branch? It's more reliable in some sense than the Stripe 82 scores because we have real ground truth for it.

There's not much to it, just two steps, and then it's the same as running the accuracy benchmarks with real data:

To generate the (synthetic) ground truth catalog:

$ ./write_ground_truth_catalog_csv.jl prior

To generate the image:

$ generate_synthetic_field.jl <ground truth CSV>

codecov[bot] commented 6 years ago

Codecov Report

Merging #719 into master will increase coverage by 1.02%. The diff coverage is 100%.

@@            Coverage Diff             @@
##           master     #719      +/-   ##
==========================================
+ Coverage   80.11%   81.13%   +1.02%     
==========================================
  Files          37       37              
  Lines        4013     3987      -26     
==========================================
+ Hits         3215     3235      +20     
+ Misses        798      752      -46

│ Impacted Files           │ Coverage Δ               │
├──────────────────────────┼──────────────────────────┤
│ src/AccuracyBenchmark.jl │ 80.64% <ø> (-0.21%) ↓    │
│ src/GalsimBenchmark.jl   │ 100% <100%> (ø) ↑        │
│ src/ParallelRun.jl       │ 94.09% <100%> (+9.74%) ↑ │
│ src/joint_infer.jl       │ 76.8% <0%> (+3.2%) ↑     │
│ src/mcmc/mcmc_misc.jl    │ 53.88% <0%> (+5.69%) ↑   │

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update b4721f9...4950a96.

kbarbary commented 6 years ago

Stripe 82 above updated with full results.

Synthetic results:

Master

│ Row │ N   │ first     │ field           │
├─────┼─────┼───────────┼─────────────────┤
│ 1   │ 131 │ 0.10687   │ missed_stars    │
│ 2   │ 340 │ 0.0294118 │ missed_galaxies │
│ 3   │ 471 │ 0.163471  │ position        │
│ 4   │ 471 │ 0.18367   │ flux_r_mag      │
│ 5   │ 471 │ 1.33026   │ flux_r_nmgy     │
│ 6   │ 48  │ 8.39943   │ gal_angle_deg   │
│ 7   │ 91  │ 0.191313  │ gal_frac_dev    │
│ 8   │ 91  │ 0.141528  │ gal_axis_ratio  │
│ 9   │ 91  │ 0.71273   │ gal_radius_px   │
│ 10  │ 471 │ 0.432434  │ color_ug        │
│ 11  │ 471 │ 0.169713  │ color_gr        │
│ 12  │ 471 │ 0.128667  │ color_ri        │
│ 13  │ 471 │ 0.198651  │ color_iz        │

This branch

│ Row │ N   │ first      │ field           │
├─────┼─────┼────────────┼─────────────────┤
│ 1   │ 122 │ 0.0491803  │ missed_stars    │
│ 2   │ 301 │ 0.00664452 │ missed_galaxies │
│ 3   │ 423 │ 0.10236    │ position        │
│ 4   │ 423 │ 0.118397   │ flux_r_mag      │
│ 5   │ 423 │ 0.338704   │ flux_r_nmgy     │
│ 6   │ 44  │ 6.27761    │ gal_angle_deg   │
│ 7   │ 79  │ 0.200732   │ gal_frac_dev    │
│ 8   │ 79  │ 0.117555   │ gal_axis_ratio  │
│ 9   │ 79  │ 0.523512   │ gal_radius_px   │
│ 10  │ 423 │ 0.336727   │ color_ug        │
│ 11  │ 423 │ 0.139262   │ color_gr        │
│ 12  │ 423 │ 0.109304   │ color_ri        │
│ 13  │ 423 │ 0.162469   │ color_iz        │

Without looking at the data, my guess here is that with this branch we're missing about 10% of the sources, namely the faintest ones (buried in the noise), and doing better on most benchmarks as a result of their absence in the comparison.
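
The "10%" figure can be checked directly against the N columns of the two synthetic tables above. This is a small sketch of that arithmetic, assuming N counts the sources that entered each score; the values are copied from the position and gal_radius_px rows.

```python
# Relative drop in N between the two synthetic runs.
# N values are copied from the synthetic tables above.
n_master = {"position": 471, "gal_radius_px": 91}
n_branch = {"position": 423, "gal_radius_px": 79}

missing_fraction = {
    field: (n_master[field] - n_branch[field]) / n_master[field]
    for field in n_master
}
for field, fraction in missing_fraction.items():
    print(f"{field}: N is {fraction:.1%} lower on this branch")
```

For the position row this is 48 of 471 sources, or roughly 10%.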

jeff-regier commented 6 years ago

That makes sense. Is there a setting we can change to lower the 10% "false negative" rate as much as we want, if we're willing to tolerate more false positives?

On the real data, how does SEP compare to Primary, in terms of false positives and false negatives? If you run score_predictions.jl twice, once to score the primary predictions and once to score the SEP predictions, is the N column about the same (or better) for SEP?

kbarbary commented 6 years ago

> That makes sense. Is there a setting we can change to lower the 10% "false negative" rate as much as we want, if we're willing to tolerate more false positives?

Yes, it is currently hard-coded to something that seems like a reasonable trade-off, but I plan to make it configurable as part of a larger configuration refactor. It is the 1.3 in this line:

        sep_catalog = SEP.extract(calpixels, 1.3; noise=SEP.global_rms(bkg))
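
To give a feel for the trade-off, here is a minimal sketch, not taken from Celeste or SEP, of how the chance that a single pure-noise pixel exceeds the threshold varies with the threshold value. It assumes the threshold is expressed in units of the noise RMS (which is how the `noise` keyword appears to be used in the line above) and that the noise is Gaussian; `noise_pixel_exceed_prob` is a hypothetical helper. Real detection also requires a group of connected pixels above threshold, so these per-pixel numbers only indicate the direction of the trade-off, not actual false-positive rates.

```python
import math


def noise_pixel_exceed_prob(threshold_sigma):
    """Probability that one Gaussian noise pixel exceeds threshold_sigma * RMS."""
    return 0.5 * math.erfc(threshold_sigma / math.sqrt(2.0))


# Compare the hard-coded value (1.3) with a few alternatives.
for threshold in (1.0, 1.3, 2.0, 3.0):
    p = noise_pixel_exceed_prob(threshold)
    print(f"threshold {threshold:.1f} sigma: per-pixel exceedance {p:.4f}")
```

Lowering the threshold raises this probability, which is the sense in which fewer missed faint sources come at the cost of more spurious detections.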

> On the real data, how does SEP compare to Primary, in terms of false positives and false negatives? If you run score_predictions.jl twice, once to score the primary predictions and once to score the SEP predictions, is the N column about the same (or better) for SEP?

Primary vs coadd

│ Row │ N   │ first     │ field           │
├─────┼─────┼───────────┼─────────────────┤
│ 1   │ 130 │ 0.0769231 │ missed_stars    │
│ 2   │ 357 │ 0.0392157 │ missed_galaxies │
│ 3   │ 487 │ 0.268329  │ position        │
│ 4   │ 487 │ 0.181464  │ flux_r_mag      │
│ 5   │ 487 │ 1.1297    │ flux_r_nmgy     │
│ 6   │ 106 │ 17.1374   │ gal_angle_deg   │
│ 7   │ 213 │ 0.263027  │ gal_frac_dev    │
│ 8   │ 213 │ 0.201823  │ gal_axis_ratio  │
│ 9   │ 213 │ 1.30143   │ gal_radius_px   │
│ 10  │ 375 │ 1.02586   │ color_ug        │
│ 11  │ 478 │ 0.340056  │ color_gr        │
│ 12  │ 486 │ 0.203456  │ color_ri        │
│ 13  │ 482 │ 0.389609  │ color_iz        │

Predictions (this branch) vs coadd

│ Row │ N   │ first     │ field           │
├─────┼─────┼───────────┼─────────────────┤
│ 1   │ 156 │ 0.121795  │ missed_stars    │
│ 2   │ 479 │ 0.0417537 │ missed_galaxies │
│ 3   │ 635 │ 0.292428  │ position        │
│ 4   │ 475 │ 0.170777  │ flux_r_mag      │
│ 5   │ 475 │ 1.68524   │ flux_r_nmgy     │
│ 6   │ 158 │ 17.8022   │ gal_angle_deg   │
│ 7   │ 292 │ 0.2123    │ gal_frac_dev    │
│ 8   │ 292 │ 0.179954  │ gal_axis_ratio  │
│ 9   │ 292 │ 0.832056  │ gal_radius_px   │
│ 10  │ 490 │ 0.693602  │ color_ug        │
│ 11  │ 627 │ 0.222191  │ color_gr        │
│ 12  │ 634 │ 0.163294  │ color_ri        │
│ 13  │ 632 │ 0.224448  │ color_iz        │
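
As a quick side-by-side of the N columns in the two tables above, the sketch below computes the ratio of N on this branch to N for Primary for a few rows. The values are copied from the tables; what N counts exactly is not spelled out in this thread, so the ratio is only a rough indicator.

```python
# N per row, copied from the "Primary vs coadd" and
# "Predictions (this branch) vs coadd" tables above.
n_primary = {"missed_stars": 130, "missed_galaxies": 357, "position": 487, "flux_r_mag": 487}
n_branch = {"missed_stars": 156, "missed_galaxies": 479, "position": 635, "flux_r_mag": 475}

ratio = {field: n_branch[field] / n_primary[field] for field in n_primary}
for field, r in ratio.items():
    print(f"{field}: N ratio (this branch / primary) = {r:.2f}")
```

N is larger on this branch for the missed_stars, missed_galaxies, and position rows, and slightly smaller for flux_r_mag.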

jeff-regier commented 6 years ago

Great!

kbarbary commented 6 years ago

I'll update the wiki page shortly.

jeff-regier / Celeste.jl

initialize via source detection rather than SDSS catalog #719
