legacysurvey / legacypipe

Image reduction pipeline for the DESI Legacy Imaging Surveys, using the Tractor framework
BSD 3-Clause "New" or "Revised" License

documentation on how we assemble the SGA model catalog #531

Closed moustakas closed 1 year ago

moustakas commented 4 years ago

This ticket is documentation, but I wanted others to be able to review it and comment (and to make sure that my thinking is clear!).

In production we will be using the "model" LSLGA catalog-- /global/cfs/cdirs/cosmo/staging/largegalaxies/v6.0/LSLGA-model-v6.0.kd.fits as opposed to the seed (Hyperleda+) catalog-- /global/cfs/cdirs/cosmo/staging/largegalaxies/v6.0/LSLGA-v6.0.kd.fits

Here is how I construct the model catalog:

  1. In parallel, iterate over every preburned galaxy group (each of which may contain one or more seed LSLGA galaxies). For each group, read the ellipse-fit ASDF output files. The ASDF files define the sample of pre-burned galaxies. In other words, if a galaxy fails ellipse-fitting (as defined in #529) it will not have updated geometry.

    There's a wrinkle here, though. Currently, some seed galaxies are rejected because they don't have grz coverage and some are rejected because they're not actual galaxies. We want to keep the former and reject the latter.

  2. If a Tractor catalog does not exist, continue. Currently, a catalog is not generated only if the group lacks grz coverage (e.g., on the edge of the footprint); for these galaxies we will use the original Hyperleda geometry.

  3. Read the Tractor catalog of all sources in a given mosaic. Immediately remove Gaia stars (ref_cat==G2), since these sources are forced in production and we don't want to double-count them.

  4. Initialize every source in the Tractor catalog with the boolean flags preburned=True and freeze=False. Also initialize every source with the non-Tractor columns d25, pa, ba, radius_sb23, radius_sb24, radius_sb25, radius_sb25.5, radius_sb26, mag_[grz]_sb23, mag_[grz]_sb24, mag_[grz]_sb25, mag_[grz]_sb25.5, mag_[grz]_sb26, and mag_[grz]_tot. The radius_* and mag_[grz]_sb* columns are the radius and magnitude at the given surface brightness limit (in mag/arcsec^2), and mag_[grz]_tot is the extrapolated magnitude based on the curve-of-growth analysis. These columns are all initialized with the value -1.0 (which maybe isn't the best choice?).

  5. For each ellipse-fitted LSLGA galaxy in the mosaic, set freeze=True and populate the pa, ba, radius_*, mag_*, and mag_*_tot columns. Set d25 in this priority order: (1) radius_sb25, (2) radius_sb24, and (3) lslga_d25, where lslga_d25 is the original value from Hyperleda (necessary if the surface brightness profile is too shallow to reliably measure the radii based on the ellipse-fitting, although we should look into all these cases!).

    Next, identify all the Tractor sources within the newly-defined elliptical mask and set freeze=True for those sources (meaning no refitting of those sources in production).

  6. At this point we have a choice. Currently, we throw away all sources with freeze!=True, which means that in production Tractor will have to "rediscover" all the sources that are in the pre-burned mosaics but which are not LSLGA sources. We may want to relax this logic for DR9, for example to find smallish galaxies in bright-star masks and to give Tractor a leg-up toward the minimum solution (a "no galaxy left behind" ethos).

  7. Finally, read the full (not group-divided) parent/seed LSLGA catalog; remove the LSLGA galaxies that we pre-burned, so we don't double-count them; remove galaxies that were rejected during the ellipse-fitting (presumably because they are not bona fide galaxies, though we need to check); and then merge with the pre-burned catalog.
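
To make steps 4 to 6 concrete, here is a minimal sketch of the freeze logic. It is not the production code: the function names, the flat-sky approximation, the semi-major axis in degrees, and the position-angle convention (north through east) are all assumptions made for illustration, and RA wrap-around is ignored.

```python
import numpy as np

def in_ellipse(ra, dec, ra0, dec0, semimajor_deg, ba, pa_deg):
    """Return True where (ra, dec) lies inside an ellipse centered on (ra0, dec0).

    Assumptions (not from the pipeline): flat-sky offsets, position angle
    measured from north through east, no handling of RA wrap-around.
    """
    dx = (ra - ra0) * np.cos(np.deg2rad(dec0))  # eastward offset in degrees
    dy = dec - dec0                              # northward offset in degrees
    pa = np.deg2rad(pa_deg)
    # Project the offsets onto the major and minor axes of the ellipse.
    major = dx * np.sin(pa) + dy * np.cos(pa)
    minor = dx * np.cos(pa) - dy * np.sin(pa)
    semiminor_deg = semimajor_deg * ba
    return (major / semimajor_deg) ** 2 + (minor / semiminor_deg) ** 2 <= 1.0

def freeze_sources(ra, dec, galaxies):
    """Step 4: start with freeze=False; step 5: freeze sources inside any mask."""
    freeze = np.zeros(len(ra), dtype=bool)
    for gal in galaxies:
        freeze |= in_ellipse(ra, dec, gal["ra"], gal["dec"],
                             gal["semimajor_deg"], gal["ba"], gal["pa"])
    return freeze

# Toy inputs (made up): one galaxy elongated north-south and three sources.
ra = np.array([10.000, 10.008, 10.003])
dec = np.array([0.008, 0.000, 0.000])
galaxies = [{"ra": 10.0, "dec": 0.0, "semimajor_deg": 0.01, "ba": 0.5, "pa": 0.0}]

freeze = freeze_sources(ra, dec, galaxies)
# Step 6 as currently implemented: keep only the frozen sources.
kept_ra, kept_dec = ra[freeze], dec[freeze]
```

Relaxing step 6 would amount to keeping the sources with freeze=False as well, rather than indexing with the freeze mask.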

moustakas commented 4 years ago

@djschlegel @dstndstn @arjundey

I'd like to propose that we use the R(26) radius when we are able to measure it. So the priority order of the masking radii would be: R(26), then R(25), then R(25) [LSLGA], where the last value is from Hyperleda. Thoughts?

arjundey commented 4 years ago

Hi John

This sounds reasonable, but would it be possible for you to make figs of 3 typical galaxies (one exp, one dev and one weird) with these radii shown? It may be that we need to have some constant or multiplicative factor over one of these radii. For the galaxies in which R26 is measured, is the ratio of R26/R25 more or less constant (with perhaps the constant depending on whether the model is DEV or EXP)?

Arjun

moustakas commented 4 years ago

@arjundey here is the ratio of the R(26) and R(25) radii for the four different galaxy types. The dashed red line is a constant factor of 1.25 as a point of reference (the median ratios are 1.19, 1.32, 1.21, and 1.25 for REX, DEV, EXP, and SER, respectively).
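
For anyone wanting to reproduce this kind of number, a minimal sketch of the per-type median is below. The function name and the toy arrays are made up for illustration (they are not the measured values), and it assumes unmeasured radii carry the -1.0 placeholder described above.

```python
import numpy as np

def median_ratio_by_type(types, r25, r26):
    """Median R(26)/R(25) per morphological type, skipping unmeasured radii."""
    types = np.asarray(types)
    r25 = np.asarray(r25, dtype=float)
    r26 = np.asarray(r26, dtype=float)
    # Assumption: a non-positive radius (e.g. -1.0) means "not measured".
    good = (r25 > 0) & (r26 > 0)
    ratios = {}
    for t in np.unique(types[good]):
        sel = good & (types == t)
        ratios[str(t)] = float(np.median(r26[sel] / r25[sel]))
    return ratios

# Toy inputs (made up, arbitrary units); -1.0 marks a missing measurement.
types = ["EXP", "EXP", "DEV", "DEV", "DEV", "REX"]
r25 = [10.0, 20.0, 10.0, 20.0, -1.0, 8.0]
r26 = [12.0, 26.0, 13.0, 28.0, 30.0, -1.0]

ratios = median_ratio_by_type(types, r25, r26)
```

Types with no galaxy having both radii measured (REX in the toy data) simply do not appear in the result.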

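[Figure: ratio of the R(26) and R(25) radii for the REX, DEV, EXP, and SER galaxy types, with a dashed red line at a constant factor of 1.25. Screenshot: https://user-images.githubusercontent.com/1431820/79898731-bff70400-83d9-11ea-9dfa-83cc3768623f.png]

arjundey commented 4 years ago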

Hi John,

This looks pretty good. If I eyeball the scatter it looks like ~0.1 or less in REX and EXP and perhaps 0.2-0.3 in DEV?

So is your proposal that we use R26 when we can, but if we cannot we would use R25 and multiply by either a constant or a “constant” that depends on profile type?

Thanks for looking into this.

Arjun

moustakas commented 4 years ago

In dr9f and dr9g I used radii in this priority order: (1) R(25)_ellipsefit; (2) R(24)_ellipsefit; (3) R(25)_Hyperleda. (For example, if the profile was too shallow to measure R(25) from my surface-brightness profile then I fell back to R(24); and if I couldn't measure the profile at all, I fell back to the Hyperleda value.)

So I think now I'm proposing to use: (1) R(26)_ellipsefit; (2) R(25)_ellipsefit; (3) R(24)_ellipsefit or R(25)_Hyperleda.
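
As a rough sketch of how either priority order could be applied, the snippet below takes the first measured value per galaxy from an ordered list of columns. The helper name pick_radius, the column name lslga_r25, and the toy values are hypothetical; I have also assumed that tier (3) means R(24)_ellipsefit first and the Hyperleda value last, and that unmeasured radii are stored as -1.0.

```python
import numpy as np

def pick_radius(columns, priority, sentinel=-1.0):
    """For each galaxy, take the first positive value along `priority`.

    columns:  dict mapping column name -> array of radii
    priority: column names, highest priority first
    Returns the chosen radii and the name of the column each came from.
    """
    n = len(columns[priority[0]])
    radius = np.full(n, sentinel, dtype=float)
    source = np.full(n, "", dtype=object)
    unset = np.ones(n, dtype=bool)
    for name in priority:
        values = np.asarray(columns[name], dtype=float)
        use = unset & (values > 0)   # only fill galaxies still lacking a radius
        radius[use] = values[use]
        source[use] = name
        unset &= ~use
    return radius, source

# Toy catalog (made-up values, arbitrary units); -1.0 means "not measured".
columns = {
    "radius_sb26": [50.0, -1.0, -1.0, -1.0],
    "radius_sb25": [40.0, 35.0, -1.0, -1.0],
    "radius_sb24": [30.0, 28.0, 20.0, -1.0],
    "lslga_r25":   [38.0, 33.0, 25.0, 22.0],  # hypothetical Hyperleda column
}

old_order = ["radius_sb25", "radius_sb24", "lslga_r25"]
new_order = ["radius_sb26", "radius_sb25", "radius_sb24", "lslga_r25"]

old_radius, old_source = pick_radius(columns, old_order)
new_radius, new_source = pick_radius(columns, new_order)
```

Keeping the source column alongside the radius records which measurement each mask came from, which makes the heterogeneity easy to audit later.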

I wasn't thinking of using a constant scaling factor (e.g., to go from R(25) to R(26)) in the spirit of only using measurements where we have them. I understand that this choice introduces some heterogeneity, but I think it comes back to the purpose of these masking radii for DR9, which is to identify the "ellipse-of-influence" of each galaxy.

Having looked at a ton of ellipse fits, I've found that R(26) looks like a "better" measure of that "ellipse-of-influence," and that where we can't measure R(26), R(25) is a very sensible second choice.

I hope all this makes sense but we can discuss on our call, too.

moustakas commented 1 year ago

This was done.