LSSTDESC / rail_base

Base classes for RAIL
MIT License
0 stars 1 forks source link

Degrader refactoring #33

Closed aimalz closed 3 months ago

aimalz commented 1 year ago

Related to LSSTDESC/rail_astro_tools#9, we're using the Degrader class for stages that are doing things more different than Estimator vs. Summarizer, and it's making it hard to generalize the degraders we have to be flexible to support next steps. There's also functionality embedded in degraders that's more generically useful, e.g. filter conversions, that folks want to call for informer/estimator pipelines. These issues are compounded by the confusion caused by operations on data tables to all be called "degraders" when many of them are not affecting any notion of quality; the name of the class was originally meant to be for a class that modified generative models and doesn't necessarily describe what the various stages are actually doing. This issue is for refactoring the degrader stages so users and developers better know what functionality is available and how to access or extend what they need. This will affect both rail_base and rail_astro_tools.

EDIT: Now with more specifics!

  1. Related to repeated code in spectroscopic_selections.py (one PR for this block):

    • [ ] make a ColorMaker function in rail.core within rail_base if possible
  2. In quantityCut.py (one PR for this block):

    • [ ] add a function that the degrader therein calls
    • [ ] then call that function in the spectroscpic_selections.py degraders
  3. spectroscopic_selections.py appears in both rail_astro_tools and rail_base (two extra lines in one of them), and it's not clear to outsiders what the difference is, in terms of where to look for which or to submit a contribution

    • [ ] consolidate these in rail_astro_tools so adding new surveys does not require editing rail_base
    • [ ] then move issue to rail_astro_tools
  4. In spectroscopic_selections.py (one PR for this block):

    • [ ] move SpecSelection to its own module in rail_base
    • [ ] __repr__ should contain the reference for the cuts
    • [ ] naming convention: [SURVEY]SpecSelection
    • [ ] separate color cuts from specsuccess (two parent stages: QuantityCut and SpecSuccessChecker) for each survey
    • [ ] un-hardcode per-survey values -- make a method of parent stage that ingests dictionary for cuts
    • [ ] un-hardcode column names for per-survey degraders to specify per-survey filter definitions
    • [ ] move config.seed to init rather than every method using a random number
  5. In grid_selection.py (one PR for this block):

    • [ ] separate out base stage class from HSC-specifics (same as #14)
    • [ ] move base class to its own module outside degradation directory
    • [ ] move the HSC-specific one to rail_astro_tools
    • [ ] implement tests (currently has none due to universal # pragma: no cover)
  6. Overall (as part of all these PRs???):

    • [ ] rename degradation to modifiers or something like that since scope has grown
    • [ ] rename modules for consistency with naming conventions
    • [ ] propagate to imports elsewhere in src/rail as needed
    • [ ] propagate changes through demos
aimalz commented 1 year ago

Note that we can take this opportunity to associate some notion of object ID with all catalogs (and add it to SHARED_PARAMS).

sschmidt23 commented 11 months ago

@aimalz during a tag-up call you said that there is something that breaks/isn't functioning in a specific degrader in the current set up and referenced this issue. Reading above, I don't see reference to any specific problem, can you point to the specific functionality that is missing/broken?

drewoldag commented 9 months ago

Requirements for V1

Copied from the original summary section.

High level goal, people should know what stages are what, and see what all the options are.

  1. In spectroscopic_selections.py (one PR for this block):

    • [ ] move SpecSelection to its own module in rail_base
    • [ ] __repr__ should contain the reference for the cuts
    • [ ] naming convention: [SURVEY]SpecSelection
    • [ ] separate color cuts from specsuccess (two parent stages: QuantityCut and SpecSuccessChecker) for each survey
    • [ ] un-hardcode per-survey values -- make a method of parent stage that ingests dictionary for cuts
    • [ ] un-hardcode column names for per-survey degraders to specify per-survey filter definitions
    • [ ] move config.seed to init rather than every method using a random number
  2. In grid_selection.py (one PR for this block):

    • [ ] separate out base stage class from HSC-specifics (same as #14)
    • [ ] move base class to its own module outside degradation directory
    • [ ] move the HSC-specific one to rail_astro_tools
    • [ ] implement tests (currently has none due to universal # pragma: no cover)
  3. Overall (as part of all these PRs???):

    • [ ] rename degradation to modifiers or something like that since scope has grown
    • [ ] rename modules for consistency with naming conventions
    • [ ] propagate to imports elsewhere in src/rail as needed
    • [ ] propagate changes through demos
sschmidt23 commented 9 months ago

Renaming degradation to modification still seems like overkill to me, I think a comment in the documentation stating that some degraders simply modify or transform the data rather than "degrade" it is sufficient.

eacharles commented 9 months ago

I agree with Sam here.

ztq1996 commented 4 months ago

Noisifier and Selector are established, and 7/8 degraders have moved away from the old degrader class. After the remaining spectroscopic selection is moved, I will close this issue.

ztq1996 commented 3 months ago

Spectroscopic selection is also refactored, close issue.