LSSTDESC / rail_attic

Redshift Assessment Infrastructure Layers
MIT License
14 stars 9 forks source link

Summarization subpackage baseline #56

Closed aimalz closed 2 years ago

aimalz commented 3 years ago
aimalz commented 3 years ago

@johannct made a good point about Summarizer subclasses that want posteriors output by Estimator subclasses instead of just input photometry from Creator subclasses. Rather than letting those summarizers have a method that calls an estimator, it should instead have keywords in its config file that specify what estimator it wants to run (and where to find its config file) and/or the path to the estimated posteriors it wants to read in. This would kind of be analogous to the estimator subclasses that take completely different inputs from one another. The advantage is that no estimators are hidden in summarizers.

aimalz commented 3 years ago

@AngusWright This would be a good opportunity to wrap your SOM_DIR code as a Summarizer() subclass, which would have the added bonus of being the first example of RAIL calling underlying R code.

aimalz commented 3 years ago

Maybe what makes sense is for a Summarizer() superclass to live in rail.estimation so something like The WiZZ or a SOM can inherit from both Summarizer() and Estimator().

joezuntz commented 3 years ago

I've been thinking about this a bit and two conclusions I've come to:

My main suggestion is that we get the code in place first before trying to push things under a single superclass.

aimalz commented 2 years ago

@joezuntz Just caught this while catching up on a GitHub notification backlog, sorry! I think we discussed this offline, but in case I'm mistaken, or just for completeness, I'll also reply to your above comment here.

The plan is now to restructure the rail.estimation subpackage to have a three-tier superclass/subclass structure in which there is an Estimator() superclass that contains the I/O interface and most general methods (i.e. the __init__, train, estimate, etc. steps you cited) with a pair of subclasses for estimating per-galaxy photo-z information (named something like PZEstimator()) and population-level redshift information (named something like NZEstimator()) that themselves have subclasses for what we currently think of as estimators and summarizers; the wrappers of methods that produce both per-galaxy and ensemble-level redshift information would then be able to inherit from both of the superclasses. This plan addresses your three points above:

But it totally makes sense to draft up some individual methods before solidifying a top-level structure! It seems to me like proto-wrapping the guts of TXPipe's DIR pipeline stage LSSTDESC/TXPipe#193 would be low-hanging fruit (along with the others listed under "Development of n(z) estimators" in the overall plan for the nz-band-sensitivity project LSSTDESC/nz-band-sensitivity#1), but perhaps I'm behind on the status of the standalone SOM summarizer.

eacharles commented 2 years ago

overtaken by events