LSSTDESC / rail_som

MIT License

SOM as an Estimator and Classifier #2

Open aimalz opened 1 year ago

aimalz commented 1 year ago

The simplesom and somoclu summarizers have "summarizer" in the names of their inform stages, but in principle a SOM model should be usable for SOM-based estimators, too. We should probably add that in the foreseeable future, given the popularity of SOMs for estimation as well as summarization. This issue covers three tasks: defining an Inform\* stage that works for a SOM performing either estimation or summarization; writing a CatEstimator stage for at least one, and preferably both, SOM approaches; and modifying the various examples/pipelines to refer to the generic Inform\* stage. I've tagged @yanzastro and @sschmidt23 because you have the most experience with SOMs in RAIL, but this issue is open to anyone.

We should also take this opportunity to address an inconsistency in the naming of the parent class for these two summarizers: they take a spec-z catalog and test-set photometry, but the name SZPZSummarizer says they take a p(z) catalog and a spec-z catalog. This class doesn't appear to be used anywhere else, so it should be straightforward to rename it CatSZSummarizer/SZCatSummarizer along the way. (I'm also unclear on why the summarizer needs the spec-zs after the model is created. Is it just to get the weights between the photometric catalog and the spectroscopic training set? If so, could those be carried as part of the saved model, so that these would just be CatSummarizers? Sorry if I'm totally misunderstanding how these work!)
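To make the question concrete, here is a rough NumPy-only sketch of what "carrying the spec-z information in the saved model" could look like. It is not the current rail_som implementation and uses no RAIL classes: the function names (`inform_som`, `estimate`, `summarize`), the dict-based model, and the random codebook standing in for a trained SOM are all assumptions made for illustration. The idea is that the inform step stores a per-cell spec-z histogram next to the codebook, after which both per-object estimation and ensemble summarization need only photometry.

```python
import numpy as np


def best_matching_cell(codebook, colors):
    """Index of the nearest codebook vector (SOM cell) for each object."""
    d2 = ((colors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def inform_som(codebook, spec_colors, spec_z, z_edges):
    """Hypothetical shared inform step: store per-cell spec-z histograms in the model."""
    n_cells = codebook.shape[0]
    n_bins = len(z_edges) - 1
    cells = best_matching_cell(codebook, spec_colors)
    # Redshifts outside z_edges are clipped into the first or last bin.
    zbin = np.clip(np.digitize(spec_z, z_edges) - 1, 0, n_bins - 1)
    hist = np.zeros((n_cells, n_bins))
    np.add.at(hist, (cells, zbin), 1.0)
    return {"codebook": codebook, "z_edges": z_edges, "cell_hist": hist}


def estimate(model, phot_colors):
    """Per-object p(z): the normalized spec-z histogram of each object's cell."""
    cells = best_matching_cell(model["codebook"], phot_colors)
    hist = model["cell_hist"][cells]
    norm = hist.sum(axis=1, keepdims=True)
    # Objects in cells with no spec-z coverage get an all-zero row.
    return np.divide(hist, norm, out=np.zeros_like(hist), where=norm > 0)


def summarize(model, phot_colors):
    """Ensemble N(z): per-cell p(z) weighted by photometric cell occupancy."""
    nz = estimate(model, phot_colors).sum(axis=0)
    total = nz.sum()
    return nz / total if total > 0 else nz


# Toy data; the codebook is random here, standing in for a trained SOM.
rng = np.random.default_rng(0)
codebook = rng.normal(size=(9, 4))
spec_colors = rng.normal(size=(200, 4))
spec_z = rng.uniform(0.0, 2.0, size=200)
phot_colors = rng.normal(size=(500, 4))
z_edges = np.linspace(0.0, 2.0, 11)

model = inform_som(codebook, spec_colors, spec_z, z_edges)
pz = estimate(model, phot_colors)    # shape (500, 10)
nz = summarize(model, phot_colors)   # shape (10,)
```

In this sketch neither `estimate` nor `summarize` touches the spec-z catalog, which is the sense in which the summarizers could become plain CatSummarizers. Whether the real stages need the spec-zs for something beyond these weights is still the open question above.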

aimalz commented 1 year ago

I edited the title to include classification because it would generalize the applicability of #1. If the classifier produces labels corresponding to SOM cells, then a quality-control stage could perform different selections based on quantities derived from the galaxies grouped by SOM cell, and the reduced overall sample could then become input to any Summarizer.
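As a minimal sketch of that quality-control step, assuming the classifier has already produced an integer SOM cell label for every galaxy: the function name `select_by_cell` and the two cuts (a minimum spec-z count and a maximum spec-z scatter per cell) are hypothetical choices for illustration, not an existing RAIL stage or an agreed set of criteria.

```python
import numpy as np


def select_by_cell(cell_labels, spec_cell_labels, spec_z, n_cells,
                   min_spec=3, max_z_std=0.1):
    """Keep objects whose SOM cell passes per-cell cuts (thresholds are illustrative)."""
    n_spec = np.bincount(spec_cell_labels, minlength=n_cells)
    z_sum = np.bincount(spec_cell_labels, weights=spec_z, minlength=n_cells)
    z_sq = np.bincount(spec_cell_labels, weights=spec_z ** 2, minlength=n_cells)
    safe_n = np.maximum(n_spec, 1)  # avoid division by zero in empty cells
    z_mean = z_sum / safe_n
    z_std = np.sqrt(np.maximum(z_sq / safe_n - z_mean ** 2, 0.0))
    good_cell = (n_spec >= min_spec) & (z_std <= max_z_std)
    # Map the per-cell decision back onto each object through its label.
    return good_cell[cell_labels], good_cell


# Cell labels for the photometric sample, as a classifier stage might emit them.
cell_labels = np.array([0, 0, 1, 2, 2, 2])
# Cell labels and redshifts for the spectroscopic sample.
spec_cell_labels = np.array([0, 0, 0, 1, 2, 2, 2])
spec_z = np.array([0.50, 0.52, 0.48, 1.00, 0.10, 1.50, 0.90])

keep, good_cell = select_by_cell(cell_labels, spec_cell_labels, spec_z, n_cells=3)
reduced_labels = cell_labels[keep]  # the reduced sample passed on to a summarizer
```

Here only cell 0 survives: cell 1 has a single spec-z and cell 2 has too broad a redshift spread, so the first two galaxies are kept and the rest are dropped before summarization.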