c3-time-domain / SeeChange

A time-domain data reduction pipeline (e.g., for handling images->lightcurves) for surveys like DECam and LS4
BSD 3-Clause "New" or "Revised" License

Deep Learning Score object #226

Open guynir42 opened 3 months ago

guynir42 commented 3 months ago

We need a model, separate from Measurements, that will contain the results of an ML/DL algorithm.

The object (which still has no name!) should contain a single float "score" and a link back to a Cutouts object. Its provenance contains a single parameter, the ML version. Or maybe two parameters: the ML model (the name of the algorithm) and a semantic or hash version.
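To make the shape of this object concrete, here is a minimal sketch using plain dataclasses. All names (`DeepScore`, `ScoreProvenance`, `cutouts_id`, the `example_cnn` algorithm) are placeholders invented for illustration, not existing SeeChange classes, and the hashing scheme is only one possible way to derive a provenance identifier from the two parameters.

```python
from dataclasses import dataclass
import hashlib
import json


@dataclass(frozen=True)
class ScoreProvenance:
    """Hypothetical provenance holding the two parameters discussed above."""
    algorithm: str  # name of the ML model / algorithm
    version: str    # semantic version or hash of the model

    @property
    def id(self) -> str:
        # Assumption: the identifier is a short hash of the parameters,
        # so any change in algorithm or version yields a new provenance.
        payload = json.dumps(
            {"algorithm": self.algorithm, "version": self.version},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode()).hexdigest()[:12]


@dataclass
class DeepScore:
    """Hypothetical score object: one float plus a link back to a Cutouts."""
    cutouts_id: int
    score: float
    provenance: ScoreProvenance


# Two versions of the same (made-up) algorithm scored on the same cutouts.
prov_a = ScoreProvenance("example_cnn", "1.0.0")
prov_b = ScoreProvenance("example_cnn", "1.1.0")
scores = [DeepScore(42, 0.91, prov_a), DeepScore(42, 0.63, prov_b)]
```

Both scores point at the same Cutouts but carry different provenances, which is the side-by-side case described in the rest of this issue.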

This is kept separate from the Measurements object (which contains photometry and analytical cuts) because we may need to run several versions of the ML on the same cutouts and provide the results side-by-side for the same Cutouts object.

This means that the data store should have room for several of these objects associated with each cutout, which may need to be loaded at the same time even though they have different provenances. Until now the pipeline only ever had one provenance for each process step. We need to think about how to do this.
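One way to picture the requirement is a container keyed by both the cutouts and the provenance, so that several scores can be loaded together. This is only an in-memory sketch: `ScoreStore`, its methods, and the provenance strings are hypothetical and do not reflect how the actual data store is implemented.

```python
from collections import defaultdict


class ScoreStore:
    """Hypothetical in-memory holder for several scores per cutouts."""

    def __init__(self):
        # cutouts_id -> {provenance_id: score}
        self._scores = defaultdict(dict)

    def add(self, cutouts_id, provenance_id, score):
        self._scores[cutouts_id][provenance_id] = float(score)

    def load(self, cutouts_id, provenance_ids):
        """Return the scores for one cutouts, limited to the given provenances."""
        found = self._scores.get(cutouts_id, {})
        return {p: found[p] for p in provenance_ids if p in found}


store = ScoreStore()
store.add(42, "prov_v1", 0.91)
store.add(42, "prov_v2", 0.63)
store.add(42, "prov_old", 0.10)  # present, but not requested below

side_by_side = store.load(42, ["prov_v1", "prov_v2"])
```

The caller passes a list of provenances rather than a single one, which is the part that differs from the one-provenance-per-step assumption used so far.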

One option is to add something like a "thresholds" object that keeps track of the provenances that are relevant for the analysis, along with a single provenance for Measurements, and can load the relevant ML scores and analytical cuts. It would also have a set of thresholds for the different scores such that it can give a pass/no-pass verdict for each Cutouts object. We can decide later if we want to save or delete objects that did not pass.
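A rough sketch of what such a "thresholds" object could look like follows. Everything here is an assumption for illustration: the class and field names are made up, analytical cuts are left out, and the rule that every tracked score must reach its threshold is just one possible way to combine them.

```python
from dataclasses import dataclass, field


@dataclass
class Thresholds:
    """Hypothetical object tracking relevant provenances and their cuts."""
    measurements_provenance: str  # the single provenance for Measurements
    # ML score provenance id -> minimum score required to pass
    score_thresholds: dict = field(default_factory=dict)

    def verdict(self, scores):
        """Pass/no-pass for one Cutouts.

        `scores` maps provenance id -> score. Assumed policy: every
        tracked provenance must have a score at or above its threshold;
        a missing score counts as a failure.
        """
        return all(
            p in scores and scores[p] >= t
            for p, t in self.score_thresholds.items()
        )


thresholds = Thresholds("meas_prov", {"prov_v1": 0.5, "prov_v2": 0.7})
passed = thresholds.verdict({"prov_v1": 0.91, "prov_v2": 0.63})
```

In this example `passed` is `False`, because the second score (0.63) is below its threshold (0.7). Whether objects that fail are saved or deleted is left open, as noted above.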