c3-time-domain / SeeChange

A time-domain data reduction pipeline (e.g., for handling images->lightcurves) for surveys like DECam and LS4
BSD 3-Clause "New" or "Revised" License

CutoutsFile and Measurements #302

Closed: whohensee closed 4 days ago

whohensee commented 4 weeks ago

Change the structure of the cutouts and measurements so that we have a CutoutsFile, which saves all the cutouts and takes up a single row in the DB as a FileOnDiskMixin, and Measurements, which point to the CutoutsFile for the relevant information.

See #217 and #272

whohensee commented 5 days ago

I made a slight change from what Guy and I discussed about the co_dict attribute of Cutouts, but I tried to implement it in the way that seemed most intuitive to use. The main co_dict (which contains individual sub-dictionaries of cutout data under keys such as "source_index_0") is actually an instance of a class that inherits from dict; when it is passed a key it doesn't already have, it searches on disk for that key. The end result is that a user of Cutouts can just use the co_dict attribute without worrying about whether the specific cutouts data is loaded, and a user of Measurements can just use the attributes (such as m.sub_data), and things will load if they exist on disk.
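For readers unfamiliar with the pattern, here is a minimal sketch of a dict subclass that falls back to disk when a key is missing. It is not the SeeChange implementation: the class name LazyCutoutsDict, the disk_reads counter, and the JSON file layout are all assumptions made for illustration, and the real code may store and load cutouts quite differently.

```python
import json
import os
import tempfile


class LazyCutoutsDict(dict):
    """Hypothetical stand-in for co_dict: a dict that looks on disk for missing keys."""

    def __init__(self, filepath):
        super().__init__()
        self.filepath = filepath  # assumed: one file holding all cutouts (JSON only for this sketch)
        self.disk_reads = 0       # illustration only: counts how often the file was read

    def __missing__(self, key):
        # dict.__getitem__ calls this only when `key` is not already in memory.
        with open(self.filepath) as f:
            on_disk = json.load(f)
        self.disk_reads += 1
        if key not in on_disk:
            raise KeyError(key)
        self[key] = on_disk[key]  # cache the entry so later lookups skip the disk
        return self[key]


def demo():
    """Write a tiny cutouts file, then read the same key twice through the lazy dict."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "cutouts.json")
        with open(path, "w") as f:
            json.dump({"source_index_0": {"sub_data": [[1, 2], [3, 4]]}}, f)

        co_dict = LazyCutoutsDict(path)
        first = co_dict["source_index_0"]["sub_data"]   # triggers a disk read
        second = co_dict["source_index_0"]["sub_data"]  # served from memory
        return first, second, co_dict.disk_reads
```

One caveat with this approach: Python only calls `__missing__` for square-bracket access, so `co_dict.get(key)` and `key in co_dict` see only what is already in memory unless those methods are overridden as well.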

If I forgot a useful case, feel free to let me know.

guynir42 commented 5 days ago

Please also run the models/test_image.py::test_image_products_are_deleted test and see whether it is faster now that we've moved from a list to a single object. If so, un-skip it.