Keck-DataReductionPipelines / KCWI_DRP

KCWI python DRP
BSD 3-Clause "New" or "Revised" License

Breaking up the DRP into stages (like the IDL version) #104

Open prusinski opened 2 years ago

prusinski commented 2 years ago

In the previous IDL version, we made region files and median-filtered the cubes at various stages along the pipeline (notably stage5sky and stage6cube). Is there an efficient way of breaking up the pipeline so that we can make changes to the files and continue without starting from the beginning and rerunning/overwriting all objects? I've clumsily solved this problem by breaking up the event table in the kcwi_pipeline.py file and running subsets. However, I noticed there's still a STAGE column in the kcwi.proc file; perhaps we could take advantage of this and have reduce_kcwi run only, e.g., stages 1-5 and then stages 6-7 via a flag or argument?
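For readers who haven't edited kcwi_pipeline.py themselves, the "breaking up the event table" workaround amounts to something like the sketch below. It is illustrative only: the step names are stand-ins, and the real KCWI_DRP event table entries carry more fields than a bare name.

```python
# Illustrative only: a simplified event table as a flat list of step
# names. These are NOT the pipeline's actual event names, and real
# KCWI_DRP table entries carry more fields than shown here.
EVENT_TABLE = [
    "subtract_overscan",
    "subtract_bias",
    "correct_geometry",
    "subtract_sky",        # roughly the old IDL stage5sky
    "make_cube",           # roughly the old IDL stage6cube
    "wavelength_corr",
    "flux_calibrate",
]

def split_table(table, stop_after):
    """Split an event table around a named step, so the head can be
    run first, the intermediate files edited by hand, and the tail
    resumed afterwards."""
    idx = table.index(stop_after) + 1
    return table[:idx], table[idx:]

# Run everything through cube construction, edit the cubes, then
# resume with the tail in a second pass.
head, tail = split_table(EVENT_TABLE, "make_cube")
```

The clumsiness the comment describes is that `head` and `tail` have to be swapped in and out of the source file by hand between runs.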

MNBrod commented 2 years ago

At the moment, the stages are a remnant of the initial port from IDL, and aren't used in the pipeline at all. Editing the event table is the ideal way to do what you are looking for.

I will look into adding a command line option to load an event table from a separate file, so that customized pipelines can be run without needing to edit the base source code, which should make your use-case a little more streamlined.
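One possible shape for such an option is sketched below; the flag name and the one-event-per-line file format are assumptions for illustration, not the actual interface the DRP will eventually expose.

```python
import argparse
import io

def parse_args(argv):
    p = argparse.ArgumentParser()
    # Hypothetical flag; the eventual reduce_kcwi option may be named
    # differently and accept a different file format.
    p.add_argument("--event-table", default=None,
                   help="path to a file listing one event per line")
    return p.parse_args(argv)

def load_event_table(stream):
    """Read a custom event table: one event name per line, ignoring
    blank lines and '#' comments."""
    return [ln.strip() for ln in stream
            if ln.strip() and not ln.lstrip().startswith("#")]

# Demo with an in-memory file standing in for the user's table.
args = parse_args(["--event-table", "my_stages.tbl"])
table = load_event_table(io.StringIO(
    "# sky subtraction onward\n"
    "subtract_sky\n"
    "make_cube\n"))
```

With something like this, a customized sub-pipeline lives in a small text file next to the data rather than in an edited copy of kcwi_pipeline.py.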

prusinski commented 2 years ago

That would be fantastic and appears to be the most efficient way to go for the moment - thank you!

prusinski commented 2 years ago

Hi @MNBrod,

Just following up on this possibility; I also wanted to ask a few related questions. Which would be easier: putting the abridged event table in a separate file and reading that in, or appending it to the end of the kcwi.cfg file (perhaps multiple tables for multiple stages)? I could see use cases for both, but with regard to issue #103, being able to supply the table in the same code block that's running might be useful too.

When breaking up the event table manually, I've also noticed that there are natural dividing points which mostly correspond to the IDL stages; however, certain stages appear to have "preconditions". For instance, I can start the pipeline with an *icube.fits file and run object_wavelengthcorr through object_flux_calibrate (essentially IDL stages 7-8), but I haven't figured out how to take an *icubed.fits file (IDL stage 7) and run object_make_invsens and/or object_flux_calibrate to get an *icubes.fits image; I get a warning that the data isn't processed enough. The same thing comes up earlier in the pipeline, where I might want to go straight from an *intf.fits file to an *intk.fits file but have to rerun most of the object-processing routine (e.g. the *int.fits and *intd.fits stages) even though it already completed.
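The "not processed enough" warning suggests each recipe checks for prerequisite processing steps before it will run. A guess at the general mechanism is sketched below; the dependency map is invented for illustration and is not read from the DRP source, where the real prerequisites live in the recipe code and may differ.

```python
# Invented dependency map for illustration only; KCWI_DRP's actual
# prerequisites are defined in its recipe code and may differ.
PREREQS = {
    "object_wavelengthcorr": {"make_cube"},
    "object_make_invsens":   {"make_cube", "object_wavelengthcorr"},
    "object_flux_calibrate": {"make_cube", "object_wavelengthcorr"},
}

def can_run(step, history):
    """Return True if every prerequisite of `step` appears in the
    frame's recorded processing history (e.g. flags tracked in
    kcwi.proc or in the FITS header)."""
    return PREREQS.get(step, set()).issubset(history)
```

If the checks work roughly like this, resuming mid-stream from an intermediate file fails whenever the file's recorded history lacks a flag the next recipe demands, even though the data itself is fine.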

In general, it seems I can get the pipeline to stop where I want it to, but it's sometimes difficult to pick it back up from that same location/stage. Is there a way to make this more streamlined, or somewhere I can look to learn what prerequisites are required when I break the pipeline into individual events/stages? Ideally, it would be nice to mimic the original IDL stages if that's at all possible. On that note, I currently edit the kcwi.proc file by hand (usually deleting the appropriate lines) to let the reduction proceed in a broken-up fashion; is there a way to proceed in stages without adjusting kcwi.proc (perhaps via a more sophisticated version of the clobber flag)?
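Until such an option exists, the manual kcwi.proc surgery can at least be scripted. The sketch below assumes each kcwi.proc entry starts with a frame identifier in its first whitespace-separated column; the sample rows are made up and the file's real column layout may differ.

```python
def drop_proc_entries(proc_lines, frames_to_redo):
    """Filter kcwi.proc-style lines, dropping entries for frames that
    should be reprocessed so the pipeline picks them up again.
    Assumes (hypothetically) that the frame name is the first
    whitespace-separated column of each line."""
    kept = []
    for line in proc_lines:
        cols = line.split()
        if cols and cols[0] in frames_to_redo:
            continue  # drop: force this frame to be reprocessed
        kept.append(line)
    return kept

# Made-up rows, not kcwi.proc's real schema.
proc = [
    "kb230101_00042  BIAS    done",
    "kb230101_00043  OBJECT  done",
    "kb230101_00044  OBJECT  done",
]
trimmed = drop_proc_entries(proc, {"kb230101_00043"})
```

A `--redo <frame>` style flag doing exactly this inside the DRP would make the "more sophisticated clobber" suggested above, without anyone touching kcwi.proc by hand.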