simonsobs / sotodlib

Simons Observatory: Time-Ordered Data processing library.
MIT License
16 stars 19 forks source link

Unify preprocessing functions #1027

Closed mmccrackan closed 6 days ago

mmccrackan commented 1 week ago

This branch moves around some of the preprocessing functions into a common preprocessing/preprocess_util.py file. The ones moved specifically are:

Some of these aren't shared among multiple files yet but they will be when #1026 is merged. Note that the inputs of load_and_preprocess are slightly modified from load_preprocess_tod/load_preprocess_obs. It now accepts no_signal which defaults to None and the dets and meta inputs not in load_preprocess_obs are now added. The default config names were removed, though we could re-add them based on the value of no_signal perhaps.

Some additional files have also been moved into site_pipeline/util.py:

mmccrackan commented 1 week ago

Thanks @mmccrackan, some comments on things that are also done in #1014 which has not been merged yet. How should we rectify this? @msilvafe

It looks like the only major difference is that I didn't add the get_obslist function that you wrote. I can add that here since it looks useful for the multilayer_preprocessing. Since this pull request includes the preprocess_util function, I guess we can say that this branch supersedes #1014?

TODOs: We are also going to change things so that stuff in preprocess_util currently imported from site_pipeline util are either copied or moved into preprocess_util.

Finally, we will attempt to merge preprocess_tod, preprocess_obs, and multilayer_preprocess_tod into a single script.

mmccrackan commented 1 week ago

Just to comment on the latest pushes, these re-arrange the shared functions such that neither preprocess/preprocess_util.py or site_pipeline/util.py needs to import each other anymore. Both preprocess_obs.py and preprocess_tod.py include both of the util files. preprocess/preprocess_util.py and site_pipeline/util.py now have some copies of functions that are used elsewhere in site_pipeline or preprocess such as init_logger or ArchivePolicy.

Some of the functions were used in update_preprocess_plots.py so that has been updated to use the functions in preprocess/preprocess_util.py instead of site_pipeline/util.py

This approach should mostly maintain the hierarchy of preprocess functions being below site_pipeline.

msilvafe commented 1 week ago

Thanks @mmccrackan, some comments on things that are also done in #1014 which has not been merged yet. How should we rectify this? @msilvafe

It looks like the only major difference is that I didn't add the get_obslist function that you wrote. I can add that here since it looks useful for the multilayer_preprocessing. Since this pull request includes the preprocess_util function, I guess we can say that this branch supersedes #1014?

TODOs: We are also going to change things so that stuff in preprocess_util currently imported from site_pipeline util are either copied or moved into preprocess_util.

Finally, we will attempt to merge preprocess_tod, preprocess_obs, and multilayer_preprocess_tod into a single script.

Yes, I think this is a bit larger of an overhaul in preparation to merge preprocess_tod and preprocess_obs into one driving script, and we didn't really understand the overall repo philosophy about core -> main modules -> site pipeline only importing packages higher lower in the higherarchy and not higher so we need to pull everything out of site_pipeline.util so let's let this PR supersede #1014