Keck-DataReductionPipelines / KPF-Pipeline

KPF-Pipeline
https://kpf-pipeline.readthedocs.io/en/latest/

qlp_parallel.py #824

Closed awhoward closed 6 months ago

awhoward commented 7 months ago

The docstring for qlp_parallel.py is below. The script uses the parallel utility and dynamic substitution into a template config file to parallelize QLP data processing of L0/2D/L1/L2 files. This is much faster and more flexible than the previous parallelization, which relied on the --watch flag or the quicklook_match.recipe approach.

This is also a partial workaround for the L1 QLP crashes, since it makes it much easier to regenerate missing QLP data quickly. The next step is fixing the underlying issue.
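To illustrate the template-substitution idea, here is a minimal sketch, not the script's actual code. The template text, the key names (`fullpath`, `output_dir`), the `/data/QLP` default, and the helper names are all hypothetical; the real config file is not shown in this issue.

```python
from string import Template

# Hypothetical template: the real config file's sections and keys
# are not shown in this issue.
CONFIG_TEMPLATE = Template(
    "[ARGUMENT]\n"
    "fullpath = $fullpath\n"
    "output_dir = $output_dir\n"
)


def render_config(fullpath, output_dir="/data/QLP"):
    """Return config text with one input file path substituted in."""
    return CONFIG_TEMPLATE.substitute(fullpath=fullpath, output_dir=output_dir)


def render_all(paths):
    """Build one config per input file, so each file can be handed
    to a separate worker and processed independently."""
    return {path: render_config(path) for path in paths}
```

Because each file gets its own rendered config, the jobs share no state and can be distributed across as many workers as --ncpu allows.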

"""

Script Name: qlp_parallel.py

Description:
  This script uses the 'parallel' utility to execute the recipe called 
  'recipes/quicklook_match.recipe' to generate standard Quicklook data 
  products.  The script selects all KPF files based on their
  type (L0/2D/L1/L2) from the standard data directory using a date range
  specified by the parameters start_date and end_date.  L0 files are 
  included if the --l0 flag is set or none of the --l0, --2d, --l1, --l2
  flags are set (in which case all data types are included).  The --2d, 
  --l1, and --l2 flags have similar functions.  The script assumes that it
  is being run in Docker and will exit with an error message if it is not.
  If start_date is later than end_date, the arguments will be swapped
  and the files with later dates will be processed first.
  Invoking the --print_files flag causes the script to print the file
  names without computing Quicklook data products.

Options:
  --help         Display this message
  start_date     Start date (positional) as YYYYMMDD, YYYYMMDD.SSSSS, or YYYYMMDD.SSSSS.SS
  end_date       End date (positional) as YYYYMMDD, YYYYMMDD.SSSSS, or YYYYMMDD.SSSSS.SS
  --ncpu         Number of cores used for parallel processing; default=10
  --l0           Select all L0 files in date range
  --2d           Select all 2D files in date range
  --l1           Select all L1 files in date range
  --l2           Select all L2 files in date range
  --print_files  Display file names matching criteria, but don't generate Quicklook plots

Usage:
  python qlp_parallel.py YYYYMMDD.SSSSS YYYYMMDD.SSSSS --ncpu NCPU --l0 --2d --l1 --l2

Example:
  ./scripts/qlp_parallel.py 20230101.12345.67 20230101.17 --ncpu 50 --l0 --2d

"""
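The flag and date handling described in the docstring could be sketched roughly as follows. This is an illustration built only from the docstring, not the script's actual implementation: the helper names (`parse_args`, `selected_levels`, `order_dates`) are hypothetical, the dates are taken as positional arguments as in the Usage line, and comparing the date codes as plain strings is an assumption.

```python
import argparse


def parse_args(argv=None):
    """Parse arguments laid out as in the Usage line of the docstring."""
    parser = argparse.ArgumentParser(description="Sketch of the qlp_parallel.py interface")
    parser.add_argument("start_date", type=str)
    parser.add_argument("end_date", type=str)
    parser.add_argument("--ncpu", type=int, default=10)
    parser.add_argument("--l0", action="store_true")
    # "2d" is not a valid attribute name, so store the flag under "twod".
    parser.add_argument("--2d", dest="twod", action="store_true")
    parser.add_argument("--l1", action="store_true")
    parser.add_argument("--l2", action="store_true")
    parser.add_argument("--print_files", action="store_true")
    return parser.parse_args(argv)


def selected_levels(args):
    """Return the requested data levels, or all four if no level flag is set."""
    flags = {"L0": args.l0, "2D": args.twod, "L1": args.l1, "L2": args.l2}
    if not any(flags.values()):
        return list(flags)
    return [level for level, is_set in flags.items() if is_set]


def order_dates(start_date, end_date):
    """Return (earlier, later, newest_first).

    Assumption: date codes are compared as plain strings.
    newest_first is True when the caller passed the later date first.
    """
    if start_date > end_date:
        return end_date, start_date, True
    return start_date, end_date, False
```

With the arguments from the docstring's Example, `selected_levels` returns only the L0 and 2D levels; with no level flags it returns all four.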