umd-lhcb / lhcb-ntuples-gen

ntuples generation with DaVinci and in-house offline components
BSD 2-Clause "Simplified" License

Cleanups for workflows folder #85

Closed: yipengsun closed this issue 2 years ago

yipengsun commented 2 years ago

Currently, our workflows folder looks like this:

.
├── make_step2_ntuples.py
├── rdx
│   ├── data_no_mu_bdt.sh
│   ├── data.sh
│   ├── mc.sh
│   ├── rdx-run1.yml
│   ├── rdx-run2.yml
│   ├── trigger_emulation_fs_vs_to.sh
│   └── trigger_emulation.sh
├── rdx_cutflows.py
├── rdx.py
└── utils.py

I think it is indeed getting unwieldy. I propose the following change, which tries to make each script self-contained while maintaining some sense of modularity:

.
├── make_step2_ntuples.py  # Put common functions in `utils.py`, then rename it to `rdx.py`
├── rdx  # Remove this folder entirely
│   ├── data_no_mu_bdt.sh
│   ├── data.sh
│   ├── mc.sh
│   ├── rdx-run1.yml
│   ├── rdx-run2.yml
│   ├── trigger_emulation_fs_vs_to.sh
│   └── trigger_emulation.sh
├── rdx_cutflows.py  
├── rdx.py  # Largely replace this with `make_step2_ntuples.py`
└── utils.py

After this change, rdx.py should have no dependency within this folder other than utils.py, which should make it relatively clear what the script relies on.
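
As a rough illustration of the goal that rdx.py depends on no sibling module other than utils.py, here is a minimal sketch of an import scan built on the standard-library `ast` module. The function name `local_imports`, the sample source text, and the imported name `run_cmd` are all hypothetical and only there for demonstration; the sibling module names are taken from the tree above.

```python
import ast


def local_imports(source, local_modules):
    """Return the names in `local_modules` that `source` imports."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # Relative imports such as `from . import x` have no module name.
            names = [node.module] if node.module else []
        else:
            continue
        for name in names:
            top = name.split(".")[0]  # 'pkg.sub' counts as 'pkg'
            if top in local_modules:
                found.add(top)
    return found


# Hypothetical stand-in for the contents of the reworked rdx.py;
# `run_cmd` is an assumed helper name, not something from the repository.
example_source = """
import sys
from argparse import ArgumentParser
from utils import run_cmd
"""

# Sibling Python modules in the workflows folder, per the tree in this issue.
siblings = {"utils", "rdx_cutflows", "make_step2_ntuples"}

print(local_imports(example_source, siblings))  # prints {'utils'}
```

In practice one would read the real file with `open("rdx.py").read()` instead of the inline string; a result other than `{'utils'}` would mean the script still leans on another sibling module.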

I don't like the idea of having multiple ways to do the same thing, so let's see if we can agree on a sensible middle point. Still, this will take some work, so before I actually start working on it, I'd like to hear opinions on this from @manuelfs.