Closed: sweverett closed this issue 2 years ago
This would be a great enhancement and ease both our workflow and the workflow of future users.
On Wed, Apr 27, 2022 at 12:03 PM Spencer Everett @.***> wrote:
It has been annoying to create lots of very similar configs by hand, and to have each of us run subsets of various cluster (m,z)'s and realizations for the same run_name.
If we think of each pipeline run as a "job", we should create JobsManager and ClusterJob classes that handle all of the bookkeeping if passed a single yaml file with the following fields (an illustrative example follows the list):
- run_name
- base_dir (absolute top-level directory of all clusters and realizations for a given run, e.g. 4 different (m,z) clusters w/ 3 realizations each)
- nfw_dir (top-level directory of all NFW truth files, assumed to have the same directory structure as the outputs)
- gs_base_config (mock_superbit_data.py config file that sets the simulation type, minus run-specific values such as mass, redshift, and seeds)
- mass_bins (list of unique cluster mass values)
- z_bins (list of unique cluster redshift values)
- realizations (either the number of realizations or an explicit list of realization values; this lets you run, say, the first 5 realizations while I run 6-10)
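For concreteness, a hypothetical example of that yaml file; every value below is made up for illustration, and the exact key names are whatever we settle on in the implementation:

```yaml
# Hypothetical suite-level config; all values are illustrative only.
run_name: cluster-suite-v1
base_dir: /path/to/runs/cluster-suite-v1  # top-level dir for all clusters & realizations
nfw_dir: /path/to/nfw-truth               # NFW truth files, mirroring the output structure
gs_base_config: configs/gs_base.yaml      # mock_superbit_data.py config, minus mass/z/seeds
mass_bins: [1.0e+14, 3.0e+14, 6.0e+14, 1.0e+15]  # unique cluster masses
z_bins: [0.25, 0.45]                             # unique cluster redshifts
realizations: [0, 1, 2, 3, 4]  # or an integer count; a list lets us split realizations between us
```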
The JobsManager should then:
- Create the necessary directory structure for all runs and outputs, like I posted
- Create the job-specific GalSim config for mock generation
- Create the job-specific pipeline config file
Then doing a full run should only require a tiny top-level script that distributes all the desired pipe jobs on our local HPC environment.
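As a rough sketch of how this could hang together (everything here beyond the JobsManager and ClusterJob names is an assumption about the eventual implementation, not a description of it):

```python
import itertools
import os

import yaml


class ClusterJob:
    """One pipeline run: a single (mass, z, realization) combination."""

    def __init__(self, run_name, base_dir, mass, z, realization):
        self.run_name = run_name
        self.mass = mass
        self.z = z
        self.realization = realization
        # Placeholder directory layout, e.g. {base_dir}/m1.0e+14_z0.25/r0/
        self.outdir = os.path.join(base_dir, f'm{mass:.1e}_z{z}', f'r{realization}')

    def make_dirs(self):
        os.makedirs(self.outdir, exist_ok=True)


class JobsManager:
    """Expands the single suite-level yaml file into the full grid of jobs."""

    def __init__(self, config_file):
        with open(config_file) as f:
            self.config = yaml.safe_load(f)

        reals = self.config['realizations']
        if isinstance(reals, int):
            # an integer count means realizations 0..N-1
            reals = range(reals)

        self.jobs = [
            ClusterJob(self.config['run_name'], self.config['base_dir'], m, z, r)
            for m, z, r in itertools.product(
                self.config['mass_bins'], self.config['z_bins'], reals
            )
        ]

    def prepare(self):
        for job in self.jobs:
            job.make_dirs()
            # job-specific GalSim + pipeline configs would be written here
```

With something like that, the top-level driver really is tiny: JobsManager('suite.yaml').prepare() plus whatever submission loop our scheduler wants, e.g. one sbatch call per job if we are on Slurm.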
Supersedes #28 https://github.com/superbit-collaboration/superbit-metacal/issues/28
There is currently a jobs.ClusterJob.make_gs_config() method that updates a base GS config with realization-specific options such as the GS master seed. I'm pretty sure this is also where the stellar density is set in your current GAIA methods, so it should be a simple place to add that capability if we end up going down that path.
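Conceptually the update is something like the following; this is not the actual body of make_gs_config, just a guess at its shape (the stellar_density key in particular is hypothetical), continuing the ClusterJob sketch from earlier in the thread:

```python
import copy

def make_gs_config(self, base_config, master_seed, stellar_density=None):
    # Sketch only, not the real implementation: start from the shared base
    # GS config and layer on the job/realization-specific options.
    gs_config = copy.deepcopy(base_config)  # don't mutate the shared base config
    gs_config['master_seed'] = master_seed  # realization-specific GS master seed
    gs_config['mass'] = self.mass           # cluster mass for this job
    gs_config['redshift'] = self.z          # cluster redshift for this job
    if stellar_density is not None:
        # hypothetical hook for the GAIA-based stellar density mentioned above
        gs_config['stellar_density'] = stellar_density
    return gs_config
```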
Completed by #63
I will work on this in the existing job-configs branch. Supersedes #28.