Open heather999 opened 5 years ago
Example call from Antonio a month or so ago:
https://hub.docker.com/r/avillarreal/alcf_run2.0i - has containers, with Run2.1.1i-20190924test as the latest tag.
aprun -n 1 -d 64 -j 1 singularity exec -H /lus/theta-fs0/projects/LSSTsky/Run2.1.1i -B /projects/LSSTsky:/projects/LSSTsky:rw /projects/LSSTsky/Run2.1.1i/run-test20190924/alcf_run2.0i_Run2.1.1i-20190924test.sif /projects/LSSTsky/Run2.1.1i/ALCF_1.2i/docker_run.sh python /projects/LSSTsky/Run2.1.1i/ALCF_1.2i/scripts/run_imsim.py --workdir /projects/LSSTsky/Run2.1.1i/run-test20190924/run/outputs/00445379to00497969/00479028/ --outdir /projects/LSSTsky/Run2.1.1i/run-test20190924/run/outputs/00445379to00497969/00479028/ --file_id 479028 --processes 64 --bundle_lists /projects/LSSTsky/Run2.1.1i/run-test20190924/parsl-auto-bundles.json --node_id node0 --visit_index 0 --ckpt_archive_dir /projects/LSSTsky/Run2.1.1i/run-test20190924/run/outputs/00445379to00497969/00479028/agn_ckpts/ --config /projects/LSSTsky/Run2.1.1i/ALCF_1.2i/parsl_imsim_configs
This is an example aprun command that launches a 64-thread job using our bundling setup. Instead of a bundle JSON, node ID, and visit index, you could also pass it a list of specific sensors, which I believe is what you're already doing. See on Slack: https://lsstc.slack.com/archives/CJ50YVDD3/p1570546110008700
@villarrealas could you post an updated version using the latest version of the Docker image?
aprun -n 1 -d 64 -j 1 singularity exec -H /lus/theta-fs0/projects/LSSTsky/Run2.1.1i -B /projects/LSSTsky:/projects/LSSTsky:rw /projects/LSSTsky/Run2.1.1i/run-test20190924/dc2-imsim_Run2.2i-validation-v1.sif /projects/LSSTsky/Run2.1.1i/DESC_DC2_imSim_Workflow/docker_run.sh python /projects/LSSTsky/Run2.1.1i/DESC_DC2_imSim_Workflow/scripts/run_imsim.py --workdir /projects/LSSTsky/Run2.1.1i/run-test20190924/run/outputs/00445379to00497969/00479028/ --outdir /projects/LSSTsky/Run2.1.1i/run-test20190924/run/outputs/00445379to00497969/00479028/ --file_id 479028 --processes 64 --bundle_lists /projects/LSSTsky/Run2.1.1i/run-test20190924/parsl-auto-bundles.json --node_id node0 --visit_index 0 --ckpt_archive_dir /projects/LSSTsky/Run2.1.1i/run-test20190924/run/outputs/00445379to00497969/00479028/agn_ckpts/ --config /projects/LSSTsky/Run2.1.1i/DESC_DC2_imSim_Workflow/parsl_imsim_configs
Effectively, the only thing that changed was the name of the image. Make sure you pull it from docker:lsstdesc/dc2-imsim:Run2.2i-validation-v1 rather than from the previous Docker Hub repository.
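For readability, here is a rough dry-run sketch of the same call split into variables. It only builds and prints the command strings; nothing is executed. The variable names are illustrative and not part of the workflow. All paths and flags are copied from the command above. The `docker://` URI form and the explicit output filename for `singularity pull` are assumptions based on standard Singularity 3.x usage and have not been checked against the Theta setup.

```shell
#!/bin/sh
# Dry-run sketch: builds the pull and run commands as strings and prints them.
# Variable names are hypothetical; paths and flags come from the thread above.
set -eu

PROJECT=/projects/LSSTsky
HOME_DIR=/lus/theta-fs0/projects/LSSTsky/Run2.1.1i
RUN_ROOT="${PROJECT}/Run2.1.1i"
TEST_DIR="${RUN_ROOT}/run-test20190924"
WORKFLOW="${RUN_ROOT}/DESC_DC2_imSim_Workflow"
VISIT_DIR="${TEST_DIR}/run/outputs/00445379to00497969/00479028/"

# Assumption: the image is pulled with the docker:// URI scheme into TEST_DIR.
IMAGE_URI="docker://lsstdesc/dc2-imsim:Run2.2i-validation-v1"
IMAGE_SIF="${TEST_DIR}/dc2-imsim_Run2.2i-validation-v1.sif"

PULL_CMD="singularity pull ${IMAGE_SIF} ${IMAGE_URI}"

# Same arguments as the one-line aprun call, grouped by purpose.
RUN_CMD="aprun -n 1 -d 64 -j 1 \
singularity exec -H ${HOME_DIR} -B ${PROJECT}:${PROJECT}:rw ${IMAGE_SIF} \
${WORKFLOW}/docker_run.sh python ${WORKFLOW}/scripts/run_imsim.py \
--workdir ${VISIT_DIR} --outdir ${VISIT_DIR} \
--file_id 479028 --processes 64 \
--bundle_lists ${TEST_DIR}/parsl-auto-bundles.json \
--node_id node0 --visit_index 0 \
--ckpt_archive_dir ${VISIT_DIR}agn_ckpts/ \
--config ${WORKFLOW}/parsl_imsim_configs"

echo "${PULL_CMD}"
echo "${RUN_CMD}"
```

Swapping images then only means changing `IMAGE_URI` and `IMAGE_SIF`; the rest of the call stays the same, which matches the note that only the image name changed.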
We need to ensure ALCF/NERSC and Grid are using the same scripts and configuration for simulation processing. As discussed here: https://github.com/LSSTDESC/DC2-production/issues/368#issuecomment-531262662
@villarrealas provided this list:
We should also identify the branch and tag on this repo that is used for processing, as well as this repo https://github.com/jamesp-epcc/run2.1i-scripts (or a Run2.2i version) which is believed to store the Grid configuration.