I'm trying to follow the instructions at https://github.com/desihub/surveysim/blob/master/doc/tutorial.md to run the survey simulations at NERSC. The loop stopped about halfway through the survey. I didn't realize until much later that it hadn't completed the survey, and by then I no longer had the window history from when the loop had run. When I go back and run surveyplan, I get:
[cori04 desisim] surveyplan ${PLAN_ARGS}
INFO:utils.py:94:freeze_iers: Freezing IERS table used by astropy time, coordinates.
INFO:ephemerides.py:94:__init__: Loaded ephemerides from /global/homes/s/sjbailey/desi/dev/end2end/output/ephem_2019-08-28_2024-07-13.fits for 2019-08-28 to 2024-07-13
INFO:progress.py:128:__init__: Loaded progress from /global/homes/s/sjbailey/desi/dev/end2end/output/progress.fits.
INFO:progress.py:262:save: Saved progress to /global/homes/s/sjbailey/desi/dev/end2end/output/progress_2022-05-24.fits.
INFO:surveyplan.py:121:main: Planning observations for 2022-05-24 to 2022-09-01.
INFO:plan.py:266:update: Updating plan for 2022-05-24 to 2022-09-01
INFO:plan.py:237:update_active: Adding 24 active tiles from group 1 priority 5
INFO:plan.py:237:update_active: Adding 7 active tiles from group 2 priority 7
INFO:plan.py:237:update_active: Adding 269 active tiles from group 5 priority 4
INFO:plan.py:237:update_active: Adding 15 active tiles from group 6 priority 6
Optimizing 31 active DARK tiles.
INFO:optimize.py:153:__init__: DARK program: 177.3h to observe 31 tiles (texp_nom 1000.0 s).
Optimizing 0 active GRAY tiles.
INFO:optimize.py:153:__init__: GRAY program: 39.0h to observe 0 tiles (texp_nom 1000.0 s).
Cannot improve MSE.
[cori04 desisim] echo $?
255
[cori04 desisim] echo $PLAN_ARGS
--duration 100 --verbose --plots
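Since the loop stopped without my noticing, a driver that checks each command's exit status would have made the failure visible right away. Below is a minimal sketch of that idea, not the tutorial's actual loop: the `run_until_failure` helper is hypothetical, and the demo uses stand-in Python commands rather than the real surveyplan and surveysim invocations.

```python
import subprocess
import sys


def run_until_failure(commands, max_iterations=100):
    """Run the commands in order, cycle after cycle, stopping at the first failure.

    Returns (iteration, command, returncode) for the first command that exits
    with a non-zero status, or None if every cycle completed successfully.
    """
    for iteration in range(max_iterations):
        for cmd in commands:
            result = subprocess.run(cmd)
            if result.returncode != 0:
                # Report loudly on stderr so the stop is not missed.
                print(
                    f"Iteration {iteration}: {cmd!r} exited with status "
                    f"{result.returncode}; stopping.",
                    file=sys.stderr,
                )
                return iteration, cmd, result.returncode
    return None


if __name__ == "__main__":
    # Stand-in commands (assumptions for illustration only): the first
    # succeeds, the second exits with status 255.
    ok = [sys.executable, "-c", "pass"]
    bad = [sys.executable, "-c", "import sys; sys.exit(255)"]
    failure = run_until_failure([ok, bad], max_iterations=3)
    if failure is not None:
        sys.exit(1)
```

In real use the stand-in commands would be replaced by the argument lists for the actual planning and simulation steps, and the non-zero status (255 in the output above) would stop the run with a message instead of silently ending it.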
In a private email thread, @dkirkby commented:
I suspect the problem is that it tried to optimize zero tiles for the gray program, which it should be able to handle gracefully but didn't. I guess this is relatively rare since I didn't run into this with my tests.
I am in the process of reworking this logic and my dev branch has diverged enough from what you are running that it probably doesn't make sense to track this bug down. Instead, I need to finish up my latest round of changes and get them merged.
After that refactor and prior to the next big tag, we should verify that the tutorial instructions work at NERSC on master.
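To illustrate the suspected failure mode described above (a program with zero active tiles reaching the optimizer), here is a minimal sketch of the kind of guard that would handle it gracefully. Everything in it is hypothetical: `plan_program` and `fragile_optimize` are made-up names, and the stand-in optimizer only mimics code that cannot cope with empty input. It is not surveysim's implementation.

```python
def fragile_optimize(tiles):
    """Stand-in optimizer (hypothetical) that fails when given no tiles."""
    if len(tiles) == 0:
        raise RuntimeError("nothing to optimize")
    # Placeholder for real work: return the tiles in sorted order.
    return sorted(tiles)


def plan_program(name, tiles, optimize=fragile_optimize):
    """Optimize one program's active tiles, skipping programs with none.

    Returns the optimizer's result, or an empty list when there are no
    active tiles, so the caller can continue with the other programs.
    """
    if len(tiles) == 0:
        print(f"Skipping {name} program: no active tiles to optimize.")
        return []
    return optimize(tiles)


if __name__ == "__main__":
    print(plan_program("DARK", [3, 1, 2]))
    print(plan_program("GRAY", []))
```

The design choice here is to treat an empty program as a normal, skippable case at the call site rather than relying on the optimizer to cope with empty input.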
Getting this into a ticket for tracking.