Closed mkavulich closed 2 years ago
I just finished testing. I ran 5 tests:
96 cores: Ran successfully in 18 mins 26 s (end-to-end)
64 cores: Ran successfully in 19 mins 55 s (end-to-end)
16 cores: Ran successfully in 51 mins 37 s (end-to-end)
1 core: Real failed (side note: did not error out at real; moved on to wrf and failed there when it didn't have the real outputs)
0 cores: Failed with appropriate message that invalid number of cores was provided.
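For anyone who wants to compare the three successful runs, here is a rough sketch that converts the end-to-end times above into speedup and parallel efficiency relative to the 16-core run. The helper names (`to_seconds`, `scaling`) are made up for illustration and are not part of wrfcloud. Note also that these are end-to-end times, so they presumably include steps that do not scale with core count.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a 'M mins S s' timing into whole seconds."""
    return minutes * 60 + seconds


# End-to-end wall-clock times reported in the tests above, keyed by core count.
timings = {
    16: to_seconds(51, 37),
    64: to_seconds(19, 55),
    96: to_seconds(18, 26),
}


def scaling(timings: dict, baseline_cores: int) -> dict:
    """Return {cores: (speedup, efficiency)} relative to the baseline run."""
    base = timings[baseline_cores]
    result = {}
    for cores, secs in sorted(timings.items()):
        speedup = base / secs
        # Efficiency is the speedup divided by the ideal (linear) speedup.
        efficiency = speedup / (cores / baseline_cores)
        result[cores] = (speedup, efficiency)
    return result


for cores, (speedup, eff) in scaling(timings, 16).items():
    print(f"{cores:3d} cores: {speedup:.2f}x speedup, {eff:.0%} efficiency")
```

The small gain from 64 to 96 cores suggests that, for this test case, most of the benefit is already reached by 64 cores.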
One thing I did notice is that the wrfcloud-run command will only work when executed in the directory where the test.yml is found. Not a huge deal for R&D testing but wanted to mention it in case that wasn't expected behavior.
@fossell That is expected behavior, and should be a prerequisite; otherwise there's no way to indicate where to look for the file! I suppose we could put this in the environment variables yaml, but I think it's best to leave that as a prerequisite for now.
This PR adds the "wrfcores" config option. If set to 1 (default), the previous serial-run behavior is maintained. If >1, the real.exe and WRF tasks will be submitted as a parallel job to the Slurm queue with wrfcores number of cores. If >96, the run will fail with an error message, as the current default instance only has 96 cores available.

Resolves #74
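To make the rules above concrete, here is a minimal sketch of the core-count check as described in this PR. It is not the code in the PR: the function name `plan_wrf_run` and the constant `MAX_CORES` are hypothetical, and the error messages are placeholders.

```python
# Assumption: 96 is the core count of the current default instance,
# as stated in the PR description.
MAX_CORES = 96


def plan_wrf_run(wrfcores: int = 1) -> str:
    """Return 'serial' or 'parallel' for the requested core count.

    Hypothetical helper illustrating the described behavior only.
    """
    # Reject non-integers (including bools) and anything below 1.
    if isinstance(wrfcores, bool) or not isinstance(wrfcores, int) or wrfcores < 1:
        raise ValueError(f"invalid number of cores: {wrfcores!r}")
    # Reject requests larger than the instance can provide.
    if wrfcores > MAX_CORES:
        raise ValueError(
            f"wrfcores={wrfcores} exceeds the {MAX_CORES} cores available"
        )
    # 1 keeps the original serial behavior; anything larger goes to the queue.
    return "serial" if wrfcores == 1 else "parallel"
```

With this sketch, `plan_wrf_run()` and `plan_wrf_run(1)` select the serial path, `plan_wrf_run(36)` selects the parallel path, and `plan_wrf_run(0)` or `plan_wrf_run(100)` raise `ValueError`.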
Expected Differences
Pull Request Testing
Ran tests with 1, 36, and 96 cores. 1 core defaulted to original (serial) behavior as expected (fails at real.exe due to lack of memory). 36 and 96 core tests worked as anticipated, with 96 cores running at ~20x realtime (0.9 wallclock seconds per 20s simulated time step). Attempted to run with 100 cores, and received the appropriate error message.
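As a quick check of the realtime figure quoted above, the ratio is just simulated seconds per step divided by wall-clock seconds per step. The function name below is made up for illustration.

```python
def realtime_ratio(sim_seconds_per_step: float, wall_seconds_per_step: float) -> float:
    """How many seconds of simulated time advance per wall-clock second."""
    return sim_seconds_per_step / wall_seconds_per_step


# Figures from the 96-core test: 20 s simulated per 0.9 s of wall clock.
ratio = realtime_ratio(20.0, 0.9)
print(f"{ratio:.1f}x realtime")  # prints "22.2x realtime"
```

That works out to a little over 22x, consistent with the "~20x" quoted for the 96-core run.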
Test procedure
Edit test.yml to the desired settings, especially testing the wrfcores value mentioned above.

See above instructions for running your own tests. As I mentioned, the existing commands result in an out-of-memory condition.
[ ] Do these changes include sufficient documentation updates, ensuring that no errors or warnings exist in the build of the documentation? [Yes or No]
[ ] Do these changes include sufficient testing updates? [Yes or No]
[ ] Will this PR result in changes to the test suite? [Yes or No] If yes, describe the new output and/or changes to the existing output:
[ ] Please complete this pull request review by [Fill in date].
Pull Request Checklist