environmental-forecasting / model-ensembler

Model Ensemble tool for batch workflows on HPCs
https://pypi.org/project/model-ensembler/
MIT License

Add a repeat parameter to the ensembler? #26

Closed CRosieWilliams closed 2 years ago

CRosieWilliams commented 2 years ago

Is it possible to add a repeat parameter into the batch configuration for the ensembler?

At the moment, as I understand it, the ensembler runs through every ensemble member (job) once, keeping the number of running jobs <= maxjobs, and only resubmits the whole batch once it has got through all the jobs. If it could resubmit jobs that have timed out (but not finished) before the first pass through the batch completes, this would save some time: at present it sometimes ends up with just a few jobs (< maxjobs) running before it resubmits the whole lot again. Keeping maxjobs jobs running at all times, where possible, would be great.

JimCircadian commented 2 years ago

We need to ensure that failing jobs are not resubmitted on the next pass; otherwise the ensemble just continually prods SLURM with useless jobs.
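
For illustration only, here is a minimal sketch of the scheduling behaviour discussed in this issue: free slots are refilled as soon as a job ends, timed-out jobs go back on the queue, and failed jobs are never resubmitted. It is not the model-ensembler implementation and does not talk to SLURM. The names `Outcome`, `run_ensemble`, `run_once`, `max_repeats` and `demo_run_once` are all hypothetical, and "polling" is simulated by resolving the oldest running job.

```python
from collections import deque
from enum import Enum


class Outcome(Enum):
    FINISHED = "finished"
    TIMED_OUT = "timed_out"
    FAILED = "failed"


def run_ensemble(job_ids, run_once, max_jobs, max_repeats):
    """Keep up to max_jobs in flight; requeue timed-out jobs, never failed ones.

    run_once(job_id, attempt) is a hypothetical callback standing in for
    "submit the job and wait for it to end"; it returns an Outcome.
    """
    pending = deque((job_id, 0) for job_id in job_ids)  # (job, resubmissions so far)
    running = []
    results = {}
    submissions = []  # submission order, kept only for inspection

    while pending or running:
        # Refill free slots before waiting on anything.
        while pending and len(running) < max_jobs:
            job_id, attempt = pending.popleft()
            submissions.append(job_id)
            running.append((job_id, attempt))

        # Stand-in for polling the scheduler: resolve the oldest running job.
        job_id, attempt = running.pop(0)
        outcome = run_once(job_id, attempt)

        if outcome is Outcome.TIMED_OUT and attempt < max_repeats:
            # Timed out but not finished: requeue immediately.
            pending.append((job_id, attempt + 1))
        else:
            # Finished, failed, or out of repeats: record and do not resubmit.
            results[job_id] = outcome

    return results, submissions


def demo_run_once(job_id, attempt):
    """Toy outcomes: 'b' needs three attempts, 'c' fails, 'd' never finishes."""
    if job_id == "c":
        return Outcome.FAILED
    if job_id == "d":
        return Outcome.TIMED_OUT
    if job_id == "b" and attempt < 2:
        return Outcome.TIMED_OUT
    return Outcome.FINISHED


results, submissions = run_ensemble("abcd", demo_run_once, max_jobs=2, max_repeats=2)
```

In this toy run the failing job `c` is submitted exactly once, while `b` and `d` are requeued as soon as they time out rather than waiting for the whole batch to complete. The `max_repeats` cap stops a job that always times out from being resubmitted forever.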