SURGroup / UQpy

UQpy (Uncertainty Quantification with python) is a general-purpose Python toolbox for modeling uncertainty in physical and mathematical systems.
MIT License

Launch tiled multi-core jobs in RunModel #201

Closed: shellshocked2003 closed this 1 year ago

shellshocked2003 commented 1 year ago

Tiled parallel jobs in RunModel

Description

This pull request adds a cluster execution model to RunModel. The user provides a cluster- and scheduler-specific script that uses the appropriate commands to launch the computationally intensive portions of the simulation in parallel on the cluster. The example provided in the documentation for this pull request shows how multi-core jobs can be tiled over available resources using a bash for-loop.
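The tiling idea can be sketched in Python, independent of any scheduler. This is a minimal illustration, not the code from this pull request: the function name `tile_jobs` and its parameters are hypothetical, and it only plans which jobs share the allocation at the same time. The launch command itself would still come from the user's cluster-specific script.

```python
from typing import List, Tuple


def tile_jobs(n_jobs: int, cores_per_job: int,
              total_cores: int) -> List[List[Tuple[int, range]]]:
    """Plan how multi-core jobs tile over a fixed pool of cores.

    Hypothetical helper (not part of UQpy). Returns a list of "waves";
    each wave is a list of (job_index, core_range) pairs that can run
    side by side without oversubscribing the pool.
    """
    if cores_per_job <= 0:
        raise ValueError("cores_per_job must be positive")
    if total_cores < cores_per_job:
        raise ValueError("total_cores must be at least cores_per_job")

    # Number of jobs that fit side by side in the allocation.
    jobs_per_wave = total_cores // cores_per_job

    waves = []
    for start in range(0, n_jobs, jobs_per_wave):
        stop = min(start + jobs_per_wave, n_jobs)
        wave = []
        for slot, job in enumerate(range(start, stop)):
            first_core = slot * cores_per_job
            wave.append((job, range(first_core, first_core + cores_per_job)))
        waves.append(wave)
    return waves
```

For example, five 4-core jobs on an 8-core allocation produce three waves: two waves of two jobs each, then a final wave with the one remaining job. A bash for-loop in a launch script would play the same role as the inner loop here, starting each job in a wave in the background and waiting before the next wave.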

Related Issue

#200

Motivation and Context

This capability gives users more freedom in how resources are used on different HPC systems, and it substantially broadens the scale of analyses that can be performed with the RunModel module.

How Has This Been Tested?

This new execution model was tested both locally on Ubuntu 20.04 and on the Savio Condo Cluster at UC Berkeley. Savio uses the Slurm scheduler, so the Python script that initiates the RunModel workflow is called from the Slurm batch script.
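Because the same Python driver runs both locally and inside a Slurm batch job, it may help for the driver to discover how many cores it can tile over. The sketch below is an assumption about how that could be done, not the approach taken in this pull request: `available_cores` is a hypothetical helper, and it relies on Slurm exporting `SLURM_NTASKS` when a task count is requested for the job.

```python
import os
from typing import Mapping, Optional


def available_cores(env: Optional[Mapping[str, str]] = None) -> int:
    """Return the number of cores to tile jobs over.

    Hypothetical helper (not part of UQpy). Inside a Slurm job that
    requested a task count, SLURM_NTASKS holds that count; otherwise
    fall back to the local machine's CPU count.
    """
    env = os.environ if env is None else env
    ntasks = env.get("SLURM_NTASKS")
    if ntasks is not None:
        return int(ntasks)
    # Local run (for example a workstation): use whatever the OS reports.
    return os.cpu_count() or 1
```

Passing a plain dictionary as `env` makes the function easy to check without a cluster, for instance `available_cores({"SLURM_NTASKS": "40"})` returns 40.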

dimtsap commented 1 year ago

This pull request has already been merged into Development. It will be closed because it targets master.