Closed fossell closed 1 year ago
WRF support team general guidelines and has an FAQ and python script to help determine number of cores.
A good STARTING PLACE is to use the following equations:
For your smallest-sized domain: ((e_we)/25) * ((e_sn)/25) = most amount of processors you should use
For your largest-sized domain: ((e_we)/100) * ((e_sn)/100) = least amount of processors you should use
and then play around with it from there to see if you can find a good balance for the domain set-up you’re using, checking the decomposition and number of tiles. Keep in mind this is just a rule-of-thumb, so you may be able to pick something at the far end of one of those 2 values, or somewhere right in the middle. You may also be able to outside those boundaries, depending on the decomposition. Each run is a little bit different.
Python script from WRF support to help determine number of cores.
# This script finds the largest number of processors and nodes
# you can use, based on the number of grid points in the i/j directions
# on your domain.
# Note: The largest number may not decompose the best way. If you want
# additional values, set some print statements in the code below
# enter the namelist values of e_we and e_sn
e_we = 200
e_sn = 250
# number of cores you want to use per node (Cheyenne has a max of 36/node)
cores = 36
# The value for 'cores' gets incremented later, so we want a static variable for the original value
cores_orig = cores
# set upper limit of nodes - the max you want to loop through
node_max = 200
# This is the least number of grid points allowed for each processor.
# Dont' change this value.
smallest_size = 10
x = 1
while x <= node_max:
# finds the factor pairs for the total number of cores
def f(cores):
factors = []
for i in range(1, int(cores**0.5)+1):
if cores % i == 0:
factors.append((i, cores/i ))
return factors
factors = f(cores)
# Of the factor pairs, this finds the closest values (pair) in that array
closest_factors = factors[-1]
# Of the set of closest values, assign the i and j values
i_array_value = closest_factors[0]
j_array_value = closest_factors[-1]
# Calculate how the domain will be decomposed
e_we_decomp = int(e_we / i_array_value )
e_sn_decomp = int(e_sn / j_array_value )
# Once the decomposition becomes smaller than the least number of grid points
# allowed for each processor, the loop will quit and display the max
# number of processors and nodes you can use for your domain.
if ((e_sn_decomp < smallest_size) or (e_we_decomp < smallest_size)):
# test to see if the max number of processors allowed is within the number for a single node
initial_factor_pair = factors[0]
initial_factor = initial_factor_pair[-1]
if initial_factor == cores_orig:
# start with value of cores_orig and decrease by 1 for each iteration
# until the value is allowed
y = cores_orig
while y >= 1:
processors = y
# finds the factor pairs for the total number of processors
# still testing processor values for a single node
def f(processors):
factors = []
for i in range(1, int(processors**0.5)+1):
if processors % i == 0:
factors.append((i, processors/i ))
return factors
factors = f(processors)
# Of the factor pairs, this finds the closest values (pair) in that array
# still testing processor values for a single node
closest_factors = factors[-1]
# Of the set of closest values, assign the i and j values
# still testing processor values for a single node
i_array_value = closest_factors[0]
j_array_value = closest_factors[-1]
# Calculate how the domain will be decomposed
# still testing processor values for a single node
e_we_decomp = int(e_we / i_array_value )
e_sn_decomp = int(e_sn / j_array_value )
# Once the decomposition becomes larger or equal to the least number of grid points
# allowed for each processor, the loop will quit and display the max
# number of processors and nodes you can use for your domain.
if ((e_sn_decomp >= smallest_size) and (e_we_decomp >= smallest_size)):
max_procs = (i_array_value * j_array_value)
print "max # of processors that can be used is: ", max_procs
print "max # of nodes that can be used is 1 "
break
# if you haven't reached your limit, the loop continues
# still testing processor values for a single node
else:
y -= 1
# if the size of the domain allows multiple nodes
else:
max_procs = (i_array_value * j_array_value) - cores_orig
max_nodes = (max_procs / cores_orig)
print "max # of processors that can be used is: ", max_procs
print "max # of nodes that can be used is: ", max_nodes
break
# If you haven't reached your limit, the loop continues
x += 1
cores = (cores+cores_orig)
It seems like we could implement a computation for a starting point or first guess at core needs.
UI will have "auto compute" set/checked by default. Currently that is set to 96 cores, but placeholder exists to add in code to auto compute based on namelist.wps settings. In the meantime, users can uncheck the auto compute and select a number of cores, using the UG documentation as guidance. Issue #141 details the adding of code to autocompute.
Closed with #174 .
Describe the Enhancement
New model configurations may fail with WRF and UPP if over/under decomposed. Can't always use 96 cores by default, especially with small coarse domains. Users will be given the option to custom set the number of cores on the UI when creating a new model configuration. Do we let the forecast fail and then change the cores? Do we try an auto-compute function? How do we determine/guide this selection?
Time Estimate
2 days
Sub-Issues
Consider breaking the enhancement down into sub-issues.
Relevant Deadlines
List relevant project deadlines here or state NONE.
Define the Metadata
Assignee
Labels
Projects and Milestone
Enhancement Checklist
feature_<Issue Number>/<Description>
feature <Issue Number> <Description>