Closed rburghol closed 6 months ago
Performance test: integer-keyed versus character-keyed STATE Dict. Tests performed for a 50-year, 1-hour timestep simulation.
Character-keyed ts Dict: 0.1693 to 0.2163 seconds
Integer-keyed ts Dict: 0.0866 to 0.1266 seconds
Character-keyed state Dict: 0.1407 to 0.1451 seconds
Integer-keyed state Dict: 0.0428 to 0.0434 seconds
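To put the timings in proportion, the short calculation below compares the fastest character-keyed run with the fastest integer-keyed run for each case. The ranges are copied from the run-time comments in the test script; the `timings` layout and the `speedup` helper are illustrative names only, and comparing the fastest runs is just one reasonable choice of summary.

```python
# Run-time ranges in seconds, copied from the comments in the test script.
timings = {
    "ts": {"character": (0.1693, 0.2163), "integer": (0.0866, 0.1266)},
    "state": {"character": (0.1407, 0.1451), "integer": (0.0428, 0.0434)},
}


def speedup(case):
    """Ratio of the fastest character-keyed run to the fastest integer-keyed run."""
    return min(timings[case]["character"]) / min(timings[case]["integer"])


for case in timings:
    print(f"{case}: integer keys about {speedup(case):.2f}x faster")
```

On these numbers the integer-keyed Dict is roughly twice as fast for the ts arrays and a little over three times as fast for the scalar state values.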
import time
from math import sin, cos
from numba import njit, types
from numba.typed import Dict
from numpy import zeros

steps = 365 * 24 * 50  # 50-year hourly simulation
# Character-indexed
ts = Dict.empty(key_type=types.unicode_type, value_type=types.float64[:])
state = Dict.empty(key_type=types.unicode_type, value_type=types.float64)
# set up some base data
ts['/RESULTS/RCHRES_001/SPECL/Qin'] = zeros(steps)
ts['/RESULTS/RCHRES_001/SPECL/Qlocal'] = zeros(steps)
ts['/RESULTS/RCHRES_001/SPECL/Qout'] = zeros(steps)
state['/RESULTS/RCHRES_001/SPECL/Qin'] = 0.0
state['/RESULTS/RCHRES_001/SPECL/Qlocal'] = 0.0
state['/RESULTS/RCHRES_001/SPECL/Qout'] = 0.0
# Integer-indexed
ts_ix = Dict.empty(key_type=types.int64, value_type=types.float64[:])
state_ix = Dict.empty(key_type=types.int64, value_type=types.float64)
# set up some base data
ts_ix[1] = zeros(steps)
ts_ix[2] = zeros(steps)
ts_ix[3] = zeros(steps)
state_ix[1] = 0.0
state_ix[2] = 0.0
state_ix[3] = 0.0
@njit
def iterate_specl_ts(ts, step):
    ts['/RESULTS/RCHRES_001/SPECL/Qlocal'][step] = sin(step)
    ts['/RESULTS/RCHRES_001/SPECL/Qin'][step] = abs(cos(5 * step))
    ts['/RESULTS/RCHRES_001/SPECL/Qout'][step] = ts['/RESULTS/RCHRES_001/SPECL/Qlocal'][step] + ts['/RESULTS/RCHRES_001/SPECL/Qin'][step]
    return

# Force a compile
iterate_specl_ts(ts, 0)
@njit
def iterate_specl_ts_ix(ts_ix, step):
    ts_ix[1][step] = sin(step)
    ts_ix[2][step] = abs(cos(5 * step))
    ts_ix[3][step] = ts_ix[1][step] + ts_ix[2][step]
    return

# Force a compile
iterate_specl_ts_ix(ts_ix, 0)
@njit
def run_ts(ts, steps):
    for step in range(steps):
        iterate_specl_ts(ts, step)

# Force a compile so the timed run excludes compilation, as in the other tests
run_ts(ts, 1)
start = time.time()
run_ts(ts, steps)
end = time.time()
print(end - start, "seconds")
# Run times between 0.1693 and 0.2163 seconds
@njit
def run_ts_ix(ts_ix, steps):
    for step in range(steps):
        iterate_specl_ts_ix(ts_ix, step)

# Force a compile
run_ts_ix(ts_ix, 1)
start = time.time()
run_ts_ix(ts_ix, steps)
end = time.time()
print(end - start, "seconds")
# Run times between 0.0866 and 0.1266 seconds
@njit
def iterate_specl_state(state, step):
    state['/RESULTS/RCHRES_001/SPECL/Qlocal'] = sin(step)
    state['/RESULTS/RCHRES_001/SPECL/Qin'] = abs(cos(5 * step))
    state['/RESULTS/RCHRES_001/SPECL/Qout'] = state['/RESULTS/RCHRES_001/SPECL/Qlocal'] + state['/RESULTS/RCHRES_001/SPECL/Qin']
    return

# Force a compile
iterate_specl_state(state, 0)
@njit
def iterate_specl_state_ix(state_ix, step):
    state_ix[1] = sin(step)
    state_ix[2] = abs(cos(5 * step))
    state_ix[3] = state_ix[1] + state_ix[2]
    return

# Force a compile
iterate_specl_state_ix(state_ix, 0)
@njit
def run_state(state, steps):
    for step in range(steps):
        iterate_specl_state(state, step)

# Force a compile
run_state(state, 1)
start = time.time()
run_state(state, steps)
end = time.time()
print(end - start, "seconds")
# Run times between 0.1407 and 0.1451 seconds
@njit
def run_state_ix(state_ix, steps):
    for step in range(steps):
        iterate_specl_state_ix(state_ix, step)

# Force a compile
run_state_ix(state_ix, 1)
start = time.time()
run_state_ix(state_ix, steps)
end = time.time()
print(end - start, "seconds")
# Run times between 0.0428 and 0.0434 seconds
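The four timing blocks above repeat the same warm-up, start, run, stop pattern, and each "between X and Y" range implies several manual runs. A small helper could collect that range in one call. This is only a sketch: `time_runs` and the stand-in `busy` workload are hypothetical names, and `time.perf_counter` is used here in place of `time.time` because it is the standard-library clock intended for measuring short intervals.

```python
import time


def time_runs(fn, *args, repeats=5):
    """Call fn(*args) once untimed, then time `repeats` further calls.

    Returns (fastest, slowest) wall-clock durations in seconds.
    """
    fn(*args)  # warm-up call; for an @njit function this triggers compilation
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        durations.append(time.perf_counter() - start)
    return min(durations), max(durations)


# Stand-in workload so the sketch runs without numba installed.
def busy(n):
    total = 0
    for i in range(n):
        total += i
    return total


fastest, slowest = time_runs(busy, 10_000)
print(f"Run times between {fastest:.4f} and {slowest:.4f} seconds")
```

With the test script loaded, a call such as `time_runs(run_state_ix, state_ix, steps)` would time the integer-keyed state test the same way. Note that the warm-up here runs the full workload once, which is slower than the single-step warm-up used above.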
Done. Need to move to documentation. See module #126
Currently, hsp2 features domain-specific (i.e. a single RCHRES, PERLND, etc.) simulation and data access, with only the current segment data (i.e. the ts Dict, parameters, and internal calculation state) loaded and passed to the functional routine. To support the legacy SPEC-ACTIONS (#90) and potential new enhanced model modularity features, a robust data model is needed to facilitate passing STATE across multiple domains (segments) and amongst the various functions. This issue outlines a proposed data structure schema to facilitate this, including performance and functional considerations. (See also: #126)
Goals
STATE should allow calculations to be done on any publicly modifiable state variable (individual code/model domains must opt in to STATE sharing).
numba/@njit
STATE data structure does not guarantee that it modifies model behavior.
Benefits
Draft Data Model
Integer keyed STATE
state_paths is a 1-d, string-keyed Dict based on the hierarchical paths stored in the hdf5. Its keys are the full path to the hdf5 STATE, and its values point to the integer key for use in all other runtime Dicts. The key is generated automatically at the beginning of model loading and is static throughout the simulation.
STATE for RCHRES 1, IVOL has an hdf5 path of /STATE/RCHRES_R001/IVOL
/STATE/RCHRES_R001/IVOL is the 25th item added to the STATE during model loading, and thus its integer index is 25
/STATE/RCHRES_R001/IVOL is found in state_paths, such that state_paths['/STATE/RCHRES_R001/IVOL'] = 25
state_ix: Dict that holds scalar numeric state values. It is an integer-keyed version of hdf5.
RCHRES 1, IVOL state value is 1803.5
The integer key for the IVOL in RCHRES 1 variable is 25
state_ix[25] = 1803.5
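The relationship between state_paths and state_ix described above can be sketched as follows. This is an illustration under stated assumptions, not the hsp2 implementation: plain Python dicts stand in for numba typed Dicts, the helper name `register_state` and the `/STATE/PLACEHOLDER/...` paths are hypothetical, and indices are assumed to be 1-based and assigned in insertion order so that the 25th item receives index 25.

```python
# Sketch only: plain dicts stand in for numba.typed.Dict.

def register_state(state_paths, state_ix, path, value=0.0):
    """Return the integer key for `path`.

    A new path gets the next integer key and its initial value is stored;
    an already registered path keeps its key and its current value.
    """
    if path not in state_paths:
        ix = len(state_paths) + 1  # assumption: 1-based, insertion order
        state_paths[path] = ix
        state_ix[ix] = float(value)
    return state_paths[path]


state_paths = {}  # hdf5 path -> integer key
state_ix = {}     # integer key -> scalar state value

# Register 24 hypothetical placeholder variables so the next one is the 25th.
for n in range(1, 25):
    register_state(state_paths, state_ix, f"/STATE/PLACEHOLDER/VAR{n}")

ivol_ix = register_state(state_paths, state_ix, "/STATE/RCHRES_R001/IVOL", 1803.5)
print(ivol_ix, state_ix[ivol_ix])
```

Because the key is assigned once and never changes, any routine can hold on to `ivol_ix` for the whole simulation and read or write `state_ix[ivol_ix]` without touching the path string again.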
dict_ix is an integer-keyed Dict to store values of array/matrix objects.
ts_ix - Dict of timeseries data (TBD: may be redundant to dict_ix, given that all ts data can be keyed via an hdf5 path)
Concepts
hsp2 adoption of hdf5
/STATE/RCHRES_R001/IVOL
/STATE/RCHRES_R002/IVOL
hsp2 model structure executes all timesteps for each spatial domain at one time
hsp2 routines, however, base functions are fairly amenable to extracting timestep loop code.
A character-keyed STATE Dict showed a large performance hit compared with an integer-keyed STATE Dict. This may be due to the way in which the character-keyed Dict was set up. Testing code to be included in this issue soon.
Implementation
numba compatibility requires separate storage for
Integer-indexed Dict may perform faster than character-indexed Dict; therefore, this should be optimized (see example below).
Data Structure Option 1 - Character keyed STATE
state_ix: Holds numeric state values. It is a single-dimension, hdf5 path keyed Dict.
dict_ix: Holds values of array/matrix objects. It is a multi-dimensional, hdf5 path keyed Dict.
ts_ix - Holds values of timeseries data (TBD: may be redundant to dict_ix, given that all ts data can be keyed via an hdf5 path)
Data Structure Option 2
See above Draft Data Model
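One way to read the difference between the two options: Option 1 looks up the hdf5 path string on every access, while Option 2 resolves each path to its integer key once and uses only integers inside the timestep loop. The sketch below shows the Option 2 pattern on the same Qlocal/Qin/Qout calculation used in the performance test. Plain dicts stand in for numba typed Dicts, the `resolve` and `run` helpers are hypothetical names, and the index values are illustrative.

```python
from math import cos, sin

# Illustrative contents; in the draft data model these are built at model load.
state_paths = {
    "/RESULTS/RCHRES_001/SPECL/Qlocal": 1,
    "/RESULTS/RCHRES_001/SPECL/Qin": 2,
    "/RESULTS/RCHRES_001/SPECL/Qout": 3,
}
state_ix = {1: 0.0, 2: 0.0, 3: 0.0}


def resolve(state_paths, *paths):
    """Hypothetical helper: map hdf5 paths to integer keys once, up front."""
    return tuple(state_paths[p] for p in paths)


def run(state_ix, keys, steps):
    """Timestep loop that touches state only through integer keys."""
    qlocal, qin, qout = keys
    for step in range(steps):
        state_ix[qlocal] = sin(step)
        state_ix[qin] = abs(cos(5 * step))
        state_ix[qout] = state_ix[qlocal] + state_ix[qin]


keys = resolve(
    state_paths,
    "/RESULTS/RCHRES_001/SPECL/Qlocal",
    "/RESULTS/RCHRES_001/SPECL/Qin",
    "/RESULTS/RCHRES_001/SPECL/Qout",
)
run(state_ix, keys, 10)
print(keys, state_ix[keys[2]])
```

The path strings stay available for configuration and output, while the loop body pays only the cost of integer lookups.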