Update -> some promising results! @aufdenkampe @imscw95 @kewalak
I took the time to do some experimentation on the core "computation engine"; this work currently lives in a notebook on a separate branch but will be merged in. By "computation engine" I am referring to the part of the code that applies all calculations, regardless of which module is being used.
So far I tested three variants (see the sketch after this list):

- **V1 (existing):** `xr.apply_ufunc` applied directly.
- **V2:** `xr.apply_ufunc` inside a function that is JIT compiled with `@numba.jit(forceobj=True)`. `forceobj=True` dramatically reduces the efficacy of JIT.
- **V3:** `map()` to create an iterable that contains everything necessary to update a timestep, but as numpy arrays, NOT `xr.DataArray`s. Note that I protect against excess memory usage by instantiating a lazy `map`-type iterable, therefore all the input arrays are not pre-calculated. `nopython=True` mode can't handle "heterogeneous list" inputs.

**Results:**
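For concreteness, here is a minimal sketch of the three variants. The kernel `compute_cell`, the variable names `"a"`/`"b"`, and the function names are all hypothetical stand-ins, not the actual module code:

```python
import numba
import numpy as np
import xarray as xr


# Hypothetical per-cell kernel standing in for a module's update equations.
def compute_cell(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    return a * 0.5 + b


# V1 (existing): apply the kernel through xarray directly.
def v1_apply_ufunc(ds: xr.Dataset) -> xr.DataArray:
    return xr.apply_ufunc(compute_cell, ds["a"], ds["b"])


# V2: the same call wrapped in a JIT-compiled function. xarray objects are
# plain Python objects, so numba must run in object mode (forceobj=True),
# which largely defeats the purpose of JIT compilation.
@numba.jit(forceobj=True)
def v2_jit_apply_ufunc(ds):
    return xr.apply_ufunc(compute_cell, ds["a"], ds["b"])


# V3: build a lazy map() iterable holding everything needed for one
# timestep as plain numpy arrays. Because map() evaluates lazily, the
# input arrays are not all pre-calculated/materialized at once.
def v3_map_numpy(ds: xr.Dataset) -> np.ndarray:
    arrays = map(lambda name: ds[name].to_numpy(), ["a", "b"])
    return compute_cell(*arrays)
```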
As shown below, the new V3 approach is nearly 50% faster than the existing method (17 ms shaved off), while V2 did not speed things up at all. We can also see that the `iter_computation()` call within `increment_timestep` is responsible for 76% of the run time. Therefore our ~50% reduction in `iter_computation()` should account for roughly a 36% reduction in overall timestep run time (0.76 × ~0.48 ≈ 0.36).
Next steps: I am going to implement V3 in the main modules, check that all tests pass, and if so merge into the main branch.
@xaviernogueira, thank you for doing this performance profiling and playing with computational improvements to find this valuable performance boost!
Once we get closer to having the sub-modules completed, we can focus on improving the performance of the core `base.Model` code. Since this code is inherited, any benefits are passed down to all sub-modules. Some experimenting can/will be done to see if we can squeeze out `time_step` performance; model init performance is less relevant. We can use `snakeviz` with `cProfile` to assess any rate-limiting computational steps (minimal sketch below).
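As a minimal sketch of that workflow (`run_model` is a hypothetical placeholder for the actual model entry point):

```python
import cProfile
import pstats

from my_model import run_model  # hypothetical entry point

# Profile a full model run and dump the stats to disk.
with cProfile.Profile() as profiler:
    run_model()
profiler.dump_stats("timestep.prof")

# Print the ten most expensive calls by cumulative time...
pstats.Stats("timestep.prof").sort_stats("cumulative").print_stats(10)

# ...or explore the same file interactively in the browser:
#   $ pip install snakeviz
#   $ snakeviz timestep.prof
```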