This is a testbed for system identification and forecasting of dynamical systems using the Hankel Alternative View of Koopman (HAVOK) algorithm and Sparse Identification of Nonlinear Dynamics (SINDy). This code is based on the work by Brunton & Kutz (2022) and Yang et al. (2022).
The following performance optimization strategies have been employed in this code:
While for-loops are often more intuitive, most of the code has been vectorized to improve performance.
Parts of the code that perform large matrix operations have been adapted to use GPU computing when it is available.
The HAVOK and SINDy steps of model training are fast, regression-based procedures. Training the ML component, however, can be extremely fast or extremely slow depending on the chosen method. For example, Bootstrap Aggregation and Neural Networks train in parallel and are thus relatively fast for the typical sample sizes provided, whereas methods such as Boosting and Random Forests are quite slow to train, partly due to a lack of parallelization.
Performance also depends strongly on the size of the data $x$ and on the parameters stackmax and rmax. The choice of these parameters also affects how well the ML model is fitted and how long it takes to find a good fit.
Because forecasting is largely state-dependent (each step depends on the previous state), we cannot parallelize much of this code. We may be able to leverage parallel computing for multivariable data; however, this is not currently implemented.
The main performance bottleneck is ML prediction during forecasting. MATLAB's current prediction routines for Bagging, Boosting, Random Forests, Neural Networks, etc. are extremely slow at single-step prediction. To alleviate this, we have included a C++ implementation of Random Forest regression, linked to MATLAB through MEX files. Ideally, we would eventually develop optimized C++ code for all the ML methods with MATLAB Coder and link it via MEX, but this may take some time.
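Two of the strategies above, vectorization and optional GPU offloading, can be illustrated with a minimal sketch. It assumes the delay-embedding layout used by `HankelMatrix` (`stackmax` rows, `n - stackmax` columns); the signal, the sizes, and the truncation rank `r` are illustrative values, not settings taken from this code. `canUseGPU` requires R2019b or later, and `gpuArray` requires Parallel Computing Toolbox.

```matlab
% Sketch: loop-based vs. vectorized Hankel construction, followed by an
% economy-size SVD that runs on the GPU only when one is available.
t = 1:5000;
x = sin(0.01*t) + 0.5*cos(0.037*t);  % illustrative signal (assumption)
stackmax = 50;                       % number of delay rows (illustrative)
r = 4;                               % truncation rank (illustrative)
n = numel(x);
m = n - stackmax;                    % number of columns, as in HankelMatrix

% Loop version: fill one delay row per iteration
Hloop = zeros(stackmax, m);
for k = 1:stackmax
    Hloop(k, :) = x(k:k+m-1);
end

% Vectorized version: build every index at once with implicit expansion
ij = (1:m) + (0:stackmax-1)';
Hvec = x(ij);

sameResult = isequal(Hloop, Hvec);   % the two constructions should agree

% Optional GPU offload: move the matrix to the device only if possible
useGPU = canUseGPU();
if useGPU
    Hdev = gpuArray(Hvec);
else
    Hdev = Hvec;
end

[U, S, V] = svd(Hdev, 'econ');
if useGPU
    % Bring the factors back to host memory
    U = gather(U); S = gather(S); V = gather(V);
end

% Keep the leading r modes
Ur = U(:, 1:r);
Sr = S(1:r, 1:r);
Vr = V(:, 1:r);
```

The fallback branch keeps the same code path on machines without a supported GPU, so results can be compared between the two configurations.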
Unit Tests
The subcomponents of this code are quite simple and required minimal testing to verify their correctness. The functions will throw an error if the wrong type of input is provided, as specified by the arguments block in each function.
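As a rough illustration of this kind of input checking, the sketch below adds an `arguments` block to a copy of the Hankel routine. The function name `HankelMatrixChecked` and the specific validators are assumptions made for the example, not the validation actually used in this code. `arguments` blocks require R2019b or later.

```matlab
function H = HankelMatrixChecked(x, stackmax)
%HankelMatrixChecked Hypothetical variant of HankelMatrix with input validation
arguments
    x (1,:) {mustBeNumeric}                          % numeric vector
    stackmax (1,1) {mustBeInteger, mustBePositive}   % scalar number of delay rows
end
n = numel(x);
% Same index construction as HankelMatrix
ij = (1:n-stackmax) + (0:(stackmax-1))';
H = x(ij);
end
```

With these validators, a call such as `HankelMatrixChecked('abc', 3)` or `HankelMatrixChecked(1:10, 2.5)` raises a validation error before any indexing takes place.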
We check the performance of the units using MATLAB's Performance Testing Framework. This framework systematically tests each function's execution time, robustness to input noise, etc.
The tests are performed by constructing a test file, which looks like:
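A minimal sketch of such a file, assuming a script-based test in which each `%%` section becomes one named test (so a section titled `HankelMatrix` in `preallocationTest.m` is reported as `preallocationTest/HankelMatrix`). The input sizes and the size check are illustrative assumptions, not the contents of the actual test file.

```matlab
% tests/preallocationTest.m (illustrative sketch, not the original file)
% Code before the first section is shared setup for every test section.
x = randn(1, 10000);    % illustrative input signal (assumption)
stackmax = 100;         % illustrative number of delay rows (assumption)

%% HankelMatrix
H = HankelMatrix(x, stackmax);
assert(isequal(size(H), [stackmax, numel(x) - stackmax]))
```

Passing this file to `runperf` runs the section repeatedly and collects timing samples for it.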
This testing framework is quite sensitive, as we can see by including a tiny probability (a 0.0001% chance per call) of an error occurring in the code:
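function H = HankelMatrix(x,stackmax)
%HankelMatrix Arrange a Hankel matrix
n = length(x);
% Row k holds x(k : k+n-stackmax-1); implicit expansion builds all indices at once
ij = (1:n-stackmax) + (0:(stackmax-1))';
H = x(ij);
% Deliberately injected fault: errors with probability 1e-6 per call
if rand > 0.999999
    error("something went wrong!")
end
end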
runperf("tests/preallocationTest.m")
Done preallocationTest
__________
Failure Summary:
Name                            Failed  Incomplete  Reason(s)
===============================================================
preallocationTest/HankelMatrix    X         X       Errored.
results =
TimeResult with properties:
Name: 'preallocationTest/HankelMatrix'
Valid: 0
Samples: [49×7 table]
TestActivity: [54×12 table]
Totals:
0 Valid, 1 Invalid.
3.4134 seconds testing time.
Regression Tests
Integration Tests
Coverage
We check that all parts of the code are covered using MATLAB's Code Coverage Plugin:
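A minimal sketch of such a coverage run, assuming the tests live in a `tests` folder and the source files sit in the current folder. Both locations are assumptions made for the example and should be adjusted to the actual repository layout.

```matlab
import matlab.unittest.TestRunner
import matlab.unittest.plugins.CodeCoveragePlugin

% Collect every test found in the (assumed) tests folder
suite = testsuite("tests");

% Text-output runner with coverage tracking for the (assumed) source folder
runner = TestRunner.withTextOutput;
runner.addPlugin(CodeCoveragePlugin.forFolder(pwd));

% Run the suite; a coverage report is produced once the run finishes
results = runner.run(suite);
```

Lines that no test reaches are flagged in the resulting report, which points to functions or branches that still need a test.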