I found that, with just a little work, Python's Numba gave a 3x to 50x speedup over the base implementation.
At a quick glance, it looks like it may be quite easy to apply to the remaining unoptimized parts of our script,
namely the big "running DT" nested loop (for j and for k).
Suggested by John David.
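As a rough sketch of what applying Numba to a nested j/k loop might look like: the function below is a hypothetical stand-in (a per-row running sum), not the actual "running DT" code, and the names running_total and grid are made up for illustration. The import fallback is only there so the sketch still runs, uncompiled, where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption: fall back to plain Python so the sketch still runs without Numba.
    def njit(func):
        return func


@njit
def running_total(grid):
    # Hypothetical placeholder for the nested loop: running sum along each row.
    n_j, n_k = grid.shape
    out = np.empty_like(grid)
    for j in range(n_j):
        acc = 0.0
        for k in range(n_k):
            acc += grid[j, k]
            out[j, k] = acc
    return out


grid = np.arange(6, dtype=np.float64).reshape(2, 3)
result = running_total(grid)
```

Note that the first call includes Numba's compile time, so any timing comparison should be done on a second call. How much of the 3x to 50x range carries over will depend on what the real loop body does.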