Added `fast_matmul_matmul_2x2()` for chained `matmul`s that broadcasts correctly, and changed `GaussianImageGalaxy.gaussians` to a 1D structure at @bd-j's suggestion. I'm using a list instead of a NumPy array (since the Numba code actually produces a list of jitclass instances), but that could be changed back to an ndarray of objects pretty easily if needed.
This speeds up `convert_to_gaussians()` slightly, from 104 µs to 90 µs.
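
For anyone wanting to see what "chained 2x2 matmul that broadcasts" means here, below is a minimal plain-NumPy sketch. It is not the code in this PR: the names `matmul_2x2_sketch` and `matmul_matmul_2x2_sketch` are hypothetical, it assumes inputs shaped `(..., 2, 2)`, and it leaves out the Numba side entirely. It only shows the idea of writing out the four elements by hand so the leading axes broadcast like `np.matmul`.

```python
import numpy as np


def matmul_2x2_sketch(a, b):
    """Multiply stacks of 2x2 matrices of shape (..., 2, 2).

    Hypothetical helper: leading axes broadcast the same way as np.matmul.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Broadcast only the leading (stack) axes, not the trailing 2x2 block.
    lead = np.broadcast_shapes(a.shape[:-2], b.shape[:-2])
    out = np.empty(lead + (2, 2))
    # Write out each element of the 2x2 product explicitly.
    out[..., 0, 0] = a[..., 0, 0] * b[..., 0, 0] + a[..., 0, 1] * b[..., 1, 0]
    out[..., 0, 1] = a[..., 0, 0] * b[..., 0, 1] + a[..., 0, 1] * b[..., 1, 1]
    out[..., 1, 0] = a[..., 1, 0] * b[..., 0, 0] + a[..., 1, 1] * b[..., 1, 0]
    out[..., 1, 1] = a[..., 1, 0] * b[..., 0, 1] + a[..., 1, 1] * b[..., 1, 1]
    return out


def matmul_matmul_2x2_sketch(a, b, c):
    """Chained product a @ b @ c for stacks of 2x2 matrices (hypothetical)."""
    return matmul_2x2_sketch(matmul_2x2_sketch(a, b), c)


# Mixed leading shapes: (3, 1), (), and (1, 4) broadcast to (3, 4).
rng = np.random.default_rng(0)
a = rng.normal(size=(3, 1, 2, 2))
b = rng.normal(size=(2, 2))
c = rng.normal(size=(1, 4, 2, 2))
result = matmul_matmul_2x2_sketch(a, b, c)
```

The result can be checked against `a @ b @ c`, which should agree to floating-point tolerance.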