wukan1986 closed 3 months ago
I changed the algorithm, and it now runs about as fast as NumPy.
I think we can still speed it up further for the case of a single X column. I will keep you updated.
It turns out that short-cutting the single-variable case is not worth it, especially when nulls are present. Anyway, lstsq speed is now on par with NumPy and even more stable on my machine.
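For readers wondering what a single-variable short-cut could look like, here is a minimal sketch in plain NumPy. It is not the library's implementation: it assumes a model with no intercept, treats NaN as the null marker, and drops incomplete rows before solving. All variable names are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1_000)
y = 3.0 * x + rng.normal(scale=0.1, size=1_000)
x[::50] = np.nan  # simulate nulls in the single predictor

# Nulls must be masked out first, which adds a pass over the data
# before any closed-form formula can be applied.
mask = ~(np.isnan(x) | np.isnan(y))
xc, yc = x[mask], y[mask]

# Closed form for one column without intercept: beta = (x . y) / (x . x)
beta_closed = float(xc @ yc / (xc @ xc))

# General solver on the same cleaned data, for comparison
beta_lstsq = float(np.linalg.lstsq(xc[:, None], yc, rcond=None)[0][0])

print(beta_closed, beta_lstsq)
```

Both paths need the same null filtering first, which may be part of why the special case buys little in practice.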
FYI, I also switched to fat LTO, which may improve performance slightly but makes compilation take much longer. If you build locally, be aware of this.
100k rows, 10 predictive variables
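If anyone wants to reproduce the NumPy side of this comparison, a rough sketch follows. It only times `np.linalg.lstsq` on synthetic data of the size mentioned above; the data generation, seed, and repeat count are my own assumptions, and the other library's call is deliberately left out.

```python
import time
import numpy as np

rng = np.random.default_rng(42)
n_rows, n_features = 100_000, 10  # 100k rows, 10 predictive variables

# Synthetic design matrix and target with known coefficients (assumed setup)
X = rng.normal(size=(n_rows, n_features))
true_coef = np.arange(1, n_features + 1, dtype=float)
y = X @ true_coef + rng.normal(scale=0.01, size=n_rows)

def time_lstsq(repeats=5):
    """Return the fitted coefficients and the best wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        best = min(best, time.perf_counter() - start)
    return coef, best

coef, best = time_lstsq()
print(f"np.linalg.lstsq, best of 5: {best * 1e3:.2f} ms")
```

Timings depend heavily on the BLAS/LAPACK build that NumPy links against, so compare numbers only on the same machine.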
Great job!!!
np.linalg.lstsq is 7x faster than num.lstsq