Open 12rambau opened 3 years ago
The Python backend was created for validation of the GPU backend(s), hence performance was not a concern. However, I would like to get rid of sklearn dependency. I am thinking about substituting it with statsmodels since it is going to be used other places as well in the future, and I have also heard that it is a thinner wrapper around the numpy code.
As mentioned in the PR I think I can still improve the numpy backend so I'll continue working on it
By reading your code i realized that you were using the
sklearn
lib. But according to what I found in thedevelop
branch, you are not using the intersect parameter of theLinearRegression
method.https://github.com/diku-dk/bfast/blob/476eb8b09e94906f80057d832104e1ed6dc2293e/bfast/monitor/python/base.py#L230
It means that using
LInearRegression
all-together is useless as the optimisation problem have a closed solution given byI am proposing a change to the lib using only numpy to compute the coefficients. According to the tests I've performed on my local computer, I get the same results twice as fast (44s with sklearn, 23 without).
It will also reduce the number of dependencies which is alway important.