Open config-i1 opened 6 years ago
After some experiments and digging: (2) seems to be pretty fast already, even beating C++ code. (3) can be optimised via Cholesky decomposition (318741f0439d196ba91d2e258a069bd3d8a99d63).
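For reference, the kind of saving Cholesky gives here: the normal equations can be solved with `chol()` plus two triangular solves instead of a general-purpose `solve()`. A minimal sketch with simulated data (generic, not the package's actual code):

```r
set.seed(42)
X <- cbind(1, matrix(rnorm(200), 100, 2))   # design matrix with intercept
y <- X %*% c(1, 2, -1) + rnorm(100)
XtX <- crossprod(X)        # t(X) %*% X without an explicit transpose
Xty <- crossprod(X, y)
U <- chol(XtX)             # upper-triangular factor: XtX = t(U) %*% U
b <- backsolve(U, forwardsolve(t(U), Xty))  # two cheap triangular solves
max(abs(b - solve(XtX, Xty)))               # essentially zero
```

Since `XtX` is symmetric positive definite, the factorisation is cheaper than a general LU decomposition and numerically stable.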
The only problem that is left is (1).
vcov is now calculated only when the user asks for it. So (1) can be sort of ignored.
But another problem is that alm() is still slower than lm() even for distribution="dnorm". It seems that this is because of the slow distribution functions that we use for the likelihood calculation (e.g. dnorm()). And the optimisation does not make things fast, but it needs to be there so that we get better estimates of parameters (especially for the weird distributions).
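To illustrate the overhead of the generic density functions: for the normal case the log-likelihood can also be computed directly from the closed-form expression, bypassing the per-call argument handling of `dnorm()`. A minimal sketch (an illustration, not code taken from alm()):

```r
set.seed(1)
x <- rnorm(1e5, mean = 2, sd = 3)
n <- length(x)
# generic density function, with all its argument checks
ll1 <- sum(dnorm(x, mean = 2, sd = 3, log = TRUE))
# the same log-likelihood written out in closed form
ll2 <- -n / 2 * log(2 * pi * 3^2) - sum((x - 2)^2) / (2 * 3^2)
all.equal(ll1, ll2)   # TRUE
```

The two agree to numerical precision; in an optimisation loop the closed-form version saves the repeated vectorised calls into `dnorm()`.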
Create parameter that allows skipping all the checks in alm(). This might speed up the process and simplify some calculations.
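A sketch of what such a switch might look like. Both the function name `fitFast` and the `fast` flag are hypothetical stand-ins, not the actual alm() interface:

```r
# 'fitFast' and its 'fast' flag are hypothetical -- a sketch, not alm()'s API
fitFast <- function(y, x, fast = FALSE) {
  if (!fast) {
    # typical validation that costs time on every call
    if (anyNA(y) || anyNA(x)) stop("Missing values in the data")
    if (!is.matrix(x)) x <- as.matrix(x)
    if (nrow(x) != length(y)) stop("Dimensions of x and y do not match")
  }
  .lm.fit(x, y)$coefficients   # bare-bones fitter without checks of its own
}

x <- cbind(1, rnorm(50))
y <- as.vector(x %*% c(2, 3)) + rnorm(50)
fitFast(y, x, fast = TRUE)     # skips straight to the fitter
```

With `fast = TRUE` the caller takes responsibility for supplying clean inputs, which is usually acceptable inside an optimisation loop where the data have been validated once up front.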
Remove terms() and qr() - they are not needed anymore. Provide xreg with the exogenous variables instead.
terms and qr were removed in cf0eb6b9b40b32941f3fa1f8ba2a23039dabfee5; data is provided instead of model and/or xreg.
In relation to (2), I had a similar problem when constructing a basic LSTM.
I found that using Rcpp provides a slight speed up to both matrix multiplication and outer product.
Both functions are in my repo; you can give them a try if you'd like.
Please note that both can be used as either prefix or infix functions:
- `matrixMultiplication(mat, vec)` = `%m%`
- `outerProduct(v1, v2)` = `%op%`
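For anyone unfamiliar with infix operators in R: a function becomes infix simply by binding it to a `%...%` name. The bodies below are plain base-R stand-ins for the Rcpp implementations in the repo, just to show the prefix/infix equivalence:

```r
# plain base-R stand-ins for the Rcpp versions mentioned above
matrixMultiplication <- function(a, b) a %*% b
`%m%` <- matrixMultiplication
outerProduct <- function(u, v) tcrossprod(u, v)   # u %*% t(v)
`%op%` <- outerProduct

A <- matrix(1:4, 2, 2)
v <- c(1, 2)
identical(A %m% v, matrixMultiplication(A, v))   # TRUE
identical(v %op% v, outerProduct(v, v))          # TRUE
```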
By the way.. Keep up the good work - I really enjoy your packages. :-)
Thanks for the link!
I've implemented functions like that at some point and compared the simple multiplication with the C++ one, but the difference was not significant. In fact, in some cases, the conventional %*% did better (I think this is due to improvements in R starting from 3.5.0). But this was done in Rcpp and in RcppArmadillo (two different functions). Is RcppEigen more efficient?
Interesting package, by the way!
Thanks!
I implemented the functions in Rcpp, RcppArmadillo, and RcppEigen and found the Eigen version to perform best. However, my motivation for not using the conventional `%*%` was not just to achieve higher speed, but also to be sure of the output. With `%*%`, the output was sometimes a matrix and sometimes a vector, which would have required me to build a check-and-coercion function; that entails a second evaluation and slows down the network, regardless of the speed of `%*%`.
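A small illustration of the shape issue: base `%*%` returns a (possibly one-column) matrix even when a plain vector is wanted, so a `drop()` coercion step is needed afterwards:

```r
A <- matrix(1:4, 2, 2)
v <- c(1, 2)
res <- A %*% v
dim(res)     # 2 1 -- a one-column matrix, not a plain numeric vector
drop(res)    # strips the redundant dimension: 7 10
```

A custom operator with a fixed return type avoids this extra coercion pass entirely.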
In terms of speed, I found a benchmark of RcppArma versus RcppEigen that shows Eigen to be faster than Arma when the matrices are mapped. Also, it shows that the time complexity of matrix multiplication with mapped Eigen matrices is lower, meaning that efficiency gains become more apparent with larger matrices.
However, when I copy his code, I don't get the same results. Instead, I find that `%*%` is comparable to or slightly faster than the mapped RcppEigen version, and that the overhead of a custom `%*%` function diminishes as the matrix size increases. I believe this is because he used an older version of R, which lacks the BLAS optimisation present in current versions (see `options("matprod")`). So for now, I believe the best and most efficient approach is to stick with base matrix multiplication.
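For anyone wanting to reproduce the comparison, the relevant option can be inspected and switched like this (the value semantics are as documented in `?options`; the timing itself is left as a comment):

```r
getOption("matprod")
# "default"  -- uses BLAS after checking the inputs for NaN/Inf
# "internal" -- reference C implementation, slower but fully portable
# "blas"     -- always BLAS; fastest, but NaN/Inf handling depends on the BLAS
old <- options(matprod = "internal")   # switch to the slow reference path
# ... benchmark %*% here to mimic the behaviour of older R versions ...
options(old)                           # restore the previous setting
```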
Thanks for the detailed explanation! It makes sense.
alm() is slow for several reasons:
While (2) and (3) are doable, they won't fix (1). Not sure what to do with it...