Dear Berenz,
Thanks for your interest in my package. You're trying to fit quite a large dataset - I'm glad your machine had enough memory to satisfy the extensive requirements of setting up the objects.
Answers to your questions: (a) The default settings are geared towards maximum robustness, but they result in low efficiency for the variance component estimates. I recommend using a higher tuning constant for estimating the variance parameters, e.g. rho.sigma.e = psi2propII(smoothPsi, k = 2.28) and rho.sigma.b = psi2propII(smoothPsi, k = 2.28). This will also help with variance components being estimated as zero.
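For concreteness, a minimal sketch of a call with these settings; the formula y ~ 1 + (1 | group) and the data frame mydata are placeholders, not from this thread:

    library(robustlmm)

    ## Higher tuning constant k = 2.28 for the variance parameters:
    ## more efficient estimates, at some cost in robustness.
    fit <- rlmer(y ~ 1 + (1 | group), data = mydata,
                 rho.sigma.e = psi2propII(smoothPsi, k = 2.28),
                 rho.sigma.b = psi2propII(smoothPsi, k = 2.28))
    summary(fit)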
(b) Sure, a C++ implementation would help :-) More seriously, you might want to try method = "DASvar" for testing. But be aware that, while this approach is faster, it only gives an approximation to the solution and increases the chance of variance components being estimated as zero.
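A sketch of the same call with that option (again with placeholder formula and data):

    ## "DASvar" is faster than the default "DAStau" but only approximate;
    ## useful for quick testing, not necessarily for final results.
    fit_quick <- rlmer(y ~ 1 + (1 | group), data = mydata,
                       method = "DASvar")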
Best, Manuel
Dear Manuel,
thank you for your response. Over the weekend I'll try to run the model with the parameters you've proposed. Concerning your comment on the size of the dataset, to be honest it is one of the smallest that I have. ;)
Maybe rewriting some functions using Armadillo (RcppArmadillo) could be a good solution? Which functions are the most time- and RAM-consuming?
Best, Maciej
Dear Maciej,
I hope you have lots of memory then... The memory requirements and the time-consuming operations arise in two different places. As far as I recall, the memory-intensive work happens when the initial lmerMod object from lme4 is converted into an rlmerMod object. That code needs to become smarter. I have never drilled down to see where the most memory-consuming operations happen, but they are bound to be called from .convLme4Rlmer.
To get the runtime down, the fitting routines would have to be sped up. The computation of the correction factors (calcTau and calcTau.nondiag) would be a good place to start, but eventually everything would have to be rewritten. An approach using RcppArmadillo would certainly help. I'm not sure whether that is what the lme4 team uses, but in any case I'd go with the same approach they're using.
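To confirm where the time actually goes before rewriting anything, R's built-in profiler is a reasonable first step; a minimal sketch, with a placeholder rlmer call:

    ## Profile a fit to see which functions dominate the runtime
    ## (e.g. whether calcTau really is the hotspot).
    Rprof("rlmer-profile.out")
    fit <- rlmer(y ~ 1 + (1 | group), data = mydata)
    Rprof(NULL)
    head(summaryRprof("rlmer-profile.out")$by.total, 10)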
Best, Manuel
You might want to try the new rlmerRcpp function. It should be more memory efficient and slightly faster than the R implementation.
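Assuming rlmerRcpp takes the same arguments as rlmer (a guess; the formula and data frame are placeholders), usage would look like:

    ## Hypothetical drop-in replacement for rlmer, assuming the same interface:
    fit_cpp <- rlmerRcpp(y ~ 1 + (1 | group), data = mydata)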
Hi!
I wanted to fit a robust mixed model with two random effects based on EU-SILC data. In total I have 12,871 rows; the PSU_POW random effect has 4,207 levels and the pow random effect has 375 levels. The target variable is equivalised income, which is right-skewed. The results I obtained for the non-robust mixed models are below.
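For context, the non-robust baseline described here would look roughly like the following sketch; the response name eqIncome and the data frame silc are assumptions, not taken from the thread:

    library(lme4)

    ## Non-robust mixed model with two random intercepts,
    ## one per PSU_POW level and one per pow level.
    fit_lmer <- lmer(eqIncome ~ 1 + (1 | PSU_POW) + (1 | pow), data = silc)
    summary(fit_lmer)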
However, when I fit the same model using the robust approach, it takes about 15 hours. The final results I obtained after this time are strange: the variances of these two random effects are equal to zero. The model summary indicates that the weights for the random effects are all equal to one, while the robust weights for the residuals vary.
Therefore, I have two questions: (a) Why are the variance estimates 0? What am I doing wrong? Maybe I should change the smoothing function? (b) Is it possible to speed up the computations?
If you need more information about the data, please let me know!