kollerma / robustlmm

This is an R package for fitting linear mixed-effects models in a robust manner. The method is based on a robustification of the scoring equations and an application of the Design Adaptive Scale approach.

Time consuming calculations and zero variance estimates #4

Closed: BERENZ closed this issue 6 years ago

BERENZ commented 8 years ago

Hi!

I wanted to fit a robust mixed model with two random effects to EU-SILC data. In total I have 12,871 rows; the PSU_POW random effect has 4,207 levels and the pow random effect has 375 levels. The target variable is equivalised income, which is right-skewed. The results I obtained for the non-robust mixed model are below.

> summary(model1)
Linear mixed model fit by REML ['lmerMod']
Formula: eq_inc ~ (1 | pow) + (1 | PSU_POW) + males + children + people30_44 +  
    people65 + unemployed + disabled + educ_elementary + educ_high +  
    room1 + room3 + village_city20 + lau1_unempl + lau1_nace_a +  
    lau1_benefits
   Data: for_model
Control: lmerControl(optimizer = "nloptwrap", calc.derivs = FALSE)

REML criterion at convergence: 281863.3

Scaled residuals: 
   Min     1Q Median     3Q    Max 
-3.067 -0.467 -0.101  0.293 36.174 

Random effects:
 Groups   Name        Variance  Std.Dev.
 PSU_POW  (Intercept)   6477688  2545   
 pow      (Intercept)   1650250  1285   
 Residual             185751234 13629   
Number of obs: 12871, groups:  PSU_POW, 4207; pow, 375

Fixed effects:
                Estimate Std. Error t value
(Intercept)      25421.8      592.4   42.92
males             3137.5      477.7    6.57
children         -3758.3      338.9  -11.09
people30_44       2082.2      567.4    3.67
people65         -2885.7      386.2   -7.47
unemployed      -15305.2      885.1  -17.29
disabled         -8822.6      751.3  -11.74
educ_elementary  -2841.4      451.3   -6.30
educ_high        18307.3      499.2   36.67
room1            -2299.4      447.7   -5.14
room3             2822.9      275.6   10.24
village_city20   -2403.3      340.5   -7.06
lau1_unempl     -10895.1     2734.2   -3.98
lau1_nace_a      -8809.5     1002.7   -8.79
lau1_benefits    -1614.3      384.8   -4.20

However, when I fit the same model using the robust approach, it takes about 15 hours. The final results I obtained after this time are strange, because the variance estimates of these two random effects are zero. The model summary indicates that the robustness weights for the random effects are all equal to one, while the robustness weights for the residuals vary.

> summary(robustmodel1)
Robust linear mixed model fit by DASvar 
Formula: eq_inc ~ (1 | pow) + (1 | PSU_POW) + males + children + people30_44 +      people65 + unemployed + disabled + educ_elementary + educ_high +      room1 + room3 + village_city20 + lau1_unempl + lau1_nace_a +      lau1_benefits 
   Data: for_model 
Control: lmerControl(calc.derivs = FALSE) 

Scaled residuals: 
   Min     1Q Median     3Q    Max 
-4.709 -0.626 -0.060  0.614 60.321 

Random effects:
 Groups   Name        Variance Std.Dev.
 PSU_POW  (Intercept)        0    0    
 pow      (Intercept)        0    0    
 Residual             72309542 8504    
Number of obs: 12871, groups: PSU_POW, 4207; pow, 375

Fixed effects:
                Estimate Std. Error t value
(Intercept)      23449.9      316.4   74.12
males             3124.6      300.3   10.41
children         -3373.0      212.8  -15.85
people30_44       1526.6      356.5    4.28
people65         -1568.8      242.4   -6.47
unemployed      -13346.2      555.1  -24.04
disabled         -7552.3      471.7  -16.01
educ_elementary  -2722.6      282.8   -9.63
educ_high        15874.0      311.9   50.89
room1            -2232.4      280.2   -7.97
room3             2189.8      172.0   12.73
village_city20   -2238.4      203.0  -11.03
lau1_unempl      -8319.8     1400.8   -5.94
lau1_nace_a      -7436.0      527.8  -14.09
lau1_benefits    -1421.4      201.0   -7.07

Robustness weights for the residuals: 
 9993 weights are ~= 1. The remaining 2878 ones are summarized as
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 0.0223  0.5360  0.7490  0.7030  0.9100  0.9990 

Robustness weights for the random effects: 
 All 4582 weights are ~= 1.

Rho functions used for fitting:
  Residuals:
    eff: smoothed Huber (k = 1.345, s = 10) 
    sig: smoothed Huber, Proposal II (k = 1.345, s = 10) 
  Random Effects, variance component 1 (PSU_POW):
    eff: smoothed Huber (k = 1.345, s = 10) 
    vcp: smoothed Huber, Proposal II (k = 1.345, s = 10) 
  Random Effects, variance component 2 (pow):
    eff: smoothed Huber (k = 1.345, s = 10) 
    vcp: smoothed Huber, Proposal II (k = 1.345, s = 10) 

Therefore, I have two questions: (a) Why are the variance estimates 0? What am I doing wrong? Should I change the smoothing function? (b) Is it possible to speed up the computations?

If you need more information about the data, please let me know!

kollerma commented 8 years ago

Dear Berenz,

Thanks for your interest in my package. You're trying to fit quite a large dataset; I'm happy to see that your machine had enough memory to satisfy the extensive memory requirements of setting up the objects.

Answers to your questions: (a) The default settings are geared towards maximum robustness, but they result in low efficiency of the variance component estimates. I recommend using a higher tuning constant for estimating the variance parameters, i.e. rho.sigma.e = psi2propII(smoothPsi, k = 2.28) and rho.sigma.b = psi2propII(smoothPsi, k = 2.28). This should also help with variance components being estimated as zero.
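
In code, the recommended call would look roughly like this; the formula and the data name for_model are taken from the post above, while form and robustmodel2 are just placeholder names:

library(robustlmm)

## Formula and data name as given in the original post
form <- eq_inc ~ (1 | pow) + (1 | PSU_POW) + males + children +
    people30_44 + people65 + unemployed + disabled + educ_elementary +
    educ_high + room1 + room3 + village_city20 + lau1_unempl +
    lau1_nace_a + lau1_benefits

## k = 2.28 trades some robustness for higher efficiency of the
## variance component estimates
robustmodel2 <- rlmer(form, data = for_model,
                      rho.sigma.e = psi2propII(smoothPsi, k = 2.28),
                      rho.sigma.b = psi2propII(smoothPsi, k = 2.28))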

(b) Sure, a C++ implementation would help :-) More seriously, you might want to try method = "DASvar" for testing. Be aware, though, that while this approach is faster, it only gives an approximation to the solution and increases the chance of variance components being estimated as zero.
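
A sketch of the faster variant, reusing form and for_model from the snippet above (again, the object name is a placeholder):

## DASvar is faster but only approximate and more prone to zero
## variance component estimates
robustmodel2_var <- rlmer(form, data = for_model, method = "DASvar",
                          rho.sigma.e = psi2propII(smoothPsi, k = 2.28),
                          rho.sigma.b = psi2propII(smoothPsi, k = 2.28))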

Best, Manuel

BERENZ commented 8 years ago

Dear Manuel,

thank you for your response. Over the weekend I'll try to run the model with the parameters you've proposed. Concerning your comment on the size of the dataset: to be honest, it is one of the smallest that I have. ;)

Maybe rewriting some functions using Armadillo (RcppArmadillo) would be a good solution? Which functions are the most time- and RAM-consuming?

Best, Maciej

kollerma commented 8 years ago

Dear Maciej,

I hope you have lots of memory then... The memory requirements and the time-consuming operations arise in two different places. As far as I recall, the memory-intensive part happens when the initial lmerMod object from lme4 is converted into an rlmerMod object. That code needs to become smarter. I have never drilled down to see where the most memory-consuming operation happens, but it is bound to be called from .convLme4Rlmer.

To get the runtime down, the fitting routines would need to be sped up. The computation of the correction factors (calcTau and calcTau.nondiag) would be a good place to start, but eventually everything would have to be rewritten. An approach using RcppArmadillo would certainly help. I'm not exactly sure whether that is what the lme4 team uses, but in any case I'd go with the same approach they do.
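
If you want to see for yourself where the time goes, a quick profiling run on a small example is enough. A rough sketch using base R's Rprof; the output file name is arbitrary:

library(robustlmm)
library(lme4)  ## provides the sleepstudy example data

## Profile a small fit and list the functions that dominate the runtime
Rprof("rlmer-profile.out")
fit <- rlmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)
Rprof(NULL)
head(summaryRprof("rlmer-profile.out")$by.total, 10)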

Best, Manuel


kollerma commented 6 years ago

You might want to try the new rlmerRcpp function. It should be more memory-efficient and slightly faster than the R implementation.
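
A rough usage sketch on the small sleepstudy example that ships with lme4, assuming rlmerRcpp takes the same arguments as rlmer:

library(robustlmm)
library(lme4)  ## provides the sleepstudy example data

## Drop-in replacement for rlmer using the compiled fitting code
fit <- rlmerRcpp(Reaction ~ Days + (Days | Subject), data = sleepstudy)
summary(fit)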