sfcheung / semhelpinghands

An R package to store any handy SEM-related functions that I use
https://sfcheung.github.io/semhelpinghands/
GNU General Public License v3.0
0 stars 0 forks source link

Problem running the standardizedSolution_boot_ci() function #74

Closed kachunwu closed 6 months ago

kachunwu commented 7 months ago

When I try to standardize the bootstrap solution from lavaan, I encounter this error message:

Error in if (all(abs(out_all[, x] - mean(out_all[, x], na.rm = TRUE)) < : missing value where TRUE/FALSE needed

Is it because of the bootstrap draw that did not converged?

This is the command I used to fit the model: fit <- sem(model, data = dataset, estimator = "ML", se = "bootstrap",bootstrap = 2000, parallel="snow",ncpus = 16, iseed=1234, fixed.x = FALSE)

sfcheung commented 7 months ago

Thanks for reporting this issue. If possible, please provide a reprex to reproduce the error. However, I understand this may not be possible if the data is private. If this is the case, may you use lavInspect(fit, "boot") to inspect the bootstrap estimates generated by lavaan::sem() to see if there is anything unusual? E.g., is there any column with NA for all bootstrap samples? Please also let me know the verion of lavaan you used.

If the model failed to converge in some bootstrap samples, lavaan should set the estimates to NAs. These bootstrap samples with nonconvergence will then be ignored by standardizedSolution_boot_ci() in the same way lavaan does for the unstandardized solution.

However, it is possible that there are situations which I overlooked, so I would like to learn more about your situation.

This is an example with some bootstrap samples failed in estimation:

library(lavaan)
#> This is lavaan 0.6-17
#> lavaan is FREE software! Please report any bugs.
library(semhelpinghands)

set.seed(1234)
mod0 <-
"
x1 ~~ .3*x2
x1 ~~ .3*x3
x2 ~~ .1*x3
"
dat <- simulateData(model = mod0,
                    sample.nobs = 50,
                    empirical = TRUE)
mod <- "f1 =~ x1 + x2 + x3"

suppressWarnings(system.time(fit <- sem(mod,
                       data = dat,
                       se = "boot",
                       bootstrap = 100,
                       iseed = 5678)))
#>    user  system elapsed 
#>    0.67    0.00    2.62
summary(fit)
#> lavaan 0.6.17 ended normally after 23 iterations
#> 
#>   Estimator                                         ML
#>   Optimization method                           NLMINB
#>   Number of model parameters                         6
#> 
#>   Number of observations                            50
#> 
#> Model Test User Model:
#>                                                       
#>   Test statistic                                 0.000
#>   Degrees of freedom                                 0
#> 
#> Parameter Estimates:
#> 
#>   Standard errors                            Bootstrap
#>   Number of requested bootstrap draws              100
#>   Number of successful bootstrap draws              75
#> 
#> Latent Variables:
#>                    Estimate  Std.Err  z-value  P(>|z|)
#>   f1 =~                                               
#>     x1                1.000                           
#>     x2                0.333    0.817    0.408    0.683
#>     x3                0.333    0.420    0.794    0.427
#> 
#> Variances:
#>                    Estimate  Std.Err  z-value  P(>|z|)
#>    .x1                0.100    1.134    0.088    0.930
#>    .x2                0.900    0.348    2.589    0.010
#>    .x3                0.900    0.274    3.290    0.001
#>     f1                0.900    1.157    0.778    0.437

ci_boot <- standardizedSolution_boot_ci(fit, save_boot_est_std = TRUE)
ci_boot
#>   lhs op rhs est.std    se     z pvalue ci.lower ci.upper boot.ci.lower
#> 1  f1 =~  x1   0.949 0.598 1.586  0.113   -0.224    2.121         0.312
#> 2  f1 =~  x2   0.316 0.709 0.446  0.656   -1.074    1.706         0.148
#> 3  f1 =~  x3   0.316 0.334 0.946  0.344   -0.339    0.971         0.047
#> 4  x1 ~~  x1   0.100 1.135 0.088  0.930   -2.124    2.324        -4.417
#> 5  x2 ~~  x2   0.900 0.449 2.006  0.045    0.021    1.779        -0.491
#> 6  x3 ~~  x3   0.900 0.211 4.257  0.000    0.486    1.314         0.366
#> 7  f1 ~~  f1   1.000 0.000    NA     NA    1.000    1.000            NA
#>   boot.ci.upper boot.se
#> 1         2.326   0.431
#> 2         1.218   0.214
#> 3         0.795   0.208
#> 4         0.903   1.145
#> 5         0.977   0.302
#> 6         0.995   0.184
#> 7            NA      NA

lavInspect(fit, "boot")
#>        f1=~x2 f1=~x3 x1~~x1 x2~~x2 x3~~x3 f1~~f1
#>   [1,]  0.451  0.347  0.090  0.902  0.870  0.839
#>   [2,]  0.637  0.830  0.709  0.973  0.417  0.579
#>   [3,]     NA     NA     NA     NA     NA     NA
#>   [4,]     NA     NA     NA     NA     NA     NA
#>   [5,]     NA     NA     NA     NA     NA     NA
#>   [6,]     NA     NA     NA     NA     NA     NA
#>   [7,]  0.334  1.176  0.637  0.500  0.474  0.324
#>   [8,]  0.452  0.247  0.317  0.839  0.906  0.756
#>   [9,]     NA     NA     NA     NA     NA     NA
#>  [10,]     NA     NA     NA     NA     NA     NA
#>  [11,]  0.462  0.806  0.622  0.950  0.461  0.608
#>  [12,]  0.793  0.761  0.272  0.632  0.624  0.737
#>  [13,]     NA     NA     NA     NA     NA     NA
#>  [14,]  0.216  0.234 -0.798  1.142  0.768  1.999
#>  [15,]  0.160  0.171 -1.063  1.027  0.741  2.190
#>  [16,]  0.397  0.550 -0.311  0.908  1.111  1.010
#>  [17,]  0.641  0.710  0.513  0.516  0.383  0.627
#>  [18,]  0.609  0.625  0.169  0.687  0.558  0.625
#>  [19,]  0.278  0.380  0.113  1.111  0.956  1.079
#>  [20,]     NA     NA     NA     NA     NA     NA
#>  [21,]  0.170  0.311 -0.070  0.859  0.908  1.127
#>  [22,]  0.374  0.527  0.361  1.075  1.031  0.621
#>  [23,]  0.304  0.284 -0.322  0.835  1.086  1.328
#>  [24,]  0.316  0.190 -0.325  1.136  1.014  1.603
#>  [25,]     NA     NA     NA     NA     NA     NA
#>  [26,]  0.227  0.259 -0.027  0.885  0.620  0.787
#>  [27,]  0.294  0.849  0.474  0.501  0.486  0.639
#>  [28,]  0.320  0.092 -0.153  0.956  1.058  1.163
#>  [29,]  0.327  0.413 -0.293  1.028  1.025  1.197
#>  [30,]  0.718  0.362  0.049  0.804  0.536  0.583
#>  [31,]  0.595  0.984  0.875  0.677  0.436  0.454
#>  [32,]     NA     NA     NA     NA     NA     NA
#>  [33,]  1.025  0.511  0.370  0.660  0.678  0.515
#>  [34,]     NA     NA     NA     NA     NA     NA
#>  [35,]  0.598  1.161  0.507  0.685  0.733  0.234
#>  [36,]  0.399  0.894  0.452  0.983  0.498  0.484
#>  [37,]  0.285  0.181 -0.662  0.792  1.128  1.501
#>  [38,]  0.036  0.372  0.229  1.018  0.911  0.656
#>  [39,]  4.906 -0.481  0.902 -0.556  1.145  0.077
#>  [40,]  0.270  0.242 -0.070  1.046  0.850  1.086
#>  [41,]     NA     NA     NA     NA     NA     NA
#>  [42,]  0.115  0.117 -1.781  0.986  0.775  2.633
#>  [43,]  0.298  0.376  0.301  0.720  1.360  0.565
#>  [44,]  0.702  0.968  0.463  0.822  0.600  0.406
#>  [45,]     NA     NA     NA     NA     NA     NA
#>  [46,]  0.560  0.826  0.571  1.083  0.544  0.313
#>  [47,]  0.069  0.072 -4.073  0.988  1.145  5.392
#>  [48,]     NA     NA     NA     NA     NA     NA
#>  [49,]     NA     NA     NA     NA     NA     NA
#>  [50,]  0.540  0.509  0.655  0.939  0.694  0.297
#>  [51,]  0.064  0.035 -4.719  0.900  1.375  5.815
#>  [52,]  1.200  0.698  0.820  0.651  0.659  0.399
#>  [53,]  1.172  0.715  0.340  0.435  0.853  0.262
#>  [54,]  0.267  0.371  0.097  0.742  0.834  0.791
#>  [55,]  0.383  0.466  0.036  0.798  1.277  0.975
#>  [56,]  0.984  1.691  0.710  0.983  0.321  0.172
#>  [57,]     NA     NA     NA     NA     NA     NA
#>  [58,]     NA     NA     NA     NA     NA     NA
#>  [59,]     NA     NA     NA     NA     NA     NA
#>  [60,]  0.907  0.668  0.450  0.513  0.747  0.407
#>  [61,]  1.003  1.850  0.808  0.977  0.412  0.193
#>  [62,]  0.285  0.212 -0.219  0.806  0.700  1.222
#>  [63,]  0.443  0.907  0.636  0.971  0.800  0.403
#>  [64,]     NA     NA     NA     NA     NA     NA
#>  [65,]  0.666  2.054  0.586  0.881  0.084  0.113
#>  [66,]  0.312  0.348  0.017  0.504  0.988  0.965
#>  [67,]  0.644  0.771  0.605  1.163  0.750  0.484
#>  [68,]  0.106  0.097 -1.538  1.020  0.828  2.598
#>  [69,]  0.180  0.429 -0.328  1.313  0.811  1.116
#>  [70,]     NA     NA     NA     NA     NA     NA
#>  [71,]  0.370  0.627  0.188  0.693  0.683  0.717
#>  [72,]  0.427  1.153  0.649  0.844  0.474  0.275
#>  [73,]  0.408  0.451 -0.005  0.737  0.987  1.129
#>  [74,]  0.088  0.029 -3.223  1.059  0.942  4.223
#>  [75,]  0.347  0.081 -0.598  1.021  1.250  1.320
#>  [76,]  0.812  1.262  0.421  0.923  0.378  0.230
#>  [77,]  0.220  0.193 -0.246  0.884  0.953  1.077
#>  [78,]  0.591  0.695  0.871  0.874  0.739  0.442
#>  [79,]     NA     NA     NA     NA     NA     NA
#>  [80,]  0.228  0.768  0.431  0.908  0.898  0.547
#>  [81,]     NA     NA     NA     NA     NA     NA
#>  [82,]  0.414  1.156  0.652  0.901  0.297  0.350
#>  [83,]  0.225  0.443 -0.086  0.867  0.805  1.253
#>  [84,]  0.313  0.434 -0.167  0.658  0.769  1.141
#>  [85,]  5.542  0.478  0.726 -1.346  0.950  0.080
#>  [86,]  0.587  0.647  0.231  0.817  0.724  0.609
#>  [87,]     NA     NA     NA     NA     NA     NA
#>  [88,]  0.589  0.548  0.026  0.812  1.027  0.799
#>  [89,]  0.928  0.402  0.717  0.751  0.846  0.412
#>  [90,]  0.179  0.240 -0.517  0.788  1.272  1.363
#>  [91,]  0.165  0.098 -1.036  1.058  1.002  2.064
#>  [92,]  0.691  0.596  0.271  0.913  0.683  0.434
#>  [93,]     NA     NA     NA     NA     NA     NA
#>  [94,]  0.959  0.728  0.359  0.697  1.206  0.317
#>  [95,]  0.661  0.504  0.706  0.743  1.201  0.380
#>  [96,]  0.233  0.775  0.707  0.705  0.405  0.574
#>  [97,]  0.058  0.068 -4.878  0.749  1.073  5.724
#>  [98,]  0.688  0.658  0.439  0.552  0.740  0.437
#>  [99,]  0.530  0.633  0.580  1.079  0.740  0.330
#> [100,]     NA     NA     NA     NA     NA     NA
#> attr(,"error.idx")
#>  [1]   3   4   5   6   9  10  13  20  25  32  34  41  45  48  49  57  58  59  64
#> [20]  70  79  81  87  93 100
#> attr(,"nonadmissible")
#>  [1] 14 15 16 21 23 24 26 28 29 37 39 40 42 47 51 62 68 69 73 74 75 77 83 84 85
#> [26] 90 91 97
#> attr(,"seed")
#> [1] 5678

Created on 2024-03-26 with reprex v2.1.0

kachunwu commented 7 months ago

Thank you for the quick response. Sorry that I cannot share the data and the model but I did check the boot oject using lavInspect(fit, "boot"). There is no single column (paths) with NA for all bootstrap sample, but only few rows (the nonconverge sample).

lavaan version: 0.6-17 semTools version: 0.5-6 semhelpinghands version: 0.1.10

Not sure if it is relevant, but when fitting my model I received the following:

> Warning messages:
> 1: In lav_model_nvcov_bootstrap(lavmodel = lavmodel, lavsamplestats = lavsamplestats,  :
>   lavaan WARNING: 17 bootstrap runs failed or did not converge.
> 2: In lav_model_nvcov_bootstrap(lavmodel = lavmodel, lavsamplestats = lavsamplestats,  :
>   lavaan WARNING: 4983 bootstrap runs resulted in nonadmissible solutions.
> 3: In lav_object_post_check(object) :
>   lavaan WARNING: the covariance matrix of the residuals of the observed
>                 variables (theta) is not positive definite;
>                 use lavInspect(fit, "theta") to investigate. 

I have tried another functions like the following and successfully plotted the histogram and QQ plot:

fit_with_boot_std <- store_boot_est_std(fit)
plot_boot(fit.full, "p1", standardized = TRUE)

Another try with user-defined parameter. This time, the standardized option give error:

fit_with_boot_def <- store_boot_def(fit.full)
plot_boot(fit_with_boot_def, "total1", standardized = TRUE)
> Error in plot_boot(fit_with_boot_def, "total1", standardized = TRUE) : 
>   Bootstrap estimates not found or not stored.

When I run standardizedSolution_boot_ci() with fit_with_boot_std or fit_with_boot_def, it still gives the same error message.

standardizedSolution_boot_ci(fit_with_boot_std)
> Error in if (all(abs(out_all[, x] - mean(out_all[, x], na.rm = TRUE)) <  : 
>   missing value where TRUE/FALSE needed

Since the error happen in the following function of standardizedSolution_boot_ci.R, I am thinking is there something missing in my fitted lavaan object that is required by this function.

# Adapted from boot
    boot_ci <- sapply(seq_along(est_org), function(x) {
                          if (all(abs(out_all[, x] -
                                      mean(out_all[, x], na.rm = TRUE)) <
                                      1e-8) ||
                              all(is.na(out_all[, x]))) {
                              return(c(NA, NA))
                            }
                          boot::boot.ci(boot_tmp,
                                index = x,
                                type = "perc",
                                conf = level)$percent[4:5]
                        })

I am seeking alterantive ways to do it like lavaan::bootstrapLavaan(). Hope to update you about my progress.

sfcheung commented 7 months ago

Thanks for the information, which is useful. I will see how to reprodue the error on my end.

By the way, you are right. You can use lavaan::bootstrapLavaan() too. This is an illustration:

https://groups.google.com/g/lavaan/c/wkI3f_SPubc/m/_JjYZNwcDwAJ

You may want to add se = FALSE in the call to standardizedSolution(), to prevent computing the SEs, which you won't need in bootstrapping. It can be slow to run if you did not add se = FALSE.

I wrote standardizedSolution_boot_ci() to reuse the bootstrap estimates already generated when running sem() or cfa() with se = "bootstrap". Using lavaan::bootstrapLavaan() will refit the models many many times, even if bootstrapping has already been done once. This is inefficient and can take a long time when bootstrapping and FIML are used together. However, if time is not a concern, then lavaan::bootstrapLavaan() is enough for the task.

sfcheung commented 7 months ago

A quick question, not directly related to standardizedSolution_boot_ci() (though may be related indirectly as some bootstrap replications may not have error but the solution is still nonadmissible):

You mention the following warning:

> 3: In lav_object_post_check(object) :
>   lavaan WARNING: the covariance matrix of the residuals of the observed
>                 variables (theta) is not positive definite;
>                 use lavInspect(fit, "theta") to investigate. 

Will you have this warning if you do not use se = "bootstrap"? If yes, then something's wrong when fitting the model. This warning need to be addressed first.

sfcheung commented 7 months ago

Regarding the other two functions you mentioned:

fit_with_boot_std <- store_boot_est_std(fit)
plot_boot(fit.full, "p1", standardized = TRUE)

store_boot_est_std() stores the bootstrap estimates for all parameters, including user-defined parameters created by :=.

store_boot_def(), on the other hand, stores only the bootstrap estimates of the user-defined parameters in the unstandardized solution. Therefore, setting standardized = TRUE will result in an error if only store_boot_def() was run:

fit_with_boot_def <- store_boot_def(fit.full)
plot_boot(fit_with_boot_def, "total1", standardized = TRUE)
> Error in plot_boot(fit_with_boot_def, "total1", standardized = TRUE) : 
>   Bootstrap estimates not found or not stored.

Therefore, if you only need to plot the bootstrap estimates of the standardized solution, then using store_boot_est_std() is enough, even for the user-defined parameters:

library(lavaan)
#> This is lavaan 0.6-17
#> lavaan is FREE software! Please report any bugs.
library(semhelpinghands)

set.seed(1234)
mod0 <-
"
x1 ~~ .3*x2
x1 ~~ .15*x3
x2 ~~ .4*x3
"
dat <- simulateData(model = mod0,
                    sample.nobs = 50,
                    empirical = TRUE)
mod <-
"
x2 ~ a*x1
x3 ~ b*x2
ab := a*b
"
set.seed(1234)
suppressWarnings(system.time(fit <- sem(mod,
                       data = dat,
                       se = "boot",
                       bootstrap = 100)))
#>    user  system elapsed 
#>    0.42    0.03    0.77

fit_with_boot_std <- store_boot_est_std(fit)
plot_boot(fit_with_boot_std, "a", standardized = TRUE)

plot_boot(fit_with_boot_std, "ab", standardized = TRUE)

Created on 2024-03-27 with reprex v2.1.0

I think you used those functions just to help figuring out the source of the error (thanks for that) but your example on these functions also shows that I need to come up with a better error message. That one can be misleading. Thanks.

sfcheung commented 7 months ago

I guess this is one possible cause of the original error: There are bootstrap samples in which the unstandardized solution has no problem but some parameters cannot be computed or have invalid values (Inf or NaN) in the standardized solution. This type of cases is rare but possible, and cannot be discovered by insepcting the results of lavInspect() with "boot".

I did a quick fix (Version 0.1.10.1) to handle this situation. You can install the latest version from GitHub:

remotes::install_github("sfcheung/semhelpinghands")

Please use this vesion and see whether you still encounter the error. Thanks in advance.

I need more time to think about how to handle this case in the plot_boot() function and will update it soon.

sfcheung commented 6 months ago

The latest version, 0.1.11, with this issue addressed is now on CRAN.