wjschne / simstandard

Creates simulated data from structural equation models with standardized loadings
https://wjschne.github.io/simstandard/
Creative Commons Zero v1.0 Universal

Simulate categorical data #1

Closed: brianmsm closed this issue 4 years ago

brianmsm commented 4 years ago

How can I simulate categorical data?

Here is an example with the lavaan package:

library(lavaan)
#> This is lavaan 0.6-5
#> lavaan is BETA software! Please report any bugs.
pop.model <- ' f1 =~ .8*u1 + .8*u2 + .8*u3

                u1 | 0*t1
                u2 | -1.5*t1 + 0.1*t2 + 1.2*t3
                u3 | -0.5*t1 + 0.5*t2
              ' 
model1 <- simulateData(pop.model, sample.nobs=100) 
tibble::as_tibble(model1)
#> # A tibble: 100 x 3
#>       u1    u2    u3
#>    <dbl> <dbl> <dbl>
#>  1     2     2     2
#>  2     1     2     1
#>  3     1     2     2
#>  4     2     3     2
#>  5     1     2     1
#>  6     1     2     2
#>  7     2     4     3
#>  8     2     2     2
#>  9     1     3     2
#> 10     1     2     1
#> # … with 90 more rows

Created on 2020-04-26 by the reprex package (v0.3.0)

And here is what I get when I use the simstandard package:

library(simstandard)
pop.model <- ' f1 =~ .8*u1 + .8*u2 + .8*u3

                u1 | 0*t1
                u2 | -1.5*t1 + 0.1*t2 + 1.2*t3
                u3 | -0.5*t1 + 0.5*t2
              ' 
model2 <- sim_standardized(pop.model, n = 100)
#> Warning in sim_standardized_matrices(m, ...): Although it is sometimes possible to set standardized parameters greater than 1 or less than -1, it is rare to do so. More often than not, it causes model convergence problems. Check to make sure you set such a value on purpose. The following paths were set to values outside the range of -1 to 1:
#> u2 | -1.5 * t1
#> u2 | 1.2 * t3
model2
#> # A tibble: 100 x 10
#>         u1     u2      u3     t1     t2      t3      f1   e_u1    e_u2    e_u3
#>      <dbl>  <dbl>   <dbl>  <dbl>  <dbl>   <dbl>   <dbl>  <dbl>   <dbl>   <dbl>
#>  1 -0.988  -1.18  -1.78    1.33   1.26  -0.272  -2.19    0.760  0.569  -0.0351
#>  2 -0.629   0.832  0.845   0.492  0.595  1.99    0.0195 -0.644  0.816   0.829 
#>  3 -0.0634 -0.904 -0.245  -0.709 -1.06  -0.878  -0.596   0.413 -0.427   0.232 
#>  4 -0.687   0.291  0.627  -1.24  -1.03  -0.0359  0.341  -0.960  0.0187  0.355 
#>  5 -1.61   -0.340 -1.87    1.16   2.10   0.148  -1.29   -0.575  0.691  -0.836 
#>  6  0.709  -0.675  1.11    1.32   1.98   0.651  -0.0866  0.779 -0.606   1.18  
#>  7  2.87    1.76   1.83   -0.303 -0.169  0.663   2.88    0.559 -0.546  -0.477 
#>  8 -0.490   0.593 -0.0155  0.350  0.266  0.588   0.565  -0.942  0.141  -0.468 
#>  9 -1.20   -0.620 -0.959  -0.635  0.214  0.491  -1.05   -0.358  0.221  -0.118 
#> 10 -1.95   -1.96  -2.67   -0.460  0.272 -0.124  -3.25    0.654  0.642  -0.0661
#> # … with 90 more rows


wjschne commented 4 years ago

For now, the simstandard package only simulates multivariate normal data. Thus, it is not yet possible to simulate categorical data. I will need to insert an error message whenever simulating categorical data is attempted.
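
Until such a feature exists, one possible workaround is to simulate the continuous variables with sim_standardized() and then cut them at the desired thresholds afterwards. The sketch below is not part of simstandard: cut_at_thresholds() is a hypothetical helper written for this illustration, and it assumes the observed variables come out standardized, so the thresholds from the lavaan model in the question can be read as z-scores. The threshold lines are left out of the model string because, as the output above suggests, sim_standardized() would otherwise treat t1, t2, and t3 as ordinary variables.

```r
library(simstandard)

# Measurement part only; the "u1 | 0*t1" threshold lines are deliberately omitted.
pop.model <- "f1 =~ .8*u1 + .8*u2 + .8*u3"

# Thresholds copied from the lavaan model in the question (assumed z-score scale).
thresholds <- list(
  u1 = c(0),
  u2 = c(-1.5, 0.1, 1.2),
  u3 = c(-0.5, 0.5)
)

# Hypothetical helper (base R only): map a continuous score to categories
# 1, 2, ..., k + 1, where k is the number of thresholds.
cut_at_thresholds <- function(x, breaks) {
  findInterval(x, breaks) + 1L
}

set.seed(1)
continuous <- sim_standardized(pop.model, n = 100)

# Discretize each observed variable with its own set of thresholds.
ordinal <- as.data.frame(
  Map(cut_at_thresholds, continuous[names(thresholds)], thresholds)
)
head(ordinal)
```

Values below the first threshold fall in category 1, values between the first and second in category 2, and so on, which mirrors the category coding shown in the lavaan output above.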

brianmsm commented 4 years ago

Oh... I understand, don't worry

wjschne commented 4 years ago

Thanks

melissagwolf commented 4 years ago

I'd like to express an interest in this feature as well!