mronkko / matrixpls

R package matrixpls

Simulating PLS-SEM with composite variables mode B #8

Open JulianGaviriaL opened 3 years ago

JulianGaviriaL commented 3 years ago

Dear Mikko,

First, thank you very much for the matrixpls package. As you mention in your papers, sample size determination and statistical power analysis are overlooked in many publications that use PLS-SEM. This package therefore contributes significantly to improving the approach.

I have some questions about the following code, which is meant to run Monte Carlo simulations of a PLS-SEM model with Mode B composite variables:

model<-"
# Regressions 
A =~ 0.5*x1 
A =~ 0.5*x2 
B =~ 0.4*x3 
B =~ 0.4*x4 
B ~ 0.1*A
"
output <- matrixpls.sim(500, model,
                         outerEstim = outerEstim.modeB,
                         n=200, multicore = TRUE, 
                         completeRep =TRUE)

Graphical representation: image

  1. Composite models assume that the constructs (A, B) are fully composed of their observable variables (x1, x2, x3, x4). Consequently, there is no error term at the construct level, and the observable variables can covary freely (Schuberth, 2021). Therefore, I did not model any error term or covariance. Does that make sense?

  2. In lavaan notation, the <~ operator defines composite variables (A, B). I guess it did not work in my matrixpls model, since all the power values were 0.00.

  3. Help wanted: The results from the model above make no sense to me. Even with a sample size of N=800 and 1000 iterations, the values in the "Power (Not equal 0)" column are extremely small. Can you please give me a hint?

P.S. I replicated the results from your tutorial (Aguirre-Urreta & Rönkkö, 2015). However, I have no idea what is wrong with my model.

Many thanks in advance for your comments.

> summary(output)
RESULT OBJECT
Model Type
[1] "function"
========= Fit Indices Cutoffs ============
           Alpha
Fit Indices   0.1  0.05  0.01 0.001  Mean    SD
       srmr 0.231 0.248 0.262 0.267 0.164 0.051
========= Parameter Estimates and Standard Errors ============
      Estimate Average Estimate SD Average SE Power (Not equal 0) Average Param Average Bias Coverage
B~A              0.039       0.062      0.072               0.099           0.1       -0.061    0.768
A=~x1            0.461       0.534      0.493               0.189           0.5       -0.039    0.778
A=~x2            0.488       0.530      0.494               0.207           0.5       -0.012    0.798
B=~x3            0.476       0.518      0.500               0.176           0.4        0.076    0.775
B=~x4            0.498       0.530      0.499               0.185           0.4        0.098    0.749
================== Replications =====================
Number of replications = 1007 
Number of converged replications = 1001 
Number of nonconverged replications: 
   1. Nonconvergent Results = 6 
   2. Nonconvergent results from multiple imputation = 0 
   3. At least one SE were negative or NA = 0 
   4. Nonpositive-definite latent or observed (residual) covariance matrix 
      (e.g., Heywood case or linear dependency) = 0
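
For readers interpreting the summary columns above, here is a minimal base R sketch of how "Power (Not equal 0)" and "Coverage" are conventionally computed from Monte Carlo replications. It assumes the usual Wald-type definitions (an estimate counts as significant when |estimate / SE| exceeds the normal critical value). The estimates and standard errors are simulated, hypothetical numbers, not output from matrixpls, and the package's own computation may differ in detail.

```r
# Hypothetical replication results for one parameter (not matrixpls output)
set.seed(1)
true_value <- 0.1                                  # population parameter
se  <- rep(0.07, 1000)                             # assumed per-replication SEs
est <- rnorm(1000, mean = true_value, sd = 0.07)   # assumed estimates

crit <- qnorm(0.975)                               # two-sided test, alpha = 0.05

# Power: share of replications in which the estimate differs significantly from 0
power <- mean(abs(est / se) > crit)

# Coverage: share of replications whose 95% interval contains the true value
coverage <- mean(abs(est - true_value) <= crit * se)

c(power = power, coverage = coverage)
```

Read this way, a power near 0.1 for B~A means that only about one replication in ten produced an estimate distinguishable from zero.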

References

- Aguirre-Urreta, M., Rönkkö, M., 2015. Sample Size Determination and Statistical Power Analysis in PLS Using R: An Annotated Tutorial. Commun. Assoc. Inf. Syst. 36, 33–51. https://doi.org/10.17705/1CAIS.03603

- Schuberth, F., 2021. Confirmatory composite analysis using partial least squares: setting the record straight. Rev. Manag. Sci. 15, 1311–1345. https://doi.org/10.1007/s11846-020-00405-0

mronkko commented 3 years ago

Hi,

Your population model does not have any composites. You are generating data for weakly correlated factors (A and B) that are each measured with two very unreliable indicators. In this case, low power is expected.

If you define composite variables, then all effects must be mediated by the indicators that form the composite. For example, if A is defined as a1+a2, then you can only change A by changing a1 or a2, and this needs to be specified in the population model.
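
To make the low-power point concrete, the indicator correlations implied by the posted population values can be worked out directly. The sketch below uses base R only and assumes a fully standardized model (unit-variance factors and indicators). That standardization is an assumption for illustration; the defaults that matrixpls.sim applies to unspecified variances may differ, which would change the magnitudes but not the overall picture.

```r
# Assumption: fully standardized factor model, Sigma = Lambda Phi Lambda' + Theta
lambda <- matrix(c(0.5, 0.5, 0,   0,      # loadings on A
                   0,   0,   0.4, 0.4),   # loadings on B
                 nrow = 4,
                 dimnames = list(paste0("x", 1:4), c("A", "B")))

# Factor correlation matrix implied by B ~ 0.1*A under standardization
phi <- matrix(c(1, 0.1,
                0.1, 1), nrow = 2,
              dimnames = list(c("A", "B"), c("A", "B")))

# Common part of the indicator covariance matrix
sigma <- lambda %*% phi %*% t(lambda)

# Unique variances fill the diagonal up to 1 (standardized indicators)
diag(sigma) <- 1

round(sigma, 3)
```

Under this assumption the indicators of A correlate 0.25 with each other, those of B correlate 0.16, and every cross-block correlation is 0.5 × 0.1 × 0.4 = 0.02, so there is very little signal to detect at n = 200.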

Mikko


JulianGaviriaL commented 3 years ago

Mikko, thank you so much for your response.

Indeed, the values changed considerably once I specified all the indicators the way you suggested:

model<-"
# Regressions 
A =~ 0.8*x1 + 0.8*x2
B =~ 0.8*x3 + 0.8*x4
B ~ 0.8*A
"

output <- matrixpls.sim(1000, model,
                         outerEstim = outerEstim.modeB,
                         n = 55, multicore = TRUE,
                         completeRep = TRUE)
========= Fit Indices Cutoffs ============
           Alpha
Fit Indices   0.1  0.05  0.01 0.001  Mean   SD
       srmr 0.405 0.427 0.453  0.48 0.329 0.07
========= Parameter Estimates and Standard Errors ============
      Estimate Average Estimate SD Average SE Power (Not equal 0) Average Param Average Bias Coverage
B~A              0.422       0.116      0.141               0.795           0.8       -0.378    0.093
A=~x1            0.767       0.225      0.266               0.790           0.8       -0.033    0.922
A=~x2            0.768       0.251      0.261               0.781           0.8       -0.032    0.894
B=~x3            0.799       0.224      0.253               0.825           0.8       -0.001    0.893
B=~x4            0.801       0.215      0.248               0.814           0.8        0.001    0.908
================== Replications =====================
Number of replications = 1003 
Number of converged replications = 1001 
Number of nonconverged replications: 
   1. Nonconvergent Results = 2 
   2. Nonconvergent results from multiple imputation = 0 
   3. At least one SE were negative or NA = 0 
   4. Nonpositive-definite latent or observed (residual) covariance matrix 
      (e.g., Heywood case or linear dependency) = 0

However, I still wonder what is wrong with the composite model. In my view, the following values make no sense:

model<-"
# Regressions 
A <~ 0.8*x1 + 0.8*x2
B <~ 0.8*x3 + 0.8*x4
B ~ 0.8*A
"
output <- matrixpls.sim(1000, model,
                         outerEstim = outerEstim.modeB,
                         n = 550, multicore = TRUE,
                         completeRep = TRUE)
summary(output)
========= Fit Indices Cutoffs ============
           Alpha
Fit Indices 0.1 0.05 0.01 0.001 Mean SD
       srmr   0    0    0     0    0  0
========= Parameter Estimates and Standard Errors ============
      Estimate Average Estimate SD Average SE Power (Not equal 0) Average Param Average Bias Coverage
B~A             -0.001       0.079      0.092               0.033           0.8       -0.801    0.000
A<~x1            0.396       0.575      0.553               0.067           0.8       -0.404    0.725
A<~x2            0.386       0.605      0.551               0.093           0.8       -0.414    0.714
B<~x3            0.381       0.598      0.551               0.079           0.8       -0.419    0.710
B<~x4            0.374       0.600      0.555               0.071           0.8       -0.426    0.711
================== Replications =====================
Number of replications = 1010 
Number of converged replications = 1001 
Number of nonconverged replications: 
   1. Nonconvergent Results = 9 
   2. Nonconvergent results from multiple imputation = 0 
   3. At least one SE were negative or NA = 0 
   4. Nonpositive-definite latent or observed (residual) covariance matrix 
      (e.g., Heywood case or linear dependency) = 0

mronkko commented 3 years ago

Your second model is as follows:

A <~ 0.8*x1 + 0.8*x2
B <~ 0.8*x3 + 0.8*x4
B ~ 0.8*A

We can rewrite the model as

A <~ 0.8*x1 + 0.8*x2
B <~ 0.8*x3 + 0.8*x4 + 0.8*A

The problem here is that you do not specify how x3 and x4 are related to x1 and x2, so they are generated as uncorrelated with x1 and x2. The outcome is that any composite of x1 and x2 will be uncorrelated with any composite of x3 and x4. If you want composite A to have an effect on composite B, you need to model the effect through the indicators of B. You cannot change a composite without changing its components, and this needs to be specified in the population model.
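
The point about uncorrelated blocks can be checked numerically. The following base R sketch is illustrative only: the helper `composite_cor` and the cross-block correlation of 0.3 are hypothetical choices, not part of matrixpls or of the model in this thread. It computes the correlation between two weighted composites directly from an indicator covariance matrix.

```r
# Correlation between composites A = wA' (x1, x2) and B = wB' (x3, x4),
# given the 4 x 4 covariance matrix of the indicators.
# Hypothetical helper for illustration; not a matrixpls function.
composite_cor <- function(sigma, wA, wB) {
  w <- cbind(A = c(wA, 0, 0), B = c(0, 0, wB))   # weight matrix
  v <- t(w) %*% sigma %*% w                      # covariance of the composites
  v["A", "B"] / sqrt(v["A", "A"] * v["B", "B"])
}

w <- c(0.8, 0.8)

# Case 1: no cross-block covariances specified, so the blocks are uncorrelated
sigma_indep <- diag(4)
r_indep <- composite_cor(sigma_indep, w, w)

# Case 2: the indicators of B depend on the indicators of A
# (assumed cross-block correlation of 0.3, chosen only for illustration)
sigma_linked <- diag(4)
sigma_linked[1:2, 3:4] <- 0.3
sigma_linked[3:4, 1:2] <- 0.3
r_linked <- composite_cor(sigma_linked, w, w)

c(independent = r_indep, linked = r_linked)
```

In the first case the composite correlation is exactly zero whatever weights are used, which matches the B~A estimate of about zero in the simulation output. Only when the cross-block covariances are nonzero, as in the second case, can the composites be related.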

See Appendix A (and also probably Appendix F) in

Rönkkö, M., Evermann, J., & Aguirre-Urreta, M. I. (2016). Estimating formative measurement models in IS research: Analysis of the past and recommendations for the future. Unpublished Working Paper. http://urn.fi/URN:NBN:fi:aalto-201605031907

Hope this helps. Mikko
