tbates / umx

Making Structural Equation Modeling (SEM) in R quick & powerful
https://tbates.github.io/

umxEFA takes 4 hours to calculate #103

Closed (AgnerF closed this issue 4 years ago)

AgnerF commented 4 years ago

The umxEFA function took 4 hours to run on my data set, a 70x40 matrix with 40% missing values. The run time dropped to 10 minutes when I removed the call to umxSummary from umxEFA.

I don't think I need umxSummary anyway. It does not affect the return value of umxEFA; it only prints something on the screen. I don't know why umxSummary is so slow, but there should be a way to avoid calling it.

In general, I prefer that statistical functions put the calculated results in a return value rather than print them on the screen. This lets the user decide what to do with the results: print, plot, save to a file, or combine with further calculations in a longer script. In the case of umxEFA, I propose that the function should not print any results but just return the model object m1. The user can then print the loadings, call umxSummary, or do whatever else is needed.
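To illustrate the pattern I mean, here is a minimal sketch in base R. It uses stats::factanal and the built-in ability.cov data rather than umx, and fit_efa is a hypothetical name I made up for the example, not a umx function.

```r
# Hypothetical wrapper (not part of umx): fit the model and return it,
# leaving all printing to the caller.
fit_efa <- function(covmat, factors) {
  fit <- stats::factanal(factors = factors, covmat = covmat)
  invisible(fit)  # return the fitted object without auto-printing
}

# ability.cov ships with R (datasets package): 6 ability tests.
fit <- fit_efa(ability.cov, factors = 2)

# The caller now decides what to do with the result.
load <- unclass(loadings(fit))      # plain numeric matrix of loadings
print(round(load, 2))               # print it ...
# write.csv(load, "loadings.csv")   # ... or save, plot, or compute further
```

Because nothing is printed inside the function, the same call works unchanged in an interactive session and in the middle of a longer script.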

BTW, I compared the results of umxEFA with factor analyses based on a covariance matrix calculated by the packages 'norm' and 'norm2'. The three results were very different. Any idea why, or which one is best?

Thanks for developing umx.

tbates commented 4 years ago

Hi @AgnerF

The devtools version now has a summary = FALSE option. The summary is slow because it computes the independence and saturated models, and for 40 variables that will take forever. So it is off by default now.

I also made a change so that printing is suppressed if umx_set_silent(TRUE) has been called.
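As a rough sketch of how these two options might be combined (assuming a version of umx that includes both changes; the mtcars variables and the object name m1 are only illustrative):

```r
library(umx)  # assumes umx (and OpenMx) are installed

umx_set_silent(TRUE)  # ask umx to suppress its printed output

# Illustrative data: a few numeric columns from the built-in mtcars set.
myVars <- c("mpg", "disp", "hp", "wt", "qsec")

# summary = FALSE skips the slow umxSummary step inside umxEFA.
m1 <- umxEFA(mtcars[, myVars], factors = 2, summary = FALSE)

# The returned model can then be inspected on demand.
loadings(m1)       # factor loadings only
# umxSummary(m1)   # full summary, only if you are willing to wait for it
```

The idea is that the slow summary becomes opt-in, while the returned model can still be passed on to other functions.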

Users like models to run and print themselves, but I see the benefit of chaining, so this seems like a reasonable compromise?

Currently the output is either the model, the scores, or the loadings. It might be useful to make the scores a model attribute, but then people might struggle to find them. I will consider the UI.

Thanks for your input!!