EFAshiny
EFAshiny is a user-friendly application for exploratory factor
analysis (EFA; Bartholomew, Knott, & Moustaki, 2011). Its graphical user
interface, built with shiny (Chang, Cheng, Allaire, Xie, & McPherson, 2017),
frees users from scripting in R by wrapping together various
packages for data management, factor analysis, and graphics.
It provides an easy-to-follow analysis flow and reasonable default
settings that avoid common errors (Henson & Roberts, 2006). Results are
presented online as tables and graphs and can be exported.
Key features include:
- An easy-to-use GUI to free users from scripting in R
- A step-by-step analysis flow to perform EFA
- Quick ways to summarize data by tables or graphs
- Several ways to explore factor retention numerically or graphically
- Several ways to explore factor extraction and rotation numerically
or graphically
- A display of confidence intervals for factor loadings
- Several ways to link visualization of correlation matrix with factor
structure
- Default options chosen according to recommendations in the
literature
- A demonstration using a real psychological scale dataset
The EFAshiny application is primarily aimed at behavioral researchers
who want to perform EFA on a set of associated variables (e.g., an
item-level scale dataset). It can also be used to explore
FA-based connectivity analyses (McLaughlin et al., 1992) in instrument
data, such as event-related potentials (ERPs) and functional
near-infrared spectroscopy (fNIRS) data. Although the major focus of
EFAshiny is EFA, confirmatory factor analysis (CFA) is a useful future
direction for the shiny app.
Getting Started
1. GitHub version (Full version)
To run EFAshiny in R, the devtools and shiny packages are required.
install.packages("devtools")
install.packages("shiny")
Install and launch EFAshiny:
devtools::install_github("PsyChiLin/EFAshiny")
EFAshiny::EFAshiny()
2. Shiny APP version (Standard version)
If you want to use the standard version of EFAshiny, installation is
not required. The application is deployed on the shinyapps.io server.
This standard version has all the functions except the Editor tab
(which is only useful for users who want to code online). Users can
explore and analyze their data with this online app without
worrying about installation.
Have fun with EFAshiny:
https://psychilin.shinyapps.io/EFAshiny/
Tutorial
1. Exploratory Factor Analysis
EFAshiny adopts exploratory factor analysis (EFA; Bartholomew, Knott,
& Moustaki, 2011) as its major procedure. EFA is a widely used method for
investigating the underlying factor structure that explains the
correlations in a set of observed indicators. It is useful in many
situations: for example, to conceptualize new constructs, to develop
instruments, to select items for a short-form scale, or to organize
observed variables into meaningful subgroups. The major procedures of EFA
include calculating correlation coefficients, determining the number of
factors, extracting factors, and rotating factors. In addition, the data
should be explored before EFA is used, and interpreting the results
afterwards is also an important step. Because EFA helps account for the
relationships among numerous variables, its use has permeated fields from
psychology to business, education, and clinical domains.
2. Introduction
When you open EFAshiny, the interface will be shown.
- Upper Panel: The upper panel shows the 7 main tabs of the EFA
procedure. The order of the tabs from left to right is the suggested
flow. Users can switch between the steps of EFA by clicking
the tabs.
- Left Panel: The left panel is used to control the analysis
settings or change the arguments.
- Right Panel: The right panel displays the results, tables and
figures.
In the Introduction tab, you can see the main features of EFAshiny,
a demo figure, and some key references.
3. Data Input
The data sets that call for EFA are typically in a wide format, i.e.,
one observation per row. They usually consist of a set of responses to
one or more psychometric tests on a Likert scale.
In the Data Input tab, users can upload their data.
- Upload data-file: Users can upload their data by browsing their
computer.
- Data Format: Two kinds of data can be uploaded, including csv
and txt.
- Header of variable: Users can choose whether their data have
variable names or not.
- Type of Data: Two data types for EFA are available: the typical
subject-by-variable raw data and a correlation matrix.
- Variables to include: Users can choose the variables they want to
include in the further steps. Simply delete the variable name from
the console.
If no data is uploaded, EFAshiny will use the Rosenberg Self-Esteem
Scale dataset for the default demonstrations.
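For readers who want to see what these upload options correspond to in plain R, here is a minimal sketch. It is not the code EFAshiny runs internally; the data, the item names (Q1 to Q3), and the temporary file are all hypothetical and only illustrate the wide format and the options listed above.

```r
# Hypothetical example data: 5 respondents x 3 Likert items (wide format,
# one observation per row).
demo <- data.frame(
  Q1 = c(3, 4, 2, 3, 1),
  Q2 = c(2, 4, 3, 3, 1),
  Q3 = c(4, 3, 2, 4, 2)
)

# Write it to a temporary csv file so the example is self-contained.
path <- tempfile(fileext = ".csv")
write.csv(demo, path, row.names = FALSE)

# "Header of variable": header = TRUE when the first row holds variable names.
dat <- read.csv(path, header = TRUE)

# "Variables to include": keep only the variables wanted in later steps.
keep <- c("Q1", "Q3")
dat_selected <- dat[, keep, drop = FALSE]

# "Type of Data": a correlation matrix can be analyzed instead of raw data.
R <- cor(dat)

str(dat_selected)
```

A txt file could be read the same way with read.table(), assuming a known field separator.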
4. Data Summary
After uploading the data, exploratory data analysis should be
conducted.
In the Data Summary tab, several types of exploration are
provided.
- Numeric Statistic: The first- to fourth-order moments of each
variable are calculated and printed automatically, with no arguments
to set. The median and MAD are provided as well.
- Histogram: Histograms show the number of observations at each
point of the Likert scale (e.g., 1 to 4 points), illustrating
the distribution of each variable.
- Density Plot: Density plots are provided. Users can visualize
the distribution of each item using the histograms and
density plots. Note that the histograms and density plots are
generated using the plotly package. In other words, they are
interactive. Try it with a few clicks!
- Correlation Matrix: A bird’s eye view of the pairwise
correlations between variables is displayed.
- Type of correlation: Tetrachoric correlations can be used for
dichotomous variables, and polychoric correlations can be used for
ordinal variables with more than two categories. The
default argument is set to Pearson’s correlation coefficients.
- ggcorrplot: In addition to the Correlation Matrix tab, which uses
the corrplot package, we also provide a ggcorrplot version. Have fun
with these plots and build some intuition about the data.
Note that the correlation matrix is the basis of EFA, a procedure that
aims to investigate the underlying structure from the
correlations between variables, so both calculating and visualizing the
correlation matrix are important.
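The numeric summaries described above can be approximated in base R. The sketch below uses simulated 4-point Likert responses (hypothetical data) and a helper, describe_item(), that is introduced here for illustration only. EFAshiny lists moments and psych among its dependencies, so its own output may differ in detail (for example, in how kurtosis is scaled).

```r
set.seed(1)
# Hypothetical 4-point Likert responses: 100 respondents x 4 items.
dat <- as.data.frame(matrix(sample(1:4, 400, replace = TRUE), ncol = 4))
names(dat) <- paste0("Q", 1:4)

# First- to fourth-order moments, plus median and MAD, for one variable.
describe_item <- function(x) {
  m <- mean(x)
  s <- sqrt(mean((x - m)^2))          # population SD, used for standardizing
  c(mean     = m,
    variance = var(x),
    skewness = mean((x - m)^3) / s^3,
    kurtosis = mean((x - m)^4) / s^4, # equals 3 for a normal distribution
    median   = median(x),
    mad      = mad(x))
}

# One row per item, one column per statistic.
summary_table <- t(sapply(dat, describe_item))
round(summary_table, 2)

# Pearson correlation matrix (the default type mentioned above).
R <- cor(dat, method = "pearson")
round(R, 2)
```

For ordinal items, polychoric correlations would replace the Pearson matrix; base R has no function for them, so a package such as psych is needed for that step.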
5. Factor Retention
One of the central ideas of EFA is to represent a set of observed
variables by a smaller number of factors. Thus, selecting how many
factors to retain is a critical decision.
In the Factor Retention tab, a set of indices for determining the number
of factors is provided.
- Scree Plot and Parallel Analysis: The scree plot (Cattell, 1966) and
parallel analysis (Horn, 1965) are two popular methods for determining
the number of factors.
- Quantile of Parallel analysis: The mean, 95th-, or
99th-percentile eigenvalues of random data can be used as
criteria.
- Number of simulated analyses to perform: Users can run
more simulations to obtain more reliable results. In general, the
default of 200 is sufficient.
- Numeric Rules: The very simple structure (VSS) criterion,
Velicer’s minimum average partial (MAP; Velicer, 1976) test,
RMSEA, BIC, and SRMR are also provided as objective numeric
rules.
- Max Number of Factor For Estimation: Users should define
the maximum number of factors to estimate. It should be larger
than the hypothesized number.
- Exploratory Graph Analysis (EGA): EGA is a newer approach to
retaining factors, based on the graphical lasso with the regularization
parameter specified using EBIC (Golino & Epskamp, 2017).
- Number of simulated analyses to perform: Users can run
more simulations to obtain more reliable results. Note that too many
simulated analyses will slow down the EGA.
- Summary: We provide a simple summary of all these methods. Users
can easily decide on the number of factors according to the
summary.
In addition, Sample Size is another option that lets users validate the
factor retention results with randomly sampled data of different sample
sizes.
Although users still have to decide on the number of factors
themselves, EFAshiny provides several indices without requiring users
to worry about how the methods are implemented.
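To make the logic of parallel analysis concrete, the following base-R sketch compares the eigenvalues of an observed correlation matrix with those of random normal data of the same size, using the 95th percentile as the criterion. It is an illustration of Horn's idea under simplifying assumptions (simulated one-factor data, eigenvalues of the full correlation matrix), not the routine EFAshiny calls; all object names are hypothetical.

```r
set.seed(123)
# Hypothetical data: 300 respondents, 6 items driven by one common factor.
n <- 300; p <- 6
f <- rnorm(n)
dat <- sapply(1:p, function(j) 0.7 * f + rnorm(n, sd = sqrt(1 - 0.7^2)))

# Eigenvalues of the observed correlation matrix.
obs_eigen <- eigen(cor(dat), symmetric = TRUE, only.values = TRUE)$values

# Eigenvalues from random normal data of the same size (200 simulations).
n_sim <- 200
sim_eigen <- replicate(n_sim, {
  random_dat <- matrix(rnorm(n * p), nrow = n)
  eigen(cor(random_dat), symmetric = TRUE, only.values = TRUE)$values
})

# Criterion: mean or an upper percentile of the simulated eigenvalues.
crit_mean <- rowMeans(sim_eigen)
crit_95   <- apply(sim_eigen, 1, quantile, probs = 0.95)

# Retain the leading factors whose eigenvalue exceeds the criterion.
exceeds  <- obs_eigen > crit_95
n_retain <- if (all(exceeds)) p else which(!exceeds)[1] - 1
n_retain
```

Plotting obs_eigen against crit_95 gives the familiar scree plot with a parallel-analysis reference line. Switching crit_95 for crit_mean, or using probs = 0.99, mirrors the quantile option described above.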
6. Extraction and Rotation
The major step of EFA is to extract and rotate the factor structure
and thereby estimate the factor loadings.
In the Extraction and Rotation tab, several factor extraction and
rotation methods are available, and bootstrapping for estimating
confidence intervals of factor loadings is also provided to aid
interpretation.
- Factor Extraction Methods: Available methods include the principal
axes method (PA), maximum likelihood method (ML), minimum residual
method (minres), weighted least squares (WLS), generalized weighted
least squares (GLS), and so on. The default option is PA, which has
a long history and performs well in psychological studies.
- Rotation Methods: The objective of factor rotation is to obtain
a simple structure for better interpretation. Both orthogonal
(e.g., the varimax method) and oblique rotations (e.g., the promax
method) are available. Using oblique rotations is recommended.
- Number of Bootstraps: By using bootstrap resampling,
users can obtain interval estimates rather than point estimates.
The number of bootstrap replications can be changed based on users’
needs.
By providing plenty of factor extraction methods, rotation methods, and
useful interval estimates of factor loadings, EFAshiny is not only
helpful for EFA beginners but also flexible for experienced EFA
users.
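The extraction, rotation, and bootstrap steps can be sketched with base R's factanal(). Two caveats apply: factanal() offers maximum likelihood extraction only, so it does not reproduce the PA default described above, and the alignment helper get_loadings() is a hypothetical convenience written for this two-factor example, not part of EFAshiny. The data are simulated.

```r
set.seed(42)
# Hypothetical data: 300 respondents, 8 items, two correlated factors.
n  <- 300
f1 <- rnorm(n)
f2 <- 0.4 * f1 + sqrt(1 - 0.4^2) * rnorm(n)
item <- function(f) 0.7 * f + rnorm(n, sd = sqrt(1 - 0.7^2))
dat <- data.frame(sapply(1:4, function(j) item(f1)),
                  sapply(1:4, function(j) item(f2)))
names(dat) <- paste0("Q", 1:8)

# ML extraction with an orthogonal and an oblique rotation.
fit_varimax <- factanal(dat, factors = 2, rotation = "varimax")
fit_promax  <- factanal(dat, factors = 2, rotation = "promax")
print(loadings(fit_promax), cutoff = 0.3)

# Loadings as a plain matrix, aligned to a reference solution because the
# order and sign of factors are arbitrary across resamples.
get_loadings <- function(d, ref = NULL) {
  L <- unclass(loadings(factanal(d, factors = 2, rotation = "promax")))
  if (is.null(ref)) return(L)
  C <- crossprod(ref, L)                     # 2 x 2 similarity of columns
  if (abs(C[1, 2]) + abs(C[2, 1]) > abs(C[1, 1]) + abs(C[2, 2])) L <- L[, 2:1]
  sweep(L, 2, sign(colSums(ref * L)), `*`)   # match signs to the reference
}

point  <- get_loadings(dat)
n_boot <- 200
boot   <- replicate(n_boot,
                    get_loadings(dat[sample(n, replace = TRUE), ], point))

# Percentile 95% intervals for every loading (boot is 8 x 2 x n_boot).
lower <- apply(boot, c(1, 2), quantile, probs = 0.025)
upper <- apply(boot, c(1, 2), quantile, probs = 0.975)
round(cbind(point, lower, upper), 2)
```

The alignment step matters: without it, resamples in which the two factors swap places or flip sign would inflate the intervals. With more than two factors a more general matching scheme would be needed.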
7. Diagram
For EFA results, the fundamental visualization is a plot of the
relationships between factors and indicators.
In the Diagram tab, a path diagram is drawn using the psych R package
(Revelle, 2017).
All factors and indicators are represented as larger or smaller nodes,
and all loadings with absolute values greater than some threshold
(e.g., 0.3) are represented as lines.
Through this graphical representation, with flexible plotting options,
users can easily understand the factor structure.
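The thresholding rule behind such a path diagram is simple to express. The sketch below turns a hypothetical loading matrix into the list of factor-to-indicator lines that would be drawn at a 0.3 threshold; the loading values and names are invented for illustration. The psych package provides fa.diagram() for drawing diagrams of this kind, though which call EFAshiny uses internally is an assumption here.

```r
# Hypothetical loading matrix: 4 indicators (rows), 2 factors (columns).
L <- matrix(c(0.72, 0.65, 0.10, -0.05,
              0.08, 0.25, 0.70, -0.61),
            ncol = 2,
            dimnames = list(paste0("Q", 1:4), c("F1", "F2")))

# Only loadings whose absolute value exceeds the threshold become lines.
threshold <- 0.3
idx <- which(abs(L) > threshold, arr.ind = TRUE)

edges <- data.frame(factor    = colnames(L)[idx[, "col"]],
                    indicator = rownames(L)[idx[, "row"]],
                    loading   = L[idx])
edges
```

Raising the threshold removes weak lines and makes the diagram sparser, at the cost of hiding small cross-loadings.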
8. Factor Loadings
In the Factor Loadings tab, EFAshiny provides useful visualizations of
factor loadings to facilitate proper interpretation of the extracted
factors.
- Bootstrapping Factor Loadings: A table of EFA loadings is
presented graphically. Loadings are drawn as bars and
conditioned on one or more factors. To make the plot interpretable
at a glance, positive and negative loadings are shown in different
colors, and the larger the loading, the deeper the color. Confidence
intervals of the factor loadings are also visualized to provide a
quick and useful overview.
- Factor Loadings and Correlation Matrix: The plot combines the
original correlation matrix of the dataset with a stacked bar graph
of the factor loadings so that users can easily compare them.
- SE and Factor Loadings: The plot visualizes the finding that
oblique CF-varimax and oblique CF-quartimax rotations
produce similar point estimates but different standard error
estimates (Zhang & Preacher, 2015) by presenting a comparison figure.
Users can check whether this phenomenon occurs in their own empirical
dataset.
In addition to a table of loadings for the EFA results, these
visualizations give users the whole picture of the results
automatically.
9. Summarized Steps
We summarize, in six concrete steps, our provided flow in EFAshiny for
performing EFA.
- Read the data and review it on the main console. Select which
variables should be included in further analysis.
- Explore the data. For each item, users can examine its numeric
statistics, distribution, and correlation patterns.
- Use multiple criteria to determine the number of factors.
- Perform EFA. Input the number of factors decided in the previous
step. The table of EFA results will be presented, including loadings,
confidence intervals, and correlations between factors.
- Visualize the results. Three kinds of plots are shown by EFAshiny.
Get a general idea of the results from these visualizations.
- Download and use the results, including figures and tables, in every
step for any purpose.
To see the tutorial in vignettes:
browseVignettes("EFAshiny")
By following this analysis flow in EFAshiny, users without any
knowledge of programming are able to perform EFA and gain a better
understanding of their own studies.
10. R Code for the Github version
In addition to the GUI, we also provide an Editor tab with several
code demonstrations in the GitHub version of EFAshiny. In this
Editor mode (see figure below), we present some quick examples
that let users perform analyses similar to those in the EFAshiny GUI.
Users can also write their own R code here, which makes it possible
to use EFAshiny within a script pipeline. In general, this
feature allows users to learn R, understand the code underlying the
analyses in EFAshiny, or automate the analyses in the future.
Note that this feature also allows the use of the lavaan R package to
perform confirmatory factor analysis (CFA), which is also a widely used
method but not the main focus of EFAshiny. Simply running
require(lavaan) should work (see the lavaan package for details).
Another useful tool is the showcase mode of shiny when running the app
(of course, you can also read the code directly in server.R and ui.R).
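As a rough idea of what a CFA typed into the Editor tab could look like, here is a small lavaan sketch. The simulated data, the item names Q1 to Q4, and the latent variable name SE are hypothetical; the example only runs the model when lavaan is installed, and it is not taken from the EFAshiny code demonstrations.

```r
set.seed(7)
# Hypothetical data: 300 respondents, 4 items measuring one construct.
n <- 300
f <- rnorm(n)
dat <- as.data.frame(sapply(1:4, function(j) 0.7 * f + rnorm(n, sd = 0.7)))
names(dat) <- paste0("Q", 1:4)

# One-factor CFA model in lavaan syntax: SE is measured by Q1 to Q4.
model <- "SE =~ Q1 + Q2 + Q3 + Q4"

if (requireNamespace("lavaan", quietly = TRUE)) {
  fit <- lavaan::cfa(model, data = dat)
  # A few common fit indices; summary(fit) gives the full output.
  print(lavaan::fitMeasures(fit, c("cfi", "rmsea", "srmr")))
} else {
  message("lavaan is not installed; run install.packages('lavaan') first.")
}
```

Unlike EFA, the model string fixes in advance which items load on which factor, which is why CFA is usually run after an exploratory step has suggested a structure.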
In summary, users who want to further understand EFAshiny or learn R
can (1) see the code in the Editor tab of the GitHub version of the
EFAshiny GUI (as shown in figure), (2) download the R markdown file
similar to the code in editor mode here, (3) see the same R markdown
file in this public link, (4) use the showcase mode in shiny, and
(5) read the code directly in server.R and ui.R.
Data
The demonstration dataset is the 10-item Rosenberg Self-Esteem
Scale (RSE; Rosenberg, 1965), collected via an online platform for
psychological research. The RSE was
recorded on a 1 to 4 Likert scale, where higher scores indicate stronger
agreement with the items (1=strongly disagree, 2=disagree, 3=agree, and
4=strongly agree). Previous studies suggested that the RSE could be
treated as a one-factor unidimensional scale that simply assesses a
positive self-evaluation construct, or as a two-factor bidimensional
scale in which one factor assesses positive self-esteem
(e.g., I feel that I have a number of good qualities) and the other
measures negative self-esteem (e.g., At times I think I am no good at
all). EFAshiny already includes RSE data from 256 participants as a
built-in dataset, but RSE.csv with its codebook can also be downloaded
directly.
Dependencies
- bootnet (Epskamp, 2017)
- corrplot (Wei & Simko, 2017)
- EFAutilities (see Zhang, 2014 for details)
- reshape2 (Wickham, 2014)
- EGA (Golino & Epskamp, 2017)
- ggplot2 (Wickham, 2016)
- ggcorrplot (Kassambara, 2016)
- gridExtra (Auguie, 2017)
- igraph (Csardi & Nepusz, 2006)
- moments (Komsta & Novomestky, 2013)
- plotly (Sievert et al., 2017)
- psych (Revelle, 2017)
- psycho (Makowski, 2018)
- qgraph (Epskamp et al., 2012)
- shiny (Chang, Cheng, Allaire, Xie, & McPherson, 2017)
- shinythemes (Chang, 2016)
References
- Auguie, B. (2017). gridExtra: Miscellaneous Functions for "Grid"
Graphics. R package version 2.3.
- Bartholomew, D. J., Knott, M., & Moustaki, I. (2011). Latent
Variable Models and Factor Analysis: A Unified Approach. Wiley.
- Cattell, R. B. (1966). The scree test for the number of factors.
Multivariate Behavioral Research, 1(2), 245-276.
- Chang, W. (2016). shinythemes: Themes for Shiny. R package version
1.1.1.
- Chang, W., Cheng, J., Allaire, J. J., Xie, Y., & McPherson, J.
(2017). shiny: Web application framework for R. R package version
1.0.0.
- Csardi, G., & Nepusz, T. (2006). The igraph software package for
complex network research. InterJournal, Complex Systems, 1695(5),
1-9.
- Epskamp, S., Cramer, A. O. J., Waldorp, L.J., Schmittmann, V.D., &
Borsboom, D. (2012). qgraph: Network Visualizations of Relationships
in Psychometric Data. Journal of Statistical Software, 48(4), 1-18.
- Epskamp, S. (2017). bootnet: Bootstrap methods for various network
estimation routines. R package version 1.0.1.
- Golino, H. F., & Epskamp, S. (2017). Exploratory graph analysis: A
new approach for estimating the number of dimensions in
psychological research. PLoS ONE, 12(6), e0174035.
- Henson, R. K., & Roberts, J. K. (2006). Use of exploratory factor
analysis in published research: Common errors and some comment on
improved practice. Educational and Psychological Measurement, 66(3),
393-416.
- Horn, J. L. (1965). A rationale and test for the number of factors
in factor analysis. Psychometrika, 30(2), 179-185.
- Kassambara, A. (2016). ggcorrplot: Visualization of a Correlation
Matrix using 'ggplot2'. R package version 0.1.1.
- Komsta, L., & Novomestky, F. (2013). moments: Moments, cumulants,
skewness, kurtosis and related tests. R package version 0.13.
- Makowski, D. (2018). The psycho Package: an Efficient and
Publishing-Oriented Workflow for Psychological Science. Journal of
Open Source Software, 3(22), 470.
- McLaughlin, T., Steinberg, B., Christensen, B., Law, I., Parving,
A., & Friberg, L. (1992). Potential language and attentional
networks revealed through factor analysis of rCBF data measured with
SPECT. Journal of Cerebral Blood Flow & Metabolism, 12(4), 535-545.
- Revelle, W. (2017). psych: Procedures for Personality and
Psychological Research. Northwestern University, Evanston, Illinois,
USA. R package version 1.7.8.
- Rosenberg, M. (1965). Rosenberg self-esteem scale (RSE). Acceptance
and commitment therapy. Measures package, 61, 52.
- Rosseel, Y. (2012). lavaan: An R Package for Structural Equation
Modeling. Journal of Statistical Software, 48(2), 1-36.
- Sievert, C., Parmer, C., Hocking, T., Chamberlain, S., Ram, K.,
Corvellec, M., & Despouy, P. (2016). plotly: Create Interactive Web
Graphics via 'plotly.js'. R package version 4.7.1.
- Wei, T., & Simko, V. (2017). R package "corrplot":
Visualization of a Correlation Matrix. R package version 0.84.
- Velicer, W. F. (1976). Determining the number of components from the
matrix of partial correlations. Psychometrika, 41(3), 321-327.
- Wickham, H. (2016). reshape2: Flexibly Reshape Data: A Reboot of the
Reshape Package. R package version 1.4.2.
- Wickham, H. (2016). ggplot2: elegant graphics for data analysis.
Springer.
- Zhang, G., & Preacher, K. J. (2015). Factor rotation and standard
errors in exploratory factor analysis. Journal of Educational and
Behavioral Statistics, 40(6), 579-603.
- Zhang, G. (2014). Estimating standard errors in exploratory factor
analysis. Multivariate Behavioral Research, 49, 339-353.
Authors
Chi-Lin Yu: Department of Psychology, National Taiwan University, Taiwan
Ching-Fan Sheu: Institute of Education, National Cheng Kung University,
Taiwan
If you have a question, comment, concern, or code contribution about
EFAshiny, please send us an email at psychilinyu@gmail.com.
How To Cite
Please cite as:
- Yu, C.-L., & Sheu, C.-F. (2018). EFAshiny: An User-Friendly Shiny
Application for Exploratory Factor Analysis. Journal of Open Source
Software, 3(22), 567-568, https://doi.org/10.21105/joss.00567.