kassambara / factoextra

Extract and Visualize the Results of Multivariate Data Analyses
http://www.sthda.com/english/rpkgs/factoextra

Possible extension of factoextra to include ExPosition #23

Open kassambara opened 7 years ago

kassambara commented 7 years ago

From private e-mail:

- CA, MCA, PCA, MDS (ExPosition; InPosition for inference)
- Discriminant Correspondence Analysis (DiCA), PLSC (TExPosition; TInPosition for inference), barycentric discriminant analysis
- MFA, STATIS and variants (MExPosition)

kassambara commented 7 years ago

The factoextra R functions have now been simplified to make it easier to extend the package to the results of other packages.

To extend the existing factoextra functions to another package, you only need to update the following functions:

  1. .get_facto_class()
  2. get_eig()
  3. one of these functions depending on the type of the analysis: get_pca(), get_ca(), get_mca(), get_famd(), get_mfa() or get_hmfa()
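The first two extension points can be sketched in base R as follows. This is only an illustration under stated assumptions: `facto_class_sketch()` and `eig_table_sketch()` are hypothetical names, not factoextra functions, and the mock object merely imitates an ExPosition result (an `expoOutput` list whose `ExPosition.Data` element carries the analysis class and an `eigs` vector). The column names are chosen to resemble an eigenvalue table and are likewise an assumption.

```r
# Hypothetical helper: map a result object to an analysis type,
# in the spirit of an internal class-detection function.
facto_class_sketch <- function(X) {
  if (inherits(X, "expoOutput")) {
    d <- X$ExPosition.Data
    if (inherits(d, "epCA")) return("CA")
    if (inherits(d, "epPCA")) return("PCA")
    if (inherits(d, "epMCA")) return("MCA")
  }
  stop("unsupported object class: ", paste(class(X), collapse = ", "))
}

# Hypothetical helper: turn raw eigenvalues into a table with the
# percentage and cumulative percentage of variance per dimension.
eig_table_sketch <- function(X) {
  eig <- X$ExPosition.Data$eigs
  pct <- 100 * eig / sum(eig)
  data.frame(
    eigenvalue = eig,
    variance.percent = pct,
    cumulative.variance.percent = cumsum(pct),
    row.names = paste0("Dim.", seq_along(eig))
  )
}

# Mock object imitating the assumed structure (ExPosition is not required)
mock <- structure(
  list(ExPosition.Data = structure(list(eigs = c(3, 1)),
                                   class = c("epPCA", "list"))),
  class = c("expoOutput", "list")
)

facto_class_sketch(mock)
eig_table_sketch(mock)
```

Keeping class detection separate from data extraction means that supporting a new package mostly amounts to adding one branch to each helper.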

Additionally, if you want to add new functionality, you need to update the functions below:

For example, to extend the factoextra package to handle the results of the functions epCA(), epPCA() and epMCA() [in the ExPosition package], we used the following R script:

# Added in .get_facto_class()
#%%%%%%%%%%%%%%%%
if (inherits(X, "expoOutput")) {
  if (inherits(X$ExPosition.Data, "epCA")) facto_class <- "CA"
  else if (inherits(X$ExPosition.Data, "epPCA")) facto_class <- "PCA"
  else if (inherits(X$ExPosition.Data, "epMCA")) facto_class <- "MCA"
}

# epCA: Correspondence Analysis
#%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
library(ExPosition)
data(authors)
res.ca <- epCA(authors$ca$data, graph = FALSE)
res <- res.ca$ExPosition.Data
# eigenvalues (used in get_eig())
eig <- res$eigs
# Results for rows (used in get_ca_row())
coord <- res$fi
inertia <- res$di*res$M
cos2 <- res$ri
contrib <- res$ci*100
mass <- res$M # .get_ca_mass() 
# Results for columns (used in get_ca_col())
coord <- res$fj
inertia <- res$dj*res$W
cos2 <- res$rj
contrib <- res$cj*100
mass <- res$W # .get_ca_mass()

# epMCA: Multiple Correspondence Analysis
#%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
library(ExPosition)
data(mca.wine)
res.mca <- epMCA(mca.wine$data, graph = FALSE)
res <- res.mca$ExPosition.Data
# eigenvalues (used in get_eig())
eig <- res$eigs
# Results for rows/individuals (used in get_mca_ind())
coord <- res$fi
inertia <- res$di*res$M
cos2 <- res$ri
contrib <- res$ci*100
mass <- res$M # .get_ca_mass() 
# Results for columns/variables (used in get_mca_var())
coord <- res$fj
inertia <- res$dj*res$W
cos2 <- res$rj
contrib <- res$cj*100
mass <- res$W # .get_ca_mass()

# epPCA: Principal Component Analysis
#%%%%%%%%%%%%%%%%%%%%%%%%%%%%
library(ExPosition)
res.pca <- epPCA(iris[, -5], graph=FALSE)
res <- res.pca$ExPosition.Data
# Eigenvalues (used in get_eig())
res$eigs # Values are much higher than those from FactoMineR::PCA(iris[,-5], graph=FALSE)$eig
# Results for individuals (used in get_pca_ind())
ind.coord <- res$fi
ind.contrib <- res$ci*100
ind.cos2 <- res$ri
# Results for variables (used in get_pca_var())
var.coord <- var.cor <- cor(res$X, res$fi) # cor(t(data_matrix), factor_scores)
var.contrib <- res$cj*100
var.cos2 <- res$rj
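
The PCA variable coordinates above are computed as correlations between the original variables and the factor scores. A minimal base-R sketch of that idea follows. It uses `prcomp()` on standardized data instead of `epPCA()`, a substitution made here only so the example runs without ExPosition, and it shows that these correlations coincide with the loadings scaled by the component standard deviations.

```r
# Standardized PCA on the numeric iris columns (base R stand-in for epPCA)
X <- iris[, -5]
pca <- prcomp(X, center = TRUE, scale. = TRUE)

# Variable coordinates as correlations between variables and scores
var_coord <- cor(X, pca$x)

# Same quantity obtained from the loadings: each column of the rotation
# matrix is multiplied by the standard deviation of its component
scaled_loadings <- sweep(pca$rotation, 2, pca$sdev, "*")
same <- all.equal(var_coord, scaled_loadings, check.attributes = FALSE)
same

# Squared coordinates give the quality of representation (cos2);
# over all components they sum to 1 for every standardized variable
var_cos2 <- var_coord^2
rowSums(var_cos2)
```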

kassambara commented 7 years ago

Support for ExPosition::epCA() added:

library(ExPosition)
data(authors)
res.ca <- epCA(authors$ca$data)

# Visualize
library(factoextra)
fviz_ca_biplot(res.ca)

[Figure: ca-biplot]
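
For readers adapting another CA implementation, the row quantities that the extension reads (mass, coordinates, inertia, contributions, cos2) can be sketched from scratch in base R. This is a textbook SVD-based correspondence analysis on a made-up 3 x 3 table, not ExPosition's code. It assumes that contributions are stored as proportions, which is why they are multiplied by 100, and that the per-row distance term is a squared distance to the centroid.

```r
# Made-up contingency table (illustrative data only)
N <- matrix(c(10, 5, 3,
               4, 12, 6,
               2, 7, 15), nrow = 3, byrow = TRUE)

P <- N / sum(N)            # correspondence matrix
r_mass <- rowSums(P)       # row masses
c_mass <- colSums(P)       # column masses

# Standardized residuals and their singular value decomposition
S <- diag(1 / sqrt(r_mass)) %*% (P - r_mass %o% c_mass) %*% diag(1 / sqrt(c_mass))
dec <- svd(S)
k <- min(dim(N)) - 1       # number of non-trivial dimensions
eig <- dec$d[1:k]^2        # eigenvalues (principal inertias)

# Row principal coordinates
fi <- sweep(dec$u[, 1:k], 1, sqrt(r_mass), "/") %*% diag(dec$d[1:k])

d2 <- rowSums(fi^2)                                  # squared distances to centroid
inertia <- d2 * r_mass                               # row inertia = distance * mass
cos2 <- fi^2 / d2                                    # quality of representation
contrib <- 100 * sweep(fi^2 * r_mass, 2, eig, "/")   # contributions in percent

colSums(contrib)   # each dimension's contributions add up to 100
sum(inertia)       # equals the total inertia, sum(eig)
```

Checks like these are a cheap way to confirm that a new package's output has been mapped to the right slots before wiring it into the plotting functions.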

kassambara commented 7 years ago

Support for ExPosition::epPCA() added:

library(ExPosition)
res.pca <- epPCA(iris[, -5], graph=FALSE)

library(factoextra)
fviz_pca_ind(res.pca,
             label = "none", # hide individual labels
             habillage = iris$Species, # color by groups
             addEllipses = TRUE # Concentration ellipses
)

[Figure: rplot08]

kassambara commented 7 years ago

Support for ExPosition::epMCA() added:

# Compute MCA
library(ExPosition)
data(mca.wine)
res.mca <- epMCA(mca.wine$data, graph = FALSE)

# Load factoextra for visualization
library(factoextra)
fviz_mca_ind(res.mca, repel = TRUE)

[Figure: rplot09]

fviz_mca_var(res.mca, repel = TRUE)

[Figure: rplot10]

fviz_mca_biplot(res.mca, repel = TRUE)

[Figure: rplot11]