kassambara / factoextra

Extract and Visualize the Results of Multivariate Data Analyses
http://www.sthda.com/english/rpkgs/factoextra

Possible bug in the factoextra package function fviz_mca_ind #47

Open giakoumisEctrics opened 7 years ago

giakoumisEctrics commented 7 years ago

Dear Sir or Madam,

I am currently using your factoextra package for my analysis, and I think I have encountered a bug in the fviz_mca_ind function. Specifically, I draw 95% confidence ellipses for the individuals belonging to certain groups, using either the normal distribution or the Euclidean distance.

When I use the Euclidean distance, the code appears to first compute the ellipse from all individuals, regardless of group, and then draw that same ellipse for every group, shifted only to the mean of each group. This is misleading: the shape and size of the ellipse, not just its position, should vary between groups (it may become smaller, for example).

I look forward to hearing from you.

Kind regards, Giakoumis

kassambara commented 7 years ago

Hi,

I cannot reproduce this issue on my computer. Please install the latest development versions of ggpubr and factoextra:

devtools::install_github("kassambara/ggpubr")
devtools::install_github("kassambara/factoextra")

See the example below:

library(FactoMineR)
library(factoextra)
# Load data
data(poison)
poison.active <- poison[1:55, 5:15]

# MCA
res.mca <- MCA(poison.active, graph = FALSE)

fviz_mca_ind(res.mca, 
             label = "none", # hide individual labels
             habillage = "Vomiting", # color by groups 
             palette = c("#00AFBB", "#E7B800"),
             addEllipses = TRUE, ellipse.type = "norm",
             ggtheme = theme_minimal()) 

[Figure: plot produced by the code above]

If the issue persists after installing the latest versions, please post reproducible R code with a demo data set, together with the output of devtools::session_info().

giakoumisEctrics commented 7 years ago

Thank you for your reply. I am referring to the case where I set ellipse.type = "euclid"; the "norm" type seems to work fine. I look forward to hearing from you.

Kind regards, Giakoumis

kassambara commented 7 years ago

Hi,

Ok, I can see now.

This is the expected behavior of the underlying function, ggplot2::stat_ellipse().

In the documentation of factoextra::fviz and ggplot2::stat_ellipse(), you can find the following explanation:

type = "euclid": draws a circle with the radius equal to level (0.95, 0.66, etc), representing the euclidean distance from the center. This ellipse probably won't appear circular unless coord_fixed() is applied.

This means that type = "euclid" always draws the same shape for every group: a circle whose radius is level, measured in the units of the plotted axes. Only the center changes from group to group.

See also:

library(ggplot2)
ggplot(iris, aes(Sepal.Length, Petal.Length))+
  geom_point(aes(color = Species))+
  stat_ellipse(aes(color = Species), type = "euclid", level = 0.95)

[Figure: plot produced by the code above]
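To make the explanation above concrete, here is a minimal base R sketch of what type = "euclid" amounts to, assuming the documented behavior (a circle of radius level around each group's center, taken here to be the group mean). The helper name euclid_circle is hypothetical and is not part of ggplot2 or factoextra.

```r
# Hypothetical helper: points on a circle of radius `level`
# centered on the mean of (x, y). Illustration only.
euclid_circle <- function(x, y, level = 0.95, segments = 100) {
  angles <- seq(0, 2 * pi, length.out = segments + 1)
  data.frame(
    x = mean(x) + level * cos(angles),
    y = mean(y) + level * sin(angles)
  )
}

# One circle per species, using the same variables as the iris example
circles <- lapply(split(iris, iris$Species), function(d) {
  euclid_circle(d$Sepal.Length, d$Petal.Length)
})

# Horizontal extent of each circle: 2 * level for every group
widths <- sapply(circles, function(circ) diff(range(circ$x)))
print(widths)
```

Under this reading, every group gets a circle of the same size, because the spread of the group's points never enters the calculation.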

giakoumisEctrics commented 7 years ago

Hi,

Thank you for the plot; this example illustrates why I consider this to be a bug. I understand that overall after drawing all the ellipses 95% of the total observations are left off.

However, in my view, confidence ellipses should be computed from each individual group rather than from the entire sample (especially in the MCA setting). For example, the ellipse for the red group covers not 95% but 100% of its points, while the ellipse for the blue group covers less than 95%. This is why I regard these ellipses as misleading.
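One way to check this numerically is to count, for each species in the iris example, the share of points lying within 0.95 units of the group mean. This is a base R sketch that assumes the circle is centered on the group mean with radius equal to level; coverage_within is a hypothetical helper name, not a function from ggplot2 or factoextra.

```r
# Hypothetical helper: fraction of a group's points that lie within
# `radius` (Euclidean distance) of the group mean
coverage_within <- function(x, y, radius = 0.95) {
  d <- sqrt((x - mean(x))^2 + (y - mean(y))^2)
  mean(d <= radius)
}

# Per-species coverage for the variables used in the iris plot
coverage <- sapply(split(iris, iris$Species), function(g) {
  coverage_within(g$Sepal.Length, g$Petal.Length)
})
print(round(coverage, 2))
```

If the shares differ between groups, a fixed-radius circle cannot be a 95% region for each of them.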

What I think should be done is to first find the center of each group and then draw an ellipse specific to that group using the Euclidean distance.

More specifically for your package: when ellipse.type = "norm", the ellipses are as expected (see the poison example above), whereas when ellipse.type = "euclid", the ellipses are identical for all groups, so the ellipse.type setting does not behave consistently.
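The proposal above (a center per group, then a group-specific region based on the Euclidean distance) could be sketched as follows in base R. This is only one possible reading: it assumes the radius is the empirical 95% quantile of the distances from the group mean, which differs from the fixed-radius behavior described in the stat_ellipse() documentation quoted earlier. The name group_radius is hypothetical.

```r
# Hypothetical helper: radius that contains roughly `level` of a group's
# points, measured as Euclidean distance from the group mean
group_radius <- function(x, y, level = 0.95) {
  d <- sqrt((x - mean(x))^2 + (y - mean(y))^2)
  unname(quantile(d, probs = level))
}

# One radius per species, using the variables from the iris example
radii <- sapply(split(iris, iris$Species), function(g) {
  group_radius(g$Sepal.Length, g$Petal.Length)
})
print(round(radii, 2))
```

Under this scheme, tighter groups get smaller circles, which is the group-specific behavior requested here.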

Best, Giakoumis