MaxHalford / prince

:crown: Multivariate exploratory data analysis in Python — PCA, CA, MCA, MFA, FAMD, GPA
https://maxhalford.github.io/prince
MIT License

Deciding the n_components in FAMD or MFA by eigenvectors that capture at least 95%-99% of the total variance? #125

Closed: monk1337 closed this issue 1 year ago

monk1337 commented 2 years ago

Hello Prince developers, I was looking for a way to choose n_components in FAMD or MFA automatically rather than setting the parameter by hand. In PCA we usually keep the eigenvectors that together capture at least 95%-99% of the total variance. Can we do the same for FAMD or MFA?

I would like to open a pull request to contribute this feature.
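For reference, the variance-threshold rule described above can be sketched with NumPy alone. This is a minimal illustration, not Prince's API: the function name `n_components_for_variance` and the example eigenvalues are hypothetical, and it assumes you can obtain the full set of eigenvalues from a model fitted with as many components as possible.

```python
import numpy as np


def n_components_for_variance(eigenvalues, threshold=0.95):
    """Return the smallest k such that the k largest eigenvalues
    account for at least `threshold` of the total variance.

    Hypothetical helper: `eigenvalues` is assumed to hold ALL eigenvalues
    of the decomposition, not just the first few.
    """
    eigenvalues = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    cumulative = np.cumsum(eigenvalues) / eigenvalues.sum()
    # Index of the first cumulative ratio that reaches the threshold.
    k = int(np.searchsorted(cumulative, threshold, side="left")) + 1
    # Guard against floating-point round-off when threshold is 1.0.
    return min(k, len(eigenvalues))


# Made-up eigenvalues, purely for illustration.
example = [5.0, 2.5, 1.5, 0.6, 0.4]
k = n_components_for_variance(example, threshold=0.95)
```

With a real model one would fit once with a large number of components, read off the eigenvalues, and then refit (or truncate) with the returned `k`. Whether the fitted FAMD or MFA object exposes its eigenvalues directly depends on the installed Prince version, so treat that step as an assumption to verify.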

b-thebest commented 2 years ago

I also pointed this out. It takes the number of components to be the number of samples.

jeanbaptisteb commented 2 years ago

Here's a literature review of various methods to select the number of factors to retain (see section 5): https://www.researchgate.net/profile/M-Shabri-Abd-Majid/post/How-can-we-determine-the-sample-size-from-an-unknown-population/attachment/5f1f2f9f4b30fd0001537310/AS%3A917998588153856%401595879327045/download/gaskin2014-sem.pdf#page=4

Parallel analysis would probably be interesting to implement:

Several methods of determining the number of factors to retain have been presented in the literature, including Bartlett’s (1951, 1950) test, Kaiser’s (1960) eigenvalues greater than one rule, Cattell’s (1966) scree test, Velicer’s (1976a) minimum average partial (MAP) rule, Horn’s (1965) parallel analysis, the Hull method (Lorenzo-Seva et al., 2011), and Ruscio and Roche’s (2012) comparison data (CD). Parallel analysis has consistently been shown to have higher levels of accuracy than Bartlett’s test, the eigenvalues greater than one rule, and the scree test (Hubbard and Allen, 1987; Velicer et al., 2000; Zwick and Velicer, 1986). Since the development of parallel analysis, several improvements to this method have been proposed (e.g., Crawford et al., 2010; Glorfeld, 1995).
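To make the idea concrete, here is a rough sketch of Horn's parallel analysis for plain PCA on numeric data. It is not tied to Prince, and it does not cover the mixed or grouped variables that FAMD and MFA handle. The function name `parallel_analysis`, its parameters, and the choice of the 95th percentile as the reference are assumptions made for illustration.

```python
import numpy as np


def parallel_analysis(X, n_iter=100, percentile=95, seed=0):
    """Horn-style parallel analysis on the correlation matrix of X.

    Keeps the leading components whose eigenvalue exceeds the chosen
    percentile of eigenvalues obtained from random normal data of the
    same shape. Hypothetical helper, numeric columns only.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    rng = np.random.default_rng(seed)

    # Eigenvalues of the observed correlation matrix, largest first.
    observed = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))[::-1]

    # Eigenvalues of correlation matrices of pure-noise data.
    random_eigs = np.empty((n_iter, p))
    for i in range(n_iter):
        noise = rng.standard_normal((n, p))
        random_eigs[i] = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))[::-1]
    reference = np.percentile(random_eigs, percentile, axis=0)

    # Count leading components until the first one that fails the test.
    exceeds = observed > reference
    return p if exceeds.all() else int(np.argmin(exceeds))


# Synthetic data with two latent factors driving six observed variables.
rng = np.random.default_rng(42)
factors = rng.standard_normal((500, 2))
loadings = np.array([[1, 1, 1, 0, 0, 0],
                     [0, 0, 0, 1, 1, 1]], dtype=float)
X_demo = factors @ loadings + 0.5 * rng.standard_normal((500, 6))
n_keep = parallel_analysis(X_demo)
```

Extending this to FAMD or MFA would mean generating the random reference data in a way that respects the categorical variables and group weights, which this sketch does not attempt.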

MaxHalford commented 1 year ago

Hello there 👋

I apologise for not answering earlier. I was not maintaining Prince anymore. However, I have just refactored the entire codebase. This refactoring should have fixed many bugs.

I don’t have time and energy to check if this fixes your issue, but there is a good chance it does. Feel free to reopen this issue if the problem persists after installing the new version — that is, version 0.8.0 and onwards.