Research on the possible relationship between sexual dissatisfaction and quality of life in older people from the English Longitudinal Study of Ageing
Is your feature request related to a problem? Please describe.
Factor Analysis (FA) is an exploratory data analysis method used to search for influential underlying factors, or latent variables, in a set of observed variables. It helps in data interpretation by reducing the number of variables. It extracts the maximum common variance from all variables and puts it into a common score.
Describe the solution you'd like
The following assumptions need to be checked:
[ ] There are no outliers in data.
[ ] Sample size should be greater than the number of factors.
[ ] There should not be perfect multicollinearity.
[ ] There should not be homoscedasticity between the variables.
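As a minimal sketch of how these assumptions might be checked in code (using the iris dataset as a hypothetical stand-in for the study data, since the loader is imported below), outliers can be flagged with z-scores and perfect multicollinearity detected via the determinant of the correlation matrix:

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris

# Hypothetical dataset used only to illustrate the checks
iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)

# Outliers: flag observations more than 3 standard deviations from the mean
z_scores = (df - df.mean()) / df.std()
n_outliers = int((z_scores.abs() > 3).any(axis=1).sum())
print("observations flagged as outliers:", n_outliers)

# Sample size should comfortably exceed the number of variables (and factors)
print("sample size:", len(df), "variables:", df.shape[1])

# Perfect multicollinearity makes the correlation matrix singular,
# so its determinant would be (near) zero
det = np.linalg.det(df.corr())
print("determinant of correlation matrix:", det)
```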
Adequacy Test
Before performing factor analysis, you need to evaluate the "factorability" of your dataset. Factorability means: can we find factors in the dataset? There are two methods to check factorability or sampling adequacy:
Bartlett’s Test of sphericity checks whether the observed variables intercorrelate at all, by testing the observed correlation matrix against the identity matrix. If the test is statistically insignificant, you should not employ factor analysis. If the p-value of Bartlett’s test is 0, the test is statistically significant, indicating that the observed correlation matrix is not an identity matrix.
The Kaiser-Meyer-Olkin (KMO) Test measures the suitability of data for factor analysis. It determines the adequacy for each observed variable and for the complete model. KMO estimates the proportion of variance among all the observed variables; a lower proportion is more suitable for factor analysis. KMO values range between 0 and 1. A KMO value below 0.6 is considered inadequate, while a value above 0.8 is excellent and means it is possible to proceed with the planned factor analysis.
Choosing the Number of Factors:
The Kaiser criterion is an analytical approach: the factors that explain a more significant proportion of variance are selected. The eigenvalue is a good criterion for determining the number of factors; generally, an eigenvalue greater than 1 is used as the selection criterion for a factor.
The graphical approach is based on a visual representation of the factors' eigenvalues, called a scree plot. The scree plot helps determine the number of factors: look for the point where the curve makes an elbow.
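The Kaiser criterion can be sketched directly from the eigenvalues of the correlation matrix, without any factor-analysis library (again using iris as a placeholder dataset):

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris

# Placeholder dataset for illustration
iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)

# Eigenvalues of the correlation matrix (what the scree plot displays),
# sorted in descending order
eigenvalues = np.linalg.eigvalsh(df.corr())[::-1]
print("eigenvalues:", np.round(eigenvalues, 3))

# Kaiser criterion: retain factors whose eigenvalue exceeds 1
n_factors = int((eigenvalues > 1).sum())
print("factors retained by the Kaiser criterion:", n_factors)
```

The eigenvalues of a correlation matrix always sum to the number of variables, which is a quick sanity check on the computation.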
Factor Extraction: In this step, the number of factors and the extraction approach are selected using variance partitioning methods such as common factor analysis.
Factor Rotation: In this step, the factors are rotated to improve overall interpretability. There are many rotation methods, but in this case, where the variables are expected to correlate with each other, oblique rotations such as direct oblimin, promax, orthoblique, or Procrustes are recommended.
Get the variance of the factor
Describe alternatives you've considered
Import required libraries

```python
import pandas as pd
from sklearn.datasets import load_iris
from factor_analyzer import FactorAnalyzer
import matplotlib.pyplot as plt
```
Loading Data
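The loading step itself is not shown; as a placeholder, the iris dataset (whose loader is already imported above) can stand in for the study variables:

```python
import pandas as pd
from sklearn.datasets import load_iris

# Placeholder: the real analysis would load the ELSA study variables instead
iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
print(df.shape)  # (150, 4)
df.head()
```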
Bartlett’s Test:

```python
from factor_analyzer.factor_analyzer import calculate_bartlett_sphericity
chi_square_value, p_value = calculate_bartlett_sphericity(df)
chi_square_value, p_value
```
Kaiser-Meyer-Olkin (KMO) Test

```python
from factor_analyzer.factor_analyzer import calculate_kmo
kmo_all, kmo_model = calculate_kmo(df)
```
Create factor analysis object and perform factor analysis

```python
# Note: fa.analyze() was removed in newer versions of factor_analyzer;
# the number of factors and rotation are now passed to the constructor
fa = FactorAnalyzer(n_factors=25, rotation=None)
fa.fit(df)
```
Check Eigenvalues

```python
ev, v = fa.get_eigenvalues()
ev
```
Create scree plot using matplotlib

```python
plt.scatter(range(1, df.shape[1] + 1), ev)
plt.plot(range(1, df.shape[1] + 1), ev)
plt.title('Scree Plot')
plt.xlabel('Factors')
plt.ylabel('Eigenvalue')
plt.grid()
plt.show()
```
Performing Factor Analysis
Get factor variance

```python
fa.get_factor_variance()
```