EducationalTestingService / factor_analyzer

A Python module to perform exploratory & confirmatory factor analyses.
GNU General Public License v2.0

Loadings matrix has incorrect shape when using principal method with lapack #137

Open LuciaCam opened 2 weeks ago

LuciaCam commented 2 weeks ago

Bug Description When using the principal method with the lapack SVD instead of the randomized SVD, the loadings matrix returned by FactorAnalyzer is always given in full: it has shape n_cols x n_cols rather than containing only the loadings for the requested n_factors. With the randomized SVD, the shape is correct.

Reproducible Code

import pandas as pd
import numpy as np
from factor_analyzer import FactorAnalyzer

num_rows = 1000
num_cols = 6
df = pd.DataFrame(
    np.random.standard_normal(size=(num_rows, num_cols)), 
    columns=[f'col{i+1}' for i in range(num_cols)])

# shape is correct with randomized
efa = FactorAnalyzer(n_factors=2, rotation='promax', method='principal', svd_method='randomized')
efa.fit(df)
print(efa.loadings_.shape)

# shape is incorrect with lapack
efa = FactorAnalyzer(n_factors=2, rotation='promax', method='principal', svd_method='lapack')
efa.fit(df)
print(efa.loadings_.shape)

Expected behavior The shape of the .loadings_ attribute should be n_cols x n_factors.
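
To illustrate where the extra columns could come from, here is a minimal NumPy-only sketch. It does not use factor_analyzer's actual code; it assumes that the principal method derives loadings from the right singular vectors of the standardized data, and that the full SVD path keeps every vector instead of only the first n_factors. The variable names and the correlation-based loadings step are assumptions for illustration.

```python
import numpy as np

# Illustrative sketch only: assumes loadings are computed as correlations
# between the original columns and the component scores.
rng = np.random.default_rng(0)
n_rows, n_cols, n_factors = 1000, 6, 2

X = rng.standard_normal(size=(n_rows, n_cols))
X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize each column

# A full SVD returns all n_cols right singular vectors.
_, _, Vt = np.linalg.svd(X, full_matrices=False)
full_shape = Vt.shape

# Keeping only the leading n_factors vectors gives the expected width.
Vt_truncated = Vt[:n_factors]
scores = X @ Vt_truncated.T

# Cross-correlation block between the columns of X and the scores.
loadings = np.corrcoef(X, scores, rowvar=False)[:n_cols, n_cols:]

print(full_shape)      # all components retained by the full SVD
print(loadings.shape)  # n_cols x n_factors after truncation
```

If the library's lapack path behaves as assumed here, truncating the singular vectors before computing the loadings (and before any rotation) would give the n_cols x n_factors shape that the randomized path already produces.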

Versions (please complete the following information):

desilinguist commented 2 weeks ago

Thanks for your feedback, @LuciaCam. I will look into this.