rasbt / mlxtend

A library of extension and helper modules for Python's data analysis and machine learning libraries.
https://rasbt.github.io/mlxtend/

Use SciPy SVD instead of NumPy SVD. #1078

Closed: rasbt closed this issue 3 months ago

rasbt commented 5 months ago

As noted here, NumPy's SVD can occasionally return incorrect results, so it is better to use scipy.linalg.svd(cov, lapack_driver='gesvd') instead.

SVD is used in the PCA class here: https://github.com/rasbt/mlxtend/blob/master/mlxtend/feature_extraction/principal_component_analysis.py
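
For illustration, a minimal sketch of the proposed call on a covariance matrix. The random data, the variable names, and the reconstruction check are assumptions made for this example and are not taken from the mlxtend PCA class:

```python
import numpy as np
from scipy import linalg

# Hypothetical data for illustration: 100 samples, 4 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))

# Covariance matrix of the features (4 x 4, symmetric).
cov = np.cov(X, rowvar=False)

# Request the 'gesvd' LAPACK driver explicitly;
# scipy.linalg.svd uses 'gesdd' when no driver is given.
u, s, vt = linalg.svd(cov, lapack_driver="gesvd")

# Sanity check: the three factors should multiply back to the input.
reconstructed = (u * s) @ vt
```

The singular values in s come back sorted in descending order, so the leading columns of u correspond to the directions of largest variance.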

fkdosilovic commented 5 months ago

There are potential performance benefits as well. From SciPy's documentation:

... advantage of using scipy.linalg over numpy.linalg is that it is always compiled with BLAS/LAPACK support, while for NumPy this is optional. Therefore, the SciPy version might be faster depending on how NumPy was installed.

So the class could also benefit from using SciPy's eigendecomposition.
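
A rough sketch of what a SciPy-based eigendecomposition could look like. It assumes the input is a symmetric covariance matrix and uses scipy.linalg.eigh; that routine is a choice made for this example, since the thread does not say which one the class would use, and the data and variable names are hypothetical:

```python
import numpy as np
from scipy import linalg

# Hypothetical data for illustration: 100 samples, 4 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
cov = np.cov(X, rowvar=False)

# eigh is intended for symmetric/Hermitian matrices and
# returns real eigenvalues in ascending order.
eig_vals, eig_vecs = linalg.eigh(cov)

# Reorder so the largest eigenvalue (most variance) comes first.
order = np.argsort(eig_vals)[::-1]
eig_vals = eig_vals[order]
eig_vecs = eig_vecs[:, order]
```

For a symmetric positive semi-definite matrix such as a covariance matrix, these eigenvalues should match the singular values returned by the SVD, which gives a simple cross-check between the two solvers.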