Dear Rendiere,
It's great to read your enthusiasm for the pca library! Thank you for pointing out these issues. I fixed the length of the arrow and the number of features, and added some of the missing docstrings.
Update the library with: pip install -U pca
The version should be >= 1.07. Check the version with:
import pca
pca.__version__
from pca import pca
import pandas as pd
model = pca(normalize=True)
# Dataset
df = pd.read_csv('usarrest.txt')
# Setup dataset
X = df[['Murder','Assault','UrbanPop','Rape']].astype(float)
X.index = df['state'].values
# Fit transform
out = model.fit_transform(X)
out['topfeat']
# Make plot
ax = model.biplot(n_feat=4, legend=False)
ax = model.biplot3d(n_feat=4, legend=False)
Hi @erdogant , thanks for the speedy turnaround! Can confirm that the fixes work for me as well. Love the added colours too.
I'm recreating figure 10.1 from the book Introduction to Statistical Learning with this library, specifically creating a 2-d biplot from the USA Arrests dataset.
However, when creating a biplot, only the first 2 loading vectors are displayed, irrespective of what I pass to n_features. In addition, although a separate issue, the loading vector label is plotted outside the scale of the plot.
From a quick scan of the code, it looks like the issue is in the compute_topfeat method, where n_feat never gets taken into account, but rather n_pcs gets iterated over twice.
P.S. Great work on this library. Exactly what I was looking for when googling "PCA biplots python". For that reason, I wouldn't mind helping out with maintaining this library if needs be.
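To make the n_feat point concrete, here is a minimal sketch of how a feature count could limit the loading vectors drawn in a 2-d biplot. It is not the library's compute_topfeat code: the function name top_features, the row-per-component layout of the loadings, and the numbers themselves are all assumptions made up for illustration.

```python
import math


def top_features(loadings, feature_names, n_feat):
    """Return the n_feat feature names with the longest arrows in the PC1/PC2 plane.

    loadings is a list of rows, one row per principal component and one
    value per feature. This layout is hypothetical, not the pca library's.
    """
    if len(loadings) < 2:
        raise ValueError("need at least two components for a 2-d biplot")
    # Never ask for more features than exist.
    n_feat = min(n_feat, len(feature_names))
    # Arrow length of each feature, using only PC1 and PC2.
    lengths = [
        math.hypot(loadings[0][j], loadings[1][j])
        for j in range(len(feature_names))
    ]
    # Rank feature indices by arrow length, longest first.
    ranked = sorted(range(len(feature_names)), key=lambda j: lengths[j], reverse=True)
    return [feature_names[j] for j in ranked[:n_feat]]


# Made-up loadings for illustration only (not computed from the USArrests data).
loadings = [
    [0.5, 0.6, 0.3, 0.5],    # PC1
    [0.4, 0.2, -0.9, -0.1],  # PC2
]
names = ['Murder', 'Assault', 'UrbanPop', 'Rape']

selected = top_features(loadings, names, n_feat=4)
```

The key design point is that the loop runs over features, not over components, so asking for 4 features yields 4 arrows even though only 2 components are plotted.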