erdogant / pca

pca: A Python Package for Principal Component Analysis.
https://erdogant.github.io/pca
MIT License

request: pass column labels explicitly to biplot #7

Closed. jcpeterson closed this issue 3 years ago

jcpeterson commented 3 years ago

For example, when a NumPy matrix is given instead of a pandas DataFrame.

erdogant commented 3 years ago

Thank you for your input! Can you give an example where/how the DataFrame is given as an input?

jcpeterson commented 3 years ago

Maybe I just haven't gotten the usage down. When I passed a numpy matrix, projections of the original dimensions had no labels, and I wasn't able to figure out how to supply them. Passing in my matrix as a dataframe worked, since it has column names.
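
The workaround described here (wrapping the matrix in a DataFrame so the columns carry names) can be sketched roughly as below. This is only an illustration: `with_column_labels` and the labels `f1`..`f5` are hypothetical names made up for this example and are not part of the pca package.

```python
import numpy as np
import pandas as pd


def with_column_labels(X, col_labels):
    """Wrap a 2-D array in a DataFrame so each column carries a name.

    Hypothetical helper for illustration; it is not part of the pca package.
    """
    X = np.asarray(X)
    if X.ndim != 2:
        raise ValueError("X must be 2-dimensional")
    if len(col_labels) != X.shape[1]:
        raise ValueError("need exactly one label per column")
    return pd.DataFrame(X, columns=[str(c) for c in col_labels])


# Random data with the same shape as a typical samples-by-features matrix
rng = np.random.default_rng(0)
X = rng.normal(0, 1, size=(100, 5))
df = with_column_labels(X, ["f1", "f2", "f3", "f4", "f5"])

# The labeled DataFrame can then be passed in place of the raw array, e.g.:
# model = pca(alpha=0.05, n_std=2)
# out = model.fit_transform(df)
# model.biplot(legend=False)
```

The length check is there so that a mismatch between labels and columns fails early with a readable message rather than inside the plotting step.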

erdogant commented 3 years ago

Below is an example with a DataFrame as input and, separately, one with a NumPy array plus labels as input. Is this what you mean?

import numpy as np
import pandas as pd
from pca import pca

# Initialize
model = pca(alpha=0.05, n_std=2)

# Example with NumPy array: 100 samples, 5 features
X = np.random.normal(0, 1, (100, 5))
row_labels = np.arange(0, X.shape[0]).astype(str)
# Fit transform
out = model.fit_transform(X, row_labels=row_labels)
# Make plot
model.biplot(legend=False)

# Example with DataFrame: column names come from the DataFrame itself
X = pd.DataFrame(data=X, columns=np.arange(0, X.shape[1]).astype(str))
# Fit transform
out = model.fit_transform(X)
# Make plot
model.biplot(legend=False)

jcpeterson commented 3 years ago

Hmm ok, I see, thanks