Closed koutoftimer closed 1 year ago
Can you show the plot with your approach?
out20-3.pdf This is what the output looks like. No black colors and no colors swapped between plots.
It is invoked roughly like this:
RED = [1., 0., 0.]
GREEN = [0., 1., 0.]
model.biplot(
    y=df['target'],  # list of 'Y' and 'N' values or any other qualifiers
    fixed_colors={'Y': GREEN, 'N': RED},
)
I created an update to manually specify colors.
Again, can you first install from the GitHub source?
pip install git+https://github.com/erdogant/pca
The input parameters should be straightforward:
from sklearn.datasets import load_iris
import numpy as np  # needed for np.repeat below
import pandas as pd
from pca import pca
import matplotlib as mpl
import colourmap
y=load_iris().target
# Initialize
model = pca(n_components=3, normalize=True)
# Dataset
X = pd.DataFrame(data=load_iris().data, columns=load_iris().feature_names, index=y)
# Fit transform
out = model.fit_transform(X)
# Plot manually specified colors, where c is a list of RGB colors with one entry per sample.
c = colourmap.fromlist(load_iris().target, cmap='Set2')[0]
c[0] = [0,0,0]
model.biplot(c=c, legend=False, label=False)
# In case all class labels are the same, still use the cmap colors if provided.
y1 = np.repeat(0, len(y))
model.biplot(y=y1, cmap=mpl.colors.ListedColormap(['green', 'red', 'blue']))
# Color on classlabel (Unchanged)
model.biplot()
# Use cmap colors for classlabels (unchanged)
model.biplot(y=load_iris().target, cmap=mpl.colors.ListedColormap(['green', 'red', 'blue']))
# Do not show points when cmap=None (unchanged)
model.biplot(y=load_iris().target, cmap=None)
# Plot all points as unique entity (unchanged)
model.biplot(y=None, legend=False, label=False)
IDK, maybe this issue should be closed. I really have no motivation for it right now, and it looks like you didn't get it.
I'm not sure, but I believe that cmap is not a direct mapping of labels to colors. If you have labels A, B, C, then A gets the first color, B the second, etc. If you have labels B, C, D, then B gets the first color. This way you lose consistency if you want B to be the same color across all plots.
That is why I'm using the fixed_colors parameter, as you can see in the very first message (update part).
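To illustrate the concern, here is a minimal sketch of position-based color assignment. It is not the library's actual code: `positional_colors` is a hypothetical helper that hands out palette entries to the sorted unique labels in order, which is the behaviour described above.

```python
def positional_colors(labels, palette):
    """Assign palette entries to the sorted unique labels by position.

    Hypothetical helper: it only mimics an "n-th label gets n-th color" scheme.
    """
    unique = sorted(set(labels))
    return {label: palette[i % len(palette)] for i, label in enumerate(unique)}


palette = ["green", "red", "blue"]

first = positional_colors(["A", "B", "C"], palette)
second = positional_colors(["B", "C", "D"], palette)

# The same label ends up with a different color in each dataset.
print(first["B"], second["B"])  # prints "red green"
```

Under this scheme, the color of B depends on which other labels happen to be present, so it cannot stay stable across plots.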
OK, I am closing this issue. Note that the input parameter c specifies the color of each sample individually, so the colors can now be set exactly as you want.
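A fixed label-to-color mapping can probably be built on top of the per-sample c parameter. The sketch below assumes c accepts one RGB triple per sample, as stated above. `colors_from_labels` and its `default` argument are hypothetical names and not part of the pca package.

```python
import numpy as np

RED = [1.0, 0.0, 0.0]
GREEN = [0.0, 1.0, 0.0]


def colors_from_labels(labels, fixed_colors, default=(0.5, 0.5, 0.5)):
    """Return an (n_samples, 3) RGB array built from a fixed label -> color dict.

    Hypothetical helper: labels missing from fixed_colors get the default grey.
    """
    return np.array([fixed_colors.get(label, default) for label in labels], dtype=float)


labels = ["Y", "N", "N", "Y"]
c = colors_from_labels(labels, {"Y": GREEN, "N": RED})

# c now holds one RGB row per sample and could be passed on as in the thread:
# model.biplot(c=c, legend=False, label=False)
```

Because the mapping is an explicit dictionary, 'Y' and 'N' keep the same colors in every dataset, whichever labels are present.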
Problem
The current approach is fine if you only want to distinguish different categories, but it doesn't allow consistent colors across different datasets.
I haven't found a workaround.
UPD: https://github.com/erdogant/pca/compare/master...koutoftimer:pca:master It is not good by any means, but at least it works.