series (by color, linetype or shape) only use subset of available levels

davidswelt commented 8 years ago

When I map a categorical column to color, linetype or shape, only a small number of its levels are actually used in the plot for my dataset. That is unexpected and differs from ggplot2 in R, which works as documented (but prints a warning that the Blues palette has only 9 colors). scale_color_brewer is not documented as having a limit on the number of levels, and other palettes do not seem to improve the situation. The documentation for scale_color_brewer shows examples with 8 series.

Python 3.5.1, ggplot 0.6.8

from pandas import *
from ggplot import *
t = read_csv("city-speed.csv")   # I am not allowed to attach my csv file here...
t.Loc2 = t.Loc2.astype(str)  # 'category' will yield "unorderable types" error when plotting!
print(t.Loc2.nunique())  # count distinct levels; the .cat accessor is unavailable after astype(str)
t.groupby('Loc2').count()
# 27 levels are available for Loc2
ggplot(aes(x='Time', y='Speed', color='Loc2'), data=t) + geom_point() + scale_colour_brewer()
# 3 values are used for Loc2
ggplot(aes(x='Time', y='Speed', linetype='Loc2'), data=t) + geom_point()
# four values are assumed for Loc2
ggplot(aes(x='Time', y='Speed', shape='Loc2'), data=t) + geom_point()
# nine values
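
Without the CSV at hand, the setup can be approximated with synthetic data. The following is only a minimal sketch: the column names Time, Speed and Loc2 and the 27 levels come from the report above, while the row count, value ranges and level labels are made up for illustration and do not reflect the real city-speed.csv.

```python
import random

import pandas as pd

random.seed(0)  # fixed seed so the frame is identical on every run

N_LEVELS = 27        # number of Loc2 levels stated in the report
ROWS_PER_LEVEL = 10  # arbitrary choice for this sketch
N_ROWS = N_LEVELS * ROWS_PER_LEVEL

# Hypothetical stand-in for city-speed.csv; values are invented.
t = pd.DataFrame({
    "Time": [i % 24 for i in range(N_ROWS)],
    "Speed": [random.uniform(10, 60) for _ in range(N_ROWS)],
    "Loc2": ["loc%02d" % (i % N_LEVELS) for i in range(N_ROWS)],  # plain str levels
})

# Confirm the number of distinct levels before plotting.
n_levels = t["Loc2"].nunique()
print(n_levels)

# The plots from the report can then be drawn against this frame, e.g.:
# ggplot(aes(x='Time', y='Speed', color='Loc2'), data=t) + geom_point() + scale_colour_brewer()
```

Comparing n_levels with the number of distinct colors, linetypes or shapes in the resulting legend shows whether levels are being dropped.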

glamp commented 8 years ago

Hey, thanks for submitting an issue. Can you provide a reproducible example?

davidswelt commented 8 years ago

city-speed.csv is attached. It should reproduce with that.

city-speed.csv.zip

yhat / ggpy

series (by color, linetype or shape) only use subset of available levels #481