graphistry / pygraphistry

PyGraphistry is a Python library to quickly load, shape, embed, and explore big graphs with the GPU-accelerated Graphistry visual graph analyzer
BSD 3-Clause "New" or "Revised" License

[BUG] categorical color encoding does not allow two values to have the same color #549

Open DataBoyTX opened 8 months ago

DataBoyTX commented 8 months ago

Describe the bug

If I use a categorical color encoding and try to set two values to the same color, only one of them uses that color; the other is set to the default color.

To Reproduce

import math

import graphistry
import numpy as np
import pandas as pd

graphistry.register(api=3, personal_key_id='', personal_key_secret='', server='...') 

num_recs=353
num_edges=787

ndf = pd.DataFrame({'ID' : range(num_recs),
                   'tier' : np.random.randint(1,9,num_recs),
                   'FL_risk_exposure' : np.random.randint(0,100,num_recs),
                   'risk_level' : [np.random.choice(['no_risk', 'high_risk', 'med_risk']) for i in range(num_recs)], 
                   'flagged_company' : [np.random.choice(['Yes', 'No']) for i in range(num_recs)], 
                   'rating' : np.random.randint(0,10,num_recs)})

edf = pd.DataFrame({'source' : np.random.choice(10,num_edges),
                    'target' : np.random.choice(10,num_edges)})

tier2len = ndf.tier.value_counts().to_dict()

ndf['x'] = ndf.apply(lambda row: (row['tier']) * math.cos(2*math.pi * row['ID']/tier2len[row['tier']] ), axis=1)
ndf['y'] = ndf.apply(lambda row: (row['tier']) * math.sin(2*math.pi * row['ID']/tier2len[row['tier']]), axis=1)

g3 = (graphistry.addStyle(bg={'color': '#F2F7F8'})
                .nodes(ndf, 'ID')
                .edges(edf, 'source', 'target')
                .settings(url_params={'play': 0,'pointSize': .3,'edgeOpacity': .1}, height=800)
                .encode_axis([
                    {'r': 1, 'space': True, "label": "Tier 1"},
                    {'r': 2, 'space': True, "label": "Tier 2"},
                    {'r': 3, 'space': True, "label": "Tier 3"},
                    {'r': 4, 'space': True, "label": "Tier 4"},
                    {'r': 5, 'space': True, "label": "Tier 5"},
                    {'r': 6, 'space': True, "label": "Tier 6"},
                    {'r': 7, 'space': True, "label": "Tier 7"},
                    {'r': 8, 'space': True, "label": "Tier 8"},
                    {'r': 9, 'space': True, "label": "Tier 9"}])
                .encode_point_color('risk_level', as_categorical=True,
                                    categorical_mapping={
                    # 'no_risk' : 'black',
                    'high_risk' : 'red',
                    'med_risk' : 'red'}, default_mapping='black')
     )

g3.plot()

Expected behavior

I expect med_risk to be red, but it is showing black.

Screenshots

[image]

pygraphistry v0.33.2

lmeyerov commented 8 months ago

I'll do some quick research

Agreed, can be a fun one for @mj3cheun

lmeyerov commented 8 months ago

OK confirmed, looking like a straightforward streamgl-viz backend fix
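
Until a backend fix lands, one possible client-side workaround is to collapse all values that share a color into a single group label, so that each color appears only once in the categorical mapping. The sketch below is untested against a Graphistry server and assumes the bug is triggered only by duplicate colors within one mapping. The helper name collapse_shared_colors, the 'risk_group' column, and the '|'-joined labels are hypothetical and not part of PyGraphistry.

```python
from collections import defaultdict


def collapse_shared_colors(mapping):
    """Group category values by color so each color is used by exactly one label.

    Returns (value_to_group, group_to_color). Hypothetical helper for
    illustration only; it is not part of the PyGraphistry API.
    """
    # Invert the mapping: color -> list of values that want that color.
    color_to_values = defaultdict(list)
    for value, color in mapping.items():
        color_to_values[color].append(value)

    value_to_group = {}
    group_to_color = {}
    for color, values in color_to_values.items():
        # Build one merged label per color, e.g. 'high_risk|med_risk'.
        group = "|".join(str(v) for v in values)
        group_to_color[group] = color
        for value in values:
            value_to_group[value] = group
    return value_to_group, group_to_color


value_to_group, group_to_color = collapse_shared_colors(
    {'high_risk': 'red', 'med_risk': 'red'})

# Hypothetical usage with the reproduction above (assumes ndf exists and has
# not been verified against a live server):
# ndf['risk_group'] = ndf['risk_level'].map(value_to_group).fillna('other')
# ...then encode on 'risk_group' instead of 'risk_level':
# .encode_point_color('risk_group', as_categorical=True,
#                     categorical_mapping=group_to_color,
#                     default_mapping='black')
```

The trade-off is that the merged label, not the original risk_level value, drives the color encoding; the original column stays in the node table for inspection.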