ben-j-barlow / clustree

In development: Visualize clusterings at different resolutions
GNU General Public License v3.0

KeyError: 'res' #7

Closed by Jooolioh 1 year ago

Jooolioh commented 1 year ago

Hello. I'm sorry to be bothering you. I'm trying to use this library but I can't really figure out what to do. I've tried with my own dataset and with the classic iris dataset, but I keep getting the error KeyError: 'res' from the _config.py file.

This is the code I'm using to launch clustree:

import pandas as pd
from sklearn import datasets
from sklearn.cluster import KMeans
from clustree import clustree

data = datasets.load_iris()
X = data.data[:, :2]
y = data.target

data_final = pd.DataFrame()

for k in range(1, 24):
    km = KMeans(
        n_clusters=k, init='random',
        n_init=10, max_iter=300,
        tol=1e-04, random_state=0
    )
    y_km = km.fit_predict(X=X, y=y)
    col = "K" + str(k)
    data_final[col]=y_km

print(data_final.head())

ct = clustree(data=data_final,
              prefix="K",
              draw=True,
              output_path="C:\\Users\\G\\PycharmProjects\\S_T_Error\\ClusTree",
              images="C:\\Users\\G\\PycharmProjects\\S_T_Error\\ClusTree")

This is the full error log:

Traceback (most recent call last):
  File "C:\Users\G\PycharmProjects\S_T_Error\ClusTree\clustreerun.py", line 46, in <module>
    ct = clustree(data=data_final,
  File "C:\Users\G\PycharmProjects\S_T_Error\venv\lib\site-packages\clustree\_graph.py", line 152, in clustree
    config = ClustreeConfig(
  File "C:\Users\G\PycharmProjects\S_T_Error\venv\lib\site-packages\clustree\_config.py", line 73, in __init__
    self.set_node_color(
  File "C:\Users\G\PycharmProjects\S_T_Error\venv\lib\site-packages\clustree\_config.py", line 150, in set_node_color
    self.node_cf[node_id]["node_color"] = mpl.colors.to_rgba(f"C{attr['res']}")
KeyError: 'res'

There is probably something wrong on my side. Could you maybe give me some suggestions? Even a simple snippet with an example of a working run would be great so that I can figure it out on my own, I don't want to bother anyone.

Thank you so much

ben-j-barlow commented 1 year ago

Hey,

Thanks for using the package! It's nice to see it used by organisations beyond the School of Informatics at the University of Edinburgh (who hired me to build it)!

The issue arose because the parameter min_cluster_number defaulted to 1, while you were passing cluster numbers taking values (0, ..., K-1) [instead of (1, ..., K)], where K is a fixed cluster resolution. To resolve it, you could have set min_cluster_number = 0.
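
To make the mismatch concrete, here is a minimal sketch in plain Python. It does not reproduce the package's internals; the helper expected_labels is hypothetical and only illustrates which labels are assumed for a resolution K under a given min_cluster_number.

```python
def expected_labels(k, min_cluster_number=1):
    """Labels assumed for resolution k: min_cluster_number, ..., min_cluster_number + k - 1.

    Hypothetical helper for illustration only; not part of clustree.
    """
    return set(range(min_cluster_number, min_cluster_number + k))


# Labels as scikit-learn's KMeans.fit_predict returns them for K = 3: values 0..2.
observed = {0, 1, 2}

# With the old default of 1, label 3 is expected but never appears,
# and label 0 appears but is not expected.
print(expected_labels(3) - observed)  # {3}
print(observed - expected_labels(3))  # {0}

# With min_cluster_number = 0 the two sets agree.
print(expected_labels(3, min_cluster_number=0) == observed)  # True
```

An alternative to changing the parameter is to shift the labels themselves so that they start at 1, for example by adding 1 to every cluster column before passing the data in.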

Your experience highlighted a mismatch between what users expect and how the package behaved, so I have released a new version (0.2.1) in which min_cluster_number is detected automatically. It can also be passed explicitly if you prefer.
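
One way such automatic detection could work is sketched below. This is an assumption about the general idea, not the code shipped in 0.2.1: the function detect_min_cluster_number is hypothetical, and the data is a plain dict of column name to label list standing in for the DataFrame.

```python
def detect_min_cluster_number(columns, prefix):
    """Return the smallest cluster label in any column whose name starts with prefix.

    Hypothetical helper for illustration only; not part of clustree.
    """
    labels = [
        label
        for name, values in columns.items()
        if name.startswith(prefix)
        for label in values
    ]
    if not labels:
        raise ValueError(f"no columns start with prefix {prefix!r}")
    return min(labels)


# Toy stand-in for the K-means output, with labels running 0..K-1.
# The "species" column lacks the prefix and is ignored.
assignments = {
    "K1": [0, 0, 0, 0],
    "K2": [0, 1, 1, 0],
    "K3": [2, 0, 1, 2],
    "species": [9, 9, 9, 9],
}

print(detect_min_cluster_number(assignments, "K"))  # 0
```

Labels that already start at 1 would yield 1 from the same check, so either numbering convention is handled without the user setting anything.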

Apologies for the delay in getting back to you! I have been in the middle of university exam season.

Thanks

Jooolioh commented 1 year ago

Hey, sorry, I forgot to reply unfortunately.

I wanted to thank you for your help in solving this problem, and to thank you once more for taking the time to create this library.

It has been working great.