Open pstumps opened 5 years ago
It wants a numpy structured array, which is perhaps a trickier thing to deal with. It is certainly possible to reverse engineer the appropriate structured array, but it is not trivial. Fortunately, I believe the pandas column names are the relevant structured array record names, so that provides a start. You'll want to look at the structured array documentation or this tutorial and figure out how to construct the right thing.
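As a rough starting point, here is a minimal sketch of building such a structured array from a dataframe. The field names and dtypes (`parent`, `child`, `lambda_val`, `child_size`) are my assumption about what the condensed tree records look like, based on the column names `to_pandas()` produces; the helper name `dataframe_to_condensed_array` and the sample values are made up for illustration, so check them against your own .csv.

```python
import io

import numpy as np
import pandas as pd

# Stand-in for the saved file; in practice you would call pd.read_csv on your .csv.
# The values below are invented purely for illustration.
csv_text = """parent,child,lambda_val,child_size
5,6,0.5,3
5,7,0.5,2
6,0,1.0,1
6,1,1.0,1
6,2,1.25,1
7,3,2.0,1
7,4,2.0,1
"""
df = pd.read_csv(io.StringIO(csv_text))

# Assumed record layout of the condensed tree (not confirmed in this thread).
CONDENSED_DTYPE = np.dtype([
    ("parent", np.intp),
    ("child", np.intp),
    ("lambda_val", np.float64),
    ("child_size", np.intp),
])


def dataframe_to_condensed_array(frame):
    """Hypothetical helper: copy the named columns into a structured array."""
    arr = np.empty(len(frame), dtype=CONDENSED_DTYPE)
    # Selecting by name ignores any extra index column the .csv may carry.
    for name in CONDENSED_DTYPE.names:
        arr[name] = frame[name].to_numpy()
    return arr


condensed = dataframe_to_condensed_array(df)
print(condensed.dtype)
print(condensed[:3])

# If the layout assumption holds, this array is what you would try passing to
# hdbscan.plots.CondensedTree(condensed) before calling plot().
```

Filling the array field by field avoids relying on how a plain 2-D float array would be interpreted, which is likely where passing the raw .csv values goes wrong.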
Thanks a lot for getting back to me; it's much appreciated. I read through the links you sent me, and I believe I was able to get the data into a structured array by using the pandas `to_records()` method.
I assume this is also non-trivial, but is it possible to reverse engineer the linkage matrix from these data?
I believe at best you can only recover an approximation of the original linkage data from the condensed tree; some information was lost along the way. The smaller your `min_cluster_size`, the less information was lost.
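To make the information loss a bit more concrete, here is a small sketch of what can and cannot be read back out of a condensed tree. It rests on two assumptions of mine that are not stated in this thread: that rows with `child_size == 1` are single points leaving their parent cluster, and that `lambda_val` is the reciprocal of the merge distance. The sample rows are invented.

```python
import numpy as np

# Invented condensed tree rows: (parent, child, lambda_val, child_size).
condensed = np.array(
    [
        (5, 6, 0.5, 3),
        (5, 7, 0.5, 2),
        (6, 0, 1.0, 1),
        (6, 1, 1.0, 1),
        (6, 2, 1.25, 1),
        (7, 3, 2.0, 1),
        (7, 4, 2.0, 1),
    ],
    dtype=[
        ("parent", np.intp),
        ("child", np.intp),
        ("lambda_val", np.float64),
        ("child_size", np.intp),
    ],
)

# Rows whose child is itself a cluster: these splits survive in the condensed tree.
is_split = condensed["child_size"] > 1
splits = condensed[is_split]
points = condensed[~is_split]

# Assumed relation: distance = 1 / lambda_val.
split_distances = 1.0 / splits["lambda_val"]

# Points leaving the same parent at the same lambda cannot be told apart,
# so any merge order among them is not recoverable from these rows alone.
tied_groups = {}
for row in points:
    key = (int(row["parent"]), float(row["lambda_val"]))
    tied_groups.setdefault(key, []).append(int(row["child"]))
ambiguous = {key: members for key, members in tied_groups.items() if len(members) > 1}

print("recoverable split distances:", split_distances)
print("tied point groups:", ambiguous)
```

Any linkage matrix rebuilt from rows like these would have to pick an arbitrary order inside each tied group, which is why the result can only approximate the original.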
Hello, I am attempting to create a dendrogram from previously run data. I have generated a Condensed Tree and converted it to a dataframe using the `to_pandas()` method. I saved that data as a .csv file, and I can no longer re-run my original clustering to generate the condensed tree. I have attempted to initialize these data as a condensed tree by first converting the .csv data into a numpy array (which I believe is the format the Condensed Tree object accepts), then passing it to `hdbscan.plots.CondensedTree(data)`. However, this does not seem to be working, as I receive this error when using `plot()`:

Is there a way to "reverse generate" a dendrogram from this data? How does `plot()` accept the condensed tree object?