facebookresearch / hiplot

HiPlot makes understanding high dimensional data easy
https://facebookresearch.github.io/hiplot/
MIT License
2.75k stars, 143 forks

Multiple parents for same data point in XY graph #31

Open v-i-s-h opened 4 years ago

v-i-s-h commented 4 years ago

The XY graph for capturing the relationship between parent and child data points is really helpful. But most of the time, we create a new child node by combining two (or more) parent nodes. In the current version, I can see that:

Multiple points can share the same parent.

Is it possible to assign multiple parents to the same child so that the "family tree" can be traced during the experiment?

jrapin commented 4 years ago

For now it's not possible, but I would definitely be interested in it too :D

danthe3rd commented 4 years ago

Hi,

It's indeed something we are interested in doing for future versions. My only worry is about readability. If each point has 2 parents, the whole ancestor tree can contain thousands of points in the early generations, which will make the plot hard to read. So there are several ways to do it:

For your use-case, how many datapoints do you have typically, over how many generations?

Thank you for the suggestion :)

v-i-s-h commented 4 years ago

Hi @danthe3rd, my experiments are on NeuroEvolution and relatively small: up to 64 generations, with populations ranging from 32 to 128 candidates. This could also be an interesting tool for analyzing one experiment at a time.

As you rightly pointed out, if all ancestors are highlighted, readability may be an issue. Right now, the highlighting is done as a thick line. What if, with multiple parents, the highlighting is still thick, but the alpha factor (opacity) is reduced as the distance along the X axis increases? This way, the immediate parents can be easily visualized, and far-away ancestors are highlighted with less visual clutter.
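To make the idea concrete, here is a minimal Python sketch of distance-based fading. It is not HiPlot code: the `parents` mapping, the `x` mapping, the function name, and the `decay` and `min_alpha` parameters are all hypothetical, and the exponential fall-off is just one possible choice.

```python
def faded_ancestor_alphas(parents, x, uid, base_alpha=1.0, decay=0.5, min_alpha=0.05):
    """Return {ancestor_uid: alpha} for every ancestor of `uid`.

    parents: dict mapping a uid to a list of its parent uids (assumed structure).
    x:       dict mapping a uid to its X-axis value (e.g. the generation).
    The opacity shrinks as the X distance to the selected point grows,
    but never drops below `min_alpha`, so far ancestors stay faintly visible.
    """
    alphas = {}
    stack = [uid]
    seen = {uid}  # guards against visiting shared ancestors twice
    while stack:
        node = stack.pop()
        for parent in parents.get(node, []):
            if parent in seen:
                continue
            seen.add(parent)
            distance = abs(x[uid] - x[parent])
            alphas[parent] = max(min_alpha, base_alpha * decay ** distance)
            stack.append(parent)
    return alphas


# Tiny example: "c" is bred from "a" and "b", which both descend from "r".
parents = {"c": ["a", "b"], "a": ["r"], "b": ["r"]}
x = {"r": 0, "a": 1, "b": 1, "c": 2}
print(faded_ancestor_alphas(parents, x, "c"))
```

With these toy values, the immediate parents get an opacity of 0.5 and the grandparent 0.25, so the closest relatives stand out the most.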

danthe3rd commented 4 years ago

Oh, that sounds like a good solution! More generally (because the X axis can be arbitrary, and not always increasing), we could divide the alpha factor by the number of parents (or a function of it), which would preserve the current behavior for points with at most one parent. I'll give it a try in the coming weeks and let you know later :)
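One way to read this proposal is sketched below in plain Python. This is only an illustration under stated assumptions, not the HiPlot implementation: the `parents` mapping and the function name are hypothetical, the lineage is assumed to be acyclic, and the opacity is assumed to be split again at every generation.

```python
def parent_edge_alphas(parents, uid, alpha=1.0, out=None):
    """Return {(child, parent): alpha} for every edge in the ancestry of `uid`.

    parents: dict mapping a uid to a list of its parent uids (assumed structure).
    At each point, the incoming alpha is divided by the number of parents,
    so a point with a single parent passes its alpha on unchanged.
    Assumes the lineage has no cycles.
    """
    if out is None:
        out = {}
    node_parents = parents.get(uid, [])
    if not node_parents:
        return out
    share = alpha / len(node_parents)
    for parent in node_parents:
        edge = (uid, parent)
        # If an edge is reachable through several paths, keep the strongest alpha.
        out[edge] = max(out.get(edge, 0.0), share)
        parent_edge_alphas(parents, parent, share, out)
    return out


# Two parents: each edge out of "c" gets half of the opacity.
print(parent_edge_alphas({"c": ["a", "b"], "a": ["r"], "b": ["r"]}, "c"))

# Single-parent chain: every edge keeps alpha 1.0, as with the current highlighting.
print(parent_edge_alphas({"c": ["b"], "b": ["a"]}, "c"))
```

Because the division depends only on the parent count, it works for any choice of X axis, which is the point made above.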