mdaines / viz-js

Graphviz in your browser
https://viz-js.com/
MIT License

Adding gvmap into the package #156

Closed by dsl101 5 years ago

dsl101 commented 6 years ago

I'm desperately seeking to replace an antiquated command-line app with a nice web app, and this gets me 50% of the way there, but the app also relies on gvmap to turn the output of running, say, sfdp into coloured clusters. In fact, the current script runs the data through fdp / sfdp / neato, pipes the output to gvmap, then pipes that to neato again. The full script does something like this:

cat file.dot | sfdp -Goverlap=prism -Goutputorder=edgesfirst -Gsize=10,10! | gvmap -s -5 -e | neato -Ecolor="invis" -Goutputorder=edgesfirst -Gsize=10,10! -n2 -T png

Is there any chance viz.js could support this? Or is there an alternative you know of which provides gvmap-like functionality?

StoneCypher commented 5 years ago

whoa, i've never even heard of gvmap, and i've been using graphviz for 15 years

is it possible to show an example of the desired output?

dsl101 commented 5 years ago

Yes, it takes what you might normally get from an sfdp output and produces things like this. Obviously this one is just a test with our team, based on shared interests, but you can imagine when there are 30–40 people, with many individual and shared 'tags', it's a nice way to visualise the clusters within a group.

neato

StoneCypher commented 5 years ago

the actual desire isn't clear to me here. what's stopping you from using a simple voronoi implementation and just doing the clustering yourself with k-means?

i confess those weird borders actually don't look very good to me

dsl101 commented 5 years ago

My main desire is not to have to do my own implementation :). Gvmap gave us pretty much what we wanted but now I need something in the browser. But I’m not wedded to this and if there’s an alternative library you know of that can do this, I’m all ears :)

mdaines commented 5 years ago

This looks interesting, but I'm probably not going to add gvmap to Viz.js. It should be possible to cross-compile the gvmap command using Emscripten, however.

StoneCypher commented 5 years ago

@dsl101 - there are a literal million of them, and I would need to better understand your needs before I could recommend.

If all you need is a voronoi renderer, I chose one ten years ago and I still use it by habit. There's probably a better choice by now but I've never bothered to go find it. I use RHill's voronoi core from his voronoi package. I kind of skip the rest of it because I'm a hipster and ain't nobody got time for that.

It's an implementation of the Fortune algorithm, so it'll generate reasonable layouts, give the delaunay, and it's not going to win any speed contests (though it's realtime on datasets under a thousand points on modern machines.)

It draws to canvas and it's so old that it works in IE5. Amazingly, it also works everywhere today.

It doesn't generate super gorgeous results, but I think they're better than the ones above.

I don't really get your clustering, to be honest. All I know is that a clustering happened and that that's the output. That's like saying "i'm here and I took a vehicle." Okay, was it an airplane, a car, a moose?

I could see that being a k-means clustering, and k-means is pretty straightforward in some libraries.

If that isn't what you need, please start by helping me understand what you actually need. I'm question mark as fuck over here

StoneCypher commented 5 years ago

This is what the results look like

dsl101 commented 5 years ago

Thanks @StoneCypher. I will take a look. The data we feed to sfdp / neato is just a 'distance' measure between each pair of people, based on the number of shared interests they have. In the dot file, the people are the nodes, and each pair has an edge.

If there are, say, 20 interest tags in total, then a pair sharing all 20 would have distance 1, and a pair sharing no interests would have distance 20. There's lots of hand-waving and assumptions made about it, and we're not looking for anything scientific (interests are self-reported, so of course it's hugely influenced by how many interests someone picks).
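As a rough sketch of the scheme just described, the distance and the dot file could be produced along these lines. The input shape, the names `tagDistance` and `toDot`, and the exact formula (one step closer per shared tag, clamped so the minimum is 1) are all assumptions made for illustration; the thread does not say how the real script computes or encodes the distance. Writing it to the edge attribute `len` is likewise a guess: that attribute is read by neato and fdp.

```javascript
// Hypothetical input shape: one entry per person with their self-reported tags.
const people = [
  { name: 'ann', tags: ['maps', 'graphs', 'music'] },
  { name: 'bob', tags: ['graphs', 'music'] },
  { name: 'cat', tags: ['cooking'] },
];

// Assumption: distance drops by one per shared tag, clamped to a minimum of 1,
// so sharing every tag gives 1 and sharing none gives totalTags.
const tagDistance = (a, b, totalTags) => {
  const shared = a.tags.filter(tag => b.tags.includes(tag)).length;
  return Math.max(1, totalTags - shared);
};

// Emit an undirected DOT graph with one edge per pair of people.
// Storing the distance in `len` is an assumption about the file format.
const toDot = (group, totalTags) => {
  const lines = ['graph interests {'];
  for (let i = 0; i < group.length; ++i) {
    for (let j = 0; j < i; ++j) {
      const d = tagDistance(group[i], group[j], totalTags);
      lines.push(`  "${group[i].name}" -- "${group[j].name}" [len=${d}];`);
    }
  }
  lines.push('}');
  return lines.join('\n');
};

console.log(toDot(people, 20));
```

The resulting string is plain DOT, so it can be handed to whichever layout engine is in use.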

The intention is to give the group (typically 20-30 people) a talking point by projecting the resulting 'territory map', and asking them to figure out (a) why they are in their group with the other people, and (b) what defines their territory in terms of subject matter. So to some extent, it doesn't matter exactly what kind of diagram we project, providing people can interpret it quickly, and make some connection with the people they've been 'grouped' with.

I guess it's a good time to have another look round for a simpler solution than our current one, so thanks again for your suggestions.

StoneCypher commented 5 years ago

Yeah I don't think clustering is the right tool. Just by changing the number of dimensions in your cluster space, you're going to get suddenly very different results.

If what you want is landscape position and participation, here's something that's numerically a little cleaner, and also a whole lot easier to implement, and which will also give visually more appealing results:

  1. Make a fixed list of interest topics, N topics long
  2. Have people rate themselves on [-2 disinterest ... +2 interest] scale for each topic
  3. Declare a metric space of N dimensions
  4. Use N-dimensional euclidean distance to measure the distance D between every pair of people
  5. For each person, look at the 5 closest and 5 furthest by D as their tribe sets

Facetiously:


const topics = ['military', 'health', 'science', 'infrastructure', 'low taxes', 'lizardpeople', 'aliens', 'illuminati', 'mind control'];

const candidates = [

  { name: 'abe',    positions: [2,  0,  1, -1, -1, 0, 0, 0, 0] },
  { name: 'dwight', positions: [1, -1,  1,  2, -1, 0, 0, 0, 0] },
  { name: 'ike',    positions: [2,  1,  1,  2, -1, 0, 0, 0, 0] },
  { name: 'ronald', positions: [2, -2, -1,  1,  2, 0, 0, 0, 1] },
  { name: 'george', positions: [1, -2, -1,  0, -1, 0, 0, 1, 0] },
  { name: 'bill',   positions: [1, -1,  0,  0,  0, 0, 1, 0, 0] },
  { name: 'w',      positions: [2, -2, -2, -1, -2, 0, 0, 1, 1] },
  { name: 'barack', positions: [1,  0,  1, -1, -2, 1, 0, 0, 0] }

];

// Pairwise distances, one entry per unordered pair of candidates.
const dists = [];

const sum = arr => arr.reduce( (acc, next) => acc+next, 0 );

// N-dimensional euclidean distance between two equal-length position vectors.
const n_euc = (l,r) => Math.sqrt(sum(l.map( (ul,i) => (ul - r[i]) ** 2 )));

const ci = candidates.length;

// right < left visits each pair exactly once and skips self-comparisons.
for (let left=0; left<ci; ++left) {
  for (let right=0; right<left; ++right) {
    const distance = n_euc(candidates[left].positions, candidates[right].positions);
    dists.push({ left: candidates[left].name, right: candidates[right].name, distance });
  }
}

console.log(dists);

Getting the top and bottom N from that for a given person is trivial
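One way that lookup might go, as a minimal sketch: `nearestAndFurthest` is a hypothetical helper, and `sampleDists` holds made-up distances in the same `{ left, right, distance }` shape the loop above produces, so the block runs on its own.

```javascript
// Made-up distances for illustration only, in the { left, right, distance } shape.
const sampleDists = [
  { left: 'dwight', right: 'abe',    distance: 3 },
  { left: 'ike',    right: 'abe',    distance: 1 },
  { left: 'ike',    right: 'dwight', distance: 2 },
  { left: 'bill',   right: 'abe',    distance: 2 },
];

// Hypothetical helper: the n nearest and n furthest people for one name.
const nearestAndFurthest = (dists, name, n) => {
  const mine = dists
    // Keep only the pairs that involve this person, on either side.
    .filter(d => d.left === name || d.right === name)
    // Normalise each pair so the other person is always in `other`.
    .map(d => ({ other: d.left === name ? d.right : d.left, distance: d.distance }))
    // Ascending by distance: nearest first.
    .sort((a, b) => a.distance - b.distance);

  return {
    nearest: mine.slice(0, n),
    // Take the last n entries, then flip so the furthest comes first.
    furthest: mine.slice(Math.max(mine.length - n, 0)).reverse(),
  };
};

console.log(nearestAndFurthest(sampleDists, 'abe', 1));
```

With fewer than 2n other people in the group, the two lists overlap, which is probably fine for a talking point but worth knowing.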

dsl101 commented 5 years ago

Tx, and weird that you use the very same interest list we do... :).

StoneCypher commented 5 years ago

we all want to say barack

we all want to say lizardpeople

not that many other options