Open soroushysfi opened 5 years ago
Sounds good.
Edge Bundled
For the edge-bundled diagram we need this sort of structure (the names are from a sample I found, not the data we're using):
[
{"name":"flare.analytics.cluster.AgglomerativeCluster","size":3938,"imports":["flare.animate.Transitioner","flare.vis.data.DataList","flare.util.math.IMatrix","flare.analytics.cluster.MergeEdge","flare.analytics.cluster.HierarchicalCluster","flare.vis.data.Data"]},
{"name":"flare.analytics.cluster.CommunityStructure","size":3812,"imports":["flare.analytics.cluster.HierarchicalCluster","flare.animate.Transitioner","flare.vis.data.DataList","flare.analytics.cluster.MergeEdge","flare.util.math.IMatrix"]}
...
]
What they did here is prepend the cluster to each node's name, e.g., flare.analytics.cluster is the cluster and AgglomerativeCluster is the name of the node. The imports property lists the node's outgoing edges (we won't need incoming edges for every node, because if we include all the outgoing edges the incoming ones are already covered). Alternatively, if a separate list mapping nodes to their clusters is provided, that would be fine too; I would aggregate it myself.
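As a rough sketch of how our data could be put into that shape, assuming we have a plain edge list plus a node-to-cluster mapping: the function name, the cluster names ("reddit.media", "reddit.current"), the edges, and the default size of 1 are all made up for illustration and are not our real data.

```python
import json


def to_edge_bundle_records(edges, cluster_of, sizes=None):
    """Build records shaped like the edge-bundling sample.

    edges      -- iterable of (source, target) node names (outgoing edges)
    cluster_of -- dict mapping node name -> dotted cluster path (hypothetical input)
    sizes      -- optional dict mapping node name -> size; defaults to 1
    """
    sizes = sizes or {}

    def full_name(node):
        # The sample prefixes each node name with its cluster path.
        return f"{cluster_of[node]}.{node}"

    imports = {node: [] for node in cluster_of}
    for source, target in edges:
        imports[source].append(full_name(target))

    return [
        {"name": full_name(node), "size": sizes.get(node, 1), "imports": imports[node]}
        for node in sorted(cluster_of)
    ]


# Made-up sample data: subreddit names with invented clusters and edges.
edges = [("pics", "funny"), ("pics", "news"), ("news", "worldnews")]
cluster_of = {
    "pics": "reddit.media",
    "funny": "reddit.media",
    "news": "reddit.current",
    "worldnews": "reddit.current",
}
records = to_edge_bundle_records(edges, cluster_of)
print(json.dumps(records, indent=2))
```

Only outgoing edges are stored per node, matching the point above that incoming edges are implied.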
Chord Diagram
The chord diagram uses a simple matrix input, like a 2D array:
[
[1, 2, 0, 4, 6],
[4, 5, 3, 0, 1],
[4, 5, 3, 0, 1],
[4, 5, 3, 0, 1],
[4, 5, 3, 0, 1]
]
This should be a square matrix. We also need a separate list giving the name of each node, in the same order as the rows:
["node1", "node2", "node3", ...]
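A minimal sketch of producing the square matrix and the matching name list from a weighted edge list. The function name, the edge weights, and the choice to sort names alphabetically are assumptions for illustration only.

```python
def to_chord_matrix(weighted_edges):
    """Return (names, matrix) where matrix[i][j] is the total weight
    of edges going from names[i] to names[j]."""
    names = sorted({node for s, t, _ in weighted_edges for node in (s, t)})
    index = {name: i for i, name in enumerate(names)}

    # Square matrix of zeros, one row and one column per node.
    matrix = [[0] * len(names) for _ in names]
    for source, target, weight in weighted_edges:
        matrix[index[source]][index[target]] += weight
    return names, matrix


# Made-up sample edges: (source, target, weight).
weighted_edges = [("pics", "funny", 4), ("funny", "pics", 2), ("news", "pics", 1)]
names, matrix = to_chord_matrix(weighted_edges)
print(names)
for row in matrix:
    print(row)
```

Row i and column i both refer to names[i], which is what keeps the matrix square.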
Matrix and Node-Link Diagram
For both of these diagrams a JSON containing the nodes and links would do the job:
{
  "nodes": [
    {"name": "node1", "group": 0},
    {"name": "node2", "group": 0},
    {"name": "node3", "group": 0},
    ...
  ],
  "links": [
    {"source": "1", "target": "0", "value": 166},
    {"source": "2", "target": "0", "value": 181},
    {"source": "3", "target": "0", "value": 79},
    {"source": "4", "target": "0", "value": 3},
    ...
  ]
}
The group and value properties are optional. We can map a property from the data, such as average word count, to the link value. If we have the clustering, we can use each node's cluster as its group.
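A sketch of building that nodes/links structure from a weighted edge list, assuming source and target refer to positions in the nodes list. The function name, sample weights, and group numbers are invented; the indices are emitted as integers here, so wrap them in str() if the front end expects strings as in the sample above.

```python
import json


def to_node_link(weighted_edges, group_of=None):
    """Build {"nodes": [...], "links": [...]} from (source, target, value) edges.

    group_of -- optional dict mapping node name -> group id (e.g. a cluster);
                nodes without an entry get group 0.
    """
    group_of = group_of or {}
    names = sorted({node for s, t, _ in weighted_edges for node in (s, t)})
    index = {name: i for i, name in enumerate(names)}

    nodes = [{"name": name, "group": group_of.get(name, 0)} for name in names]
    links = [
        {"source": index[s], "target": index[t], "value": value}
        for s, t, value in weighted_edges
    ]
    return {"nodes": nodes, "links": links}


# Made-up sample data.
weighted_edges = [("pics", "funny", 166), ("news", "funny", 181)]
graph = to_node_link(weighted_edges, group_of={"news": 1})
print(json.dumps(graph, indent=2))
```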
Is there a link to this information?
Yes. This is generally how d3.js works: it needs the data in a specific format to visualize it. Examples: Edge Bundled, Chord Diagram, Matrix, Node-Link.
I will set up the Flask server tomorrow.
For the edge-bundled graph I have no idea how we can put our data in that format. Could you try to run a mock-up, maybe with some sample data that looks like ours?
I'm running a page rank type of thing and it's literally taken like half an hour so far.
--- Eigenvector Centrality ---
   subreddit      score
0  iama           321.911086
1  askreddit      311.323265
2  pics           263.799529
3  funny          258.084657
4  videos         253.007776
5  todayilearned  231.063511
6  worldnews      184.428278
7  gaming         182.708345
8  news           180.482415
9  gifs           165.655387
wow, that is impressive, could you explain a little bit more what those scores represent?
Eigenvector Centrality is an algorithm that measures the transitive influence or connectivity of nodes.
Relationships to high-scoring nodes contribute more to the score of a node than connections to low-scoring nodes. A high score means that a node is connected to other nodes that have high scores.
https://neo4j.com/docs/graph-algorithms/current/labs-algorithms/eigenvector-centrality/
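To make the idea concrete, here is a minimal power-iteration sketch of eigenvector centrality on a tiny undirected graph. It is not the Neo4j implementation linked above; the function name, the toy graph, and the unit-length normalization are assumptions, so the scale of the output will not necessarily match the scores posted in this thread.

```python
import math


def eigenvector_centrality(adjacency, iterations=100, tol=1e-9):
    """Power iteration: each node's new score is the sum of its neighbours'
    current scores, rescaled so the score vector has unit length."""
    nodes = sorted(adjacency)
    score = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        new = {node: 0.0 for node in nodes}
        for node, neighbours in adjacency.items():
            for neighbour in neighbours:
                new[neighbour] += score[node]
        norm = math.sqrt(sum(value * value for value in new.values())) or 1.0
        new = {node: value / norm for node, value in new.items()}
        converged = max(abs(new[node] - score[node]) for node in nodes) < tol
        score = new
        if converged:
            break
    return score


# Toy undirected graph: a triangle a-b-c with an extra node d hanging off c.
adjacency = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}
scores = eigenvector_centrality(adjacency)
for node, value in sorted(scores.items(), key=lambda item: -item[1]):
    print(node, round(value, 4))
```

In this toy graph c ranks highest because it is connected to every other node, and d ranks lowest because its only connection is c.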
I'm just learning about it now.
Cool!
Cool! Are the strings node names? What are the numbers?
Clustering
To visualize the data we should do some clustering or down sampling. We could use algorithms like K-means or t-SNE (a machine learning algorithm for dimensionality reduction). We don't need to implement them ourselves; I found some links to implementations that we can import into our project and use.
K-means: http://benalexkeen.com/k-means-clustering-in-python/
t-SNE: https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html
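For intuition only, here is a dependency-free sketch of what K-means does; in the project we would more likely import a library implementation such as the ones linked above. The function name, the naive "first k points" initialization, and the 2D sample points are assumptions for illustration.

```python
import math


def k_means(points, k, iterations=20):
    """Plain K-means: alternate between assigning each point to its nearest
    centroid and moving each centroid to the mean of its assigned points."""
    # Naive initialization (an assumption): use the first k points as centroids.
    centroids = [list(point) for point in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for every point.
        assignment = [
            min(range(k), key=lambda c: math.dist(point, centroids[c]))
            for point in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


# Made-up 2D points forming two obvious groups.
points = [(0.0, 0.0), (5.0, 5.0), (0.2, 0.1), (5.1, 4.9), (0.1, 0.3)]
assignment, centroids = k_means(points, k=2)
print(assignment)
print(centroids)
```

The resulting cluster index per node is the kind of value that could be used as the group property or as the cluster prefix in the formats described earlier in this issue.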
Down sampling
For down sampling we can use one of these methods or a combination of them:
I will update this issue to indicate what format I need the data in to visualize it.