Open sbenthall opened 10 years ago
So, I've split mentionball.data_to_network() into two functions. The first, data_to_network(), just builds the graph and returns it. The second, lookup_metadata(graph), which I call in main, goes through the edges (as the old version did) and uses lookup_many to get all their metadata, saving some of it (follower count, and now some geo data) in the graph itself.
Do we maybe want to switch that around, so that we generate the graph last, once we have some internal data structure containing only the data we want to store?
So: step one, get the list of users (snowball strategy); step two, collect and clean data for those users (geocoding in here too), storing it in a dictionary or whatever; step three, turn the dict into a graph and output it?
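To make the three steps concrete, here is a rough sketch of that ordering. Everything in it is hypothetical: the function names (snowball_users, collect_user_data, dict_to_graph), the fake in-memory data standing in for the Twitter lookups, and the plain nodes/edges output are assumptions for illustration, not what mentionball.py currently does.

```python
from collections import deque

# Hypothetical stand-ins for the API lookups (not real Twitter data).
FAKE_MENTIONS = {
    "alice": ["bob", "carol"],
    "bob": ["alice"],
    "carol": ["dave"],
    "dave": [],
}
FAKE_PROFILES = {
    "alice": {"followers_count": 10, "location": " Berkeley, CA "},
    "bob": {"followers_count": 5, "location": None},
    "carol": {"followers_count": 7, "location": "Oakland"},
    "dave": {"followers_count": 1, "location": ""},
}


def get_mentions(user):
    return FAKE_MENTIONS.get(user, [])


def get_profile(user):
    return FAKE_PROFILES.get(user, {})


def snowball_users(seed, max_depth):
    """Step one: breadth-first snowball out from a seed user."""
    seen = {seed}
    queue = deque([(seed, 0)])
    while queue:
        user, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand past the depth limit
        for other in get_mentions(user):
            if other not in seen:
                seen.add(other)
                queue.append((other, depth + 1))
    return seen


def collect_user_data(users):
    """Step two: collect and clean per-user data into a plain dict."""
    data = {}
    for user in sorted(users):
        profile = get_profile(user)
        data[user] = {
            "followers": profile.get("followers_count", 0),
            # Geocoding would slot in here; this sketch only trims whitespace.
            "location": (profile.get("location") or "").strip(),
            # Keep only mentions of users inside the snowball.
            "mentions": [m for m in get_mentions(user) if m in users],
        }
    return data


def dict_to_graph(data):
    """Step three: turn the dict into node attributes and directed edges."""
    nodes = {
        user: {"followers": rec["followers"], "location": rec["location"]}
        for user, rec in data.items()
    }
    edges = [(user, m) for user, rec in data.items() for m in rec["mentions"]]
    return nodes, edges


users = snowball_users("alice", max_depth=1)
data = collect_user_data(users)
nodes, edges = dict_to_graph(data)
```

The point of the ordering is that the dict from step two holds everything we decided to keep, so the graph in step three is just one view of it.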
It would be easier to talk about this if we were looking at the same code. Can you share a link to the changes you've been making?
https://github.com/bkfunk/poll.emic/blob/geo/bin/mentionball.py
Line 44ish is where stuff starts. It's a big mess right now, though!
Currently, when building out the mentionball, data collection, the snowball strategy, and the final graph output are all integrated.
Separating these into different stages will make it easier to save richer data about the users and their activity, and then to derive multiple alternate graph representations from it for visualization and analysis.
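As a small illustration of why the separation helps, the sketch below derives two different graph views from the same saved user records. The record layout, the function names, and the JSON round-trip standing in for a save-to-disk step are all assumptions made for the example, not part of the existing code.

```python
import json
from collections import Counter

# Hypothetical saved records: one entry per user, with raw mention activity.
records = {
    "alice": {"followers": 10, "mentions": ["bob", "bob", "carol"]},
    "bob": {"followers": 5, "mentions": ["alice"]},
    "carol": {"followers": 7, "mentions": []},
}

# Round-trip through JSON to stand in for saving the data between stages.
loaded = json.loads(json.dumps(records))


def weighted_mention_edges(data):
    """Directed edges weighted by how often the source mentions the target."""
    counts = Counter()
    for user, rec in data.items():
        for target in rec["mentions"]:
            counts[(user, target)] += 1
    return dict(counts)


def reciprocal_edges(data):
    """Undirected edges between pairs of users who mention each other."""
    directed = set(weighted_mention_edges(data))
    return {tuple(sorted(edge)) for edge in directed if (edge[1], edge[0]) in directed}


directed = weighted_mention_edges(loaded)
mutual = reciprocal_edges(loaded)
```

Both views come from the same stored records, so adding a third representation later would not require collecting the data again.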