ptwobrussell / Mining-the-Social-Web

The official online compendium for Mining the Social Web (O'Reilly, 2011)
http://bit.ly/135dHfs

Graphing 2nd degree friendships with redis & networkx #49

Closed CMcGill closed 11 years ago

CMcGill commented 11 years ago

I have gone through your book and found it to be a really awesome resource as I try to teach myself more about this fascinating subject. I also really appreciated the updates to make the code work with the Twitter 1.1 API, as that had been causing me some problems when I first worked through some of the examples.

Now that I have all the example scripts working, I have started making some of my own modifications as a way to learn more (I'm still new to programming and development in general). I have been playing around with the script "friends_followers__redis_to_networkx", and after reading a bit of the networkx documentation I figured out how to have it write the files as .gexf instead of pickling them, since I have also been playing around with Gephi and like to view the results in that app.

One thing I would still like to figure out, but am kind of stuck on, is how to modify the script to include the 2nd degree nodes in the graph file. I know the graphs could quickly become very large, but I'd still like to figure out how. I have tried modifying the for loops to iterate through all friends of friends for the ID in redis and add them to the graph as well, but I must be doing it wrong, since I always end up with the same output when viewing the graph file (i.e., I can see all the 1st degree friendships, as well as the edges that exist between those nodes, but I don't see any of the actual 2nd degree nodes themselves). I realize this is what the script is designed to do, but I wondered if you could give me any help making this tweak to it.

In any case, thanks for all your work on this awesome book, it has really been a great help getting started!

ptwobrussell commented 11 years ago

Hi - sounds like a fun exercise...

In reviewing the code, it looks like the friend_ids variable is limited to only the friend ids for the subject of interest, so no matter what you do, that's all it has to compute with. In other words, your graph can never have any nodes except for those nodes. Here's what you'd need to do to get started...

Make sense? If so, please close the issue and let me know how things go by leaving a comment once you are able to explore a bit. I'd definitely recommend starting with just a small data set for comparison, e.g., pick a person and get just 50 friend ids for them, then 50 more friend (of friend) ids for each of those friend ids.
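A minimal sketch of the idea, under stated assumptions: `get_friend_ids`, `second_degree_edges`, `write_gexf`, and the toy `SAMPLE` data are hypothetical names used for illustration, with a plain dict standing in for the Redis lookups the book's script performs. The point is only that each friend's own friend ids have to be fetched and added as edges, capped at a small limit such as 50.

```python
def second_degree_edges(subject_id, get_friend_ids, limit=50):
    """Return (source, target) edges for friends and friends-of-friends.

    get_friend_ids is a hypothetical callable mapping an id to an iterable
    of friend ids; in a real script it would wrap a Redis read.
    """
    edges = []
    first_degree = list(get_friend_ids(subject_id))[:limit]
    for friend_id in first_degree:
        edges.append((subject_id, friend_id))
        # The key change: fetch each friend's own friends instead of only
        # looking for edges among the subject's first-degree friends.
        for fof_id in list(get_friend_ids(friend_id))[:limit]:
            edges.append((friend_id, fof_id))
    return edges


def write_gexf(edges, path):
    """Build a directed graph from the edges and save it for Gephi."""
    import networkx as nx  # imported here so the edge logic runs without it

    g = nx.DiGraph()
    g.add_edges_from(edges)  # nodes are created implicitly from the edges
    nx.write_gexf(g, path)
    return g


# Toy data standing in for Redis (hypothetical ids).
SAMPLE = {
    "me": ["a", "b"],
    "a": ["b", "x"],
    "b": ["y"],
}

edges = second_degree_edges("me", lambda uid: SAMPLE.get(uid, []))
```

With the toy data, "x" and "y" are second-degree nodes that only show up because the inner loop looks up the friends of "a" and "b". Calling `write_gexf(edges, "graph.gexf")` would then produce a file that could be opened in Gephi, assuming networkx is installed.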

CMcGill commented 11 years ago

Thanks, that helps a lot! This will get me going in the right direction. I'll let you know how it goes!