dmarx closed this 7 years ago
On the topic of the modified bipartite projection... I'm pretty sure I have the user-subreddit activity data pulled down. If I can build a generalized routine for this modified bipartite projection operation, it could be a flexible way of working with this dataset. It would also be a generally useful tool to have.
Coded up a demo in a GitHub gist: https://gist.github.com/dmarx/8981ad329506b250498f1f52c3050a9c
Modified bipartite projection - The most common approach (also the approach taken by Randy) is to treat the social network as a bipartite graph, where the two node types are users and communities, and infer community relationships from the bipartite projection. My approach is a modified bipartite projection, where each undirected "subreddit <---> subreddit" projection edge is decomposed into two directed edges whose weights are rescaled by the count of unique users at the target node. This makes each edge weight directly interpretable as "the proportion of users who participate in the target node who also participated in the source node". Both directions share the same numerator (the count of shared users), so the edge pointing to the smaller community always carries the larger weight. Using the target node for weight rescaling therefore has the effect that if only one directed edge remains between two nodes (e.g. after dropping low-weight edges), it will always point to the smaller community. Edge direction is then suggestive of increased node specialization, i.e. following a directed path in the subreddit network will likely lead you down a trail of increasingly esoteric subreddits, e.g. sports -> nfl -> fantasyfootball -> findaleague
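For anyone who wants to poke at the idea without opening the gist, here is a minimal sketch of the rescaling described above. It is not the gist's code: the function name `directed_projection`, the `min_weight` cutoff, and the toy user/subreddit pairs are all assumptions made up for illustration, and the pairwise loop is quadratic in the number of communities, so it is only suitable for small examples.

```python
from collections import defaultdict
from itertools import combinations


def directed_projection(activity, min_weight=0.0):
    """Sketch of a directed, target-rescaled bipartite projection.

    activity: iterable of (user, community) pairs.
    Returns {(source, target): weight}, where weight is the share of
    target's unique users who are also active in source.
    min_weight is a hypothetical cutoff for dropping weak edges.
    """
    # Collect the set of unique users seen in each community.
    members = defaultdict(set)
    for user, community in activity:
        members[community].add(user)

    edges = {}
    for a, b in combinations(sorted(members), 2):
        shared = len(members[a] & members[b])
        if shared == 0:
            continue  # no projection edge between these communities
        # Split the undirected edge into two directed edges,
        # each rescaled by the user count of its target node.
        for source, target in ((a, b), (b, a)):
            weight = shared / len(members[target])
            if weight >= min_weight:
                edges[(source, target)] = weight
    return edges


# Toy data (invented): sports has 4 users, nfl 2, fantasyfootball 1.
activity = [
    ("u1", "sports"), ("u2", "sports"), ("u3", "sports"), ("u4", "sports"),
    ("u1", "nfl"), ("u2", "nfl"),
    ("u1", "fantasyfootball"),
]
all_edges = directed_projection(activity)
strong_edges = directed_projection(activity, min_weight=0.75)
```

In this toy data every nfl user is also a sports user, so `sports -> nfl` gets weight 1.0 while `nfl -> sports` gets 0.5. With the 0.75 cutoff only the edges pointing toward the smaller community remain, which matches the sports -> nfl -> fantasyfootball pattern.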
There was something else I wanted to add but I can't remember what it was anymore :(