Computational-Content-Analysis-2020 / Readings-Responses

Repository for organising "exemplary" readings, and posting responses.

Extracting Communication Networks - Fortunato 2010 #24

Open jamesallenevans opened 4 years ago

jamesallenevans commented 4 years ago

Post questions here for:

Fortunato, Santo. 2010. “Community Detection in Graphs.” Physics reports 486(3-5): 75-174.

katykoenig commented 4 years ago

Section 5 of this paper addresses divisive algorithms and specifically notes the importance of Girvan & Newman's algorithm, which removes the edges that are most likely to lie between communities (edge betweenness is defined as the number of shortest paths between pairs of nodes that run along the edge). The Girvan & Newman algorithm can lead to unbalanced partitions, and the paper notes that later research by Chen & Yuan proposes that only non-duplicate paths should be counted. The paper concludes that this yields better results than the standard edge betweenness measure used by the original algorithm. What does "better results" mean in this context? Are unbalanced partitions always a problem to be avoided, even when they are the most logical way to divide a population?
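
For reference, a minimal sketch of the divisive step described above, assuming networkx is installed and using Zachary's karate club purely as an illustrative graph (neither is taken from the paper). It uses the standard edge betweenness, not the Chen & Yuan variant, and prints the community sizes after the first split so that any imbalance is visible.

```python
import networkx as nx
from networkx.algorithms.community import girvan_newman

# Illustrative graph only; any undirected graph would do.
G = nx.karate_club_graph()

# Edge betweenness: fraction of shortest paths between node pairs that cross each edge.
edge_bc = nx.edge_betweenness_centrality(G)
top_edge = max(edge_bc, key=edge_bc.get)

# girvan_newman repeatedly removes the highest-betweenness edge and
# yields a new partition each time the number of components grows.
first_split = next(girvan_newman(G))
sizes = sorted(len(c) for c in first_split)

print("highest-betweenness edge:", top_edge)
print("community sizes after first split:", sizes)
```

Comparing `sizes` across successive splits is one rough way to see how balanced or unbalanced the resulting partitions are.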

sunying2018 commented 4 years ago

This article demonstrates many techniques used for detecting communities. I'm interested in graph partitioning in section 4. The Kernighan-Lin algorithm is quite fast given its time complexity, but it needs a constant number of swaps at each iteration. How can the number of swaps at each iteration be determined? The paper also mentions that the most expensive part of the algorithm is identifying the subsets to swap: it needs to compare all possible pairs, and it depends on a good "guess" about the sought partition. Considering this expensive step and the case-dependent results, why is this algorithm still frequently used, and what are its advantages compared with other techniques?
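
A small sketch of how the initial "guess" enters in practice, assuming networkx's `kernighan_lin_bisection` and the karate club graph as a stand-in example (both are my choices, not the paper's). The starting partition is an arbitrary split of the node list, and the cut size is compared before and after refinement.

```python
import networkx as nx
from networkx.algorithms.community import kernighan_lin_bisection

G = nx.karate_club_graph()
nodes = list(G)

# Arbitrary initial guess: first half of the node list versus the rest.
half = len(nodes) // 2
init_a, init_b = set(nodes[:half]), set(nodes[half:])
initial_cut = nx.cut_size(G, init_a, init_b, weight="weight")

# Kernighan-Lin swaps pairs of nodes across the two sides to reduce the cut.
# max_iter bounds the number of refinement sweeps; seed fixes the tie-breaking order.
part_a, part_b = kernighan_lin_bisection(
    G, partition=(init_a, init_b), max_iter=10, weight="weight", seed=0
)
final_cut = nx.cut_size(G, part_a, part_b, weight="weight")

print("cut size before:", initial_cut, "after:", final_cut)
print("side sizes:", len(part_a), len(part_b))
```

Because nodes are exchanged in pairs, the two sides keep the sizes of the initial guess, which is why the quality of that guess matters.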

tzkli commented 4 years ago

Hard to believe journals still accept papers of this length... But this one is incredibly informative. This paper mainly talks about unsigned networks. I was wondering what we should consider when choosing between signed and unsigned networks?

laurenjli commented 4 years ago

In section 6, the authors discuss the Clique Percolation Method. One of their main critiques of the method is that a large fraction of nodes are excluded from the final communities determined by CPM and that post-processing them into communities at the end is outside of the framework of the method. Given the other methods they discuss for overlapping communities, which would help with this situation?
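
To make the critique concrete, here is a hedged sketch using networkx's `k_clique_communities`, assuming the karate club graph and `k = 4` as illustrative choices. It counts the nodes that clique percolation leaves outside every community and the nodes that belong to more than one.

```python
import networkx as nx
from networkx.algorithms.community import k_clique_communities

G = nx.karate_club_graph()
k = 4  # clique size; an illustrative choice, not a recommendation

# Each community is a union of adjacent k-cliques (cliques sharing k-1 nodes).
communities = [set(c) for c in k_clique_communities(G, k)]

covered = set().union(*communities) if communities else set()
uncovered = set(G) - covered
overlapping = {n for n in covered if sum(n in c for c in communities) > 1}

print("communities found:", len(communities))
print("nodes in no community:", len(uncovered), "of", G.number_of_nodes())
print("nodes in more than one community:", len(overlapping))
```

Any node that is not part of a `k`-clique ends up in `uncovered`, which is the exclusion problem the comment describes.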

di-Tong commented 4 years ago

Very informative piece! With regard to chapter 13, while the analysis of dynamic communities was in its infancy a decade ago, I wonder if any new method has been developed in the last decade to examine how real groups form and evolve in time.

deblnia commented 4 years ago

Are the methods described in this piece using entropy in the conventional, statistical physics way, or in some new way? My understanding was that network scientists dealt with entropy through partial measures (like Kolmogorov complexity) but the spin model described in this paper seems to deal with it directly (through an entropy term). Am I just misunderstanding the usage?

skanthan95 commented 4 years ago

Can the communication network analyses described in this reading be used to map out clinical disorders like depression (where nodes would be symptoms or affect states), and we would study how the symptoms affect and interact with each other?

bjcliang-uchi commented 4 years ago

In practice, for regression, we can calculate p-values for statistical significance. But in network analysis, it seems that unless we derive the features such as betweenness centrality and run it with some other variables, we cannot have any p-value--a cutoff which says whether certain patterns are important or not. I am therefore wondering how--especially for peer-reviewed papers--network analysis is used in combination with other methods.
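
One common way to attach something like a p-value to a network pattern is to compare it against a null model. The sketch below is only an illustration under my own assumptions (networkx, the karate club graph, greedy modularity maximisation, 50 degree-preserving rewirings); it is not a procedure taken from the paper.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities, modularity

def best_modularity(graph):
    """Modularity of the partition found by greedy modularity maximisation."""
    parts = greedy_modularity_communities(graph)
    return modularity(graph, parts)

# Unweighted copy of the illustrative graph, so every step ignores edge weights.
G = nx.Graph(list(nx.karate_club_graph().edges()))
observed = best_modularity(G)

# Null model: rewire edges while keeping every node's degree fixed.
n_null = 50
null_scores = []
for i in range(n_null):
    H = G.copy()
    swaps = 4 * H.number_of_edges()
    nx.double_edge_swap(H, nswap=swaps, max_tries=100 * swaps, seed=i)
    null_scores.append(best_modularity(H))

# Empirical p-value: share of rewired graphs at least as modular as the observed one.
p_value = (1 + sum(s >= observed for s in null_scores)) / (1 + n_null)
print("observed modularity:", round(observed, 3), "empirical p-value:", round(p_value, 3))
```

The same pattern works for other statistics (for example a centrality score) by swapping out `best_modularity`.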

heathercchen commented 4 years ago

This article is more of a short-length book than a paper! I am wondering how community analysis can be applied in the field of biology? Does it mean we can "differentiate" and "decentralize" tissues or cells in a way that we can use communities to understand how they interact with one another?

wunicoleshuhui commented 4 years ago

This is quite a lengthy article with an abundance of information. However, the article puts more emphasis on the theoretical modeling of different approaches, and it was published in 2009. What should empirical data look like now when we collect data to map communities, beyond the usual demographic surveys and email correspondence? How do we use digital data to map community networks similarly to or differently from the approaches described in this article?

luxin-tian commented 4 years ago

I recently saw a Monte Carlo simulation of the transmission of 2019-nCoV, and it got me thinking about community detection based on network analysis. Most of the social science literature I have read that uses network analysis relies on static, cross-sectional, or mixed panel data. Even when we have network flow, it only captures the flow during a fixed time span. However, the interaction between nodes can be a real-time dynamic or stochastic process. Is there any method that can capture the dynamic characteristics of the social games underlying the network?

gracefulghost31 commented 4 years ago

If time permits, I'd like to learn more about modularity optimization's solution limit and the resolution problem as it pertains to social science. Also, given that the article was written in 2010, I'd also be interested in gaining updated perspectives about a posterior of community.
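
A small worked example of the resolution problem, assuming networkx and a hand-built ring of 30 five-node cliques (the graph construction and variable names are mine). Each clique is an obvious community, yet the partition that merges neighbouring cliques in pairs scores a higher modularity, roughly 0.888 against 0.876.

```python
from itertools import combinations

import networkx as nx
from networkx.algorithms.community import modularity

n_cliques, clique_size = 30, 5

# Build a ring of cliques: each clique is fully connected internally
# and joined to the next clique in the ring by a single edge.
G = nx.Graph()
for i in range(n_cliques):
    members = range(i * clique_size, (i + 1) * clique_size)
    G.add_edges_from(combinations(members, 2))
    nxt = (i + 1) % n_cliques
    G.add_edge(i * clique_size, nxt * clique_size + 1)

# Partition 1: every clique is its own community.
single = [set(range(i * clique_size, (i + 1) * clique_size)) for i in range(n_cliques)]
# Partition 2: adjacent cliques merged two by two.
pairs = [single[i] | single[i + 1] for i in range(0, n_cliques, 2)]

q_single = modularity(G, single)
q_pairs = modularity(G, pairs)
print("one community per clique:", round(q_single, 4))
print("cliques merged in pairs: ", round(q_pairs, 4))
```

A method that simply maximises modularity would therefore prefer the merged partition here, even though the individual cliques are the more natural communities.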

lkcao commented 4 years ago

This interesting piece introduces us to some of the most common techniques in community detection, as well as some application scenarios. I am curious whether there is any correlation between these methods and specific analytical scenarios: do some methods perform better in some disciplines? If so, why? Should we avoid some methods when analyzing a specific kind of network?

ccsuehara commented 4 years ago

As in @di-Tong's question, I am also interested in the detection of dynamic communities. My main motivation is that all the papers for this week (orienting and exemplary) involve time, or evolution, as an important aspect, especially the Fringe Effect, which seems to use this methodology.

ziwnchen commented 4 years ago

This paper systematically introduces us to the field of community detection. However, based on my past experience, existing Python packages/resources for community detection are quite limited. For example, networkx only provides several fairly simple community detection methods. So my question is, where can we find Python/R/other packages that allow us to do community detection on networks?
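
As a starting point, here is a minimal sketch of two of the methods that networkx does ship, run side by side on the karate club graph (my choice of example). It only shows the calling pattern; which methods are adequate for a given project is a separate question.

```python
import networkx as nx
from networkx.algorithms import community as nx_comm

G = nx.karate_club_graph()

# Two detectors available in networkx; both return an iterable of node sets.
methods = {
    "greedy modularity": nx_comm.greedy_modularity_communities,
    "label propagation": nx_comm.label_propagation_communities,
}

results = {}
for name, detect in methods.items():
    parts = [set(c) for c in detect(G)]
    # weight=None keeps the modularity score unweighted for both methods.
    results[name] = (len(parts), nx_comm.modularity(G, parts, weight=None))

for name, (n_parts, q) in results.items():
    print(f"{name}: {n_parts} communities, modularity {q:.3f}")
```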

VivianQian19 commented 4 years ago

Fortunato’s article on community detection gives a very detailed description of clustering techniques used in community detection. I’m wondering how to choose from these clustering techniques in different situations and what are the advantages and disadvantages of using them?

alakira commented 4 years ago

I am particularly interested in analyzing social networks with a large number of edges. On Twitter, for example, there are so many following/follower edges that they are hard to disentangle once plotted in a single space. It is possible to change the color or size of each node according to a centrality measure or a community-based measure, but this is not feasible if there are too many communities. Practically, what is the best procedure for finding an appropriate measure to illustrate the characteristics of a given network?

sanittawan commented 4 years ago

This piece serves as a great reference to existing methods for community detection, but I do wonder how researchers go about choosing the quantitative definition of community given that there is really no universal definition. Is it more of an iterative process where one starts with a definition, sees what the result is, and revises it as the research progresses? What are the popular quantitative definitions that are used in sociological research?

kdaej commented 4 years ago

In reality, membership of communities can be more complicated and ambiguous than those described in this article. If there are multiple memberships at the same time or different memberships in various situations, how can we model these?

cytwill commented 4 years ago

This paper is quite useful for my project, and I would like to dig into it further later. First, I have a question similar to @ziwnchen's: are there any high-level or integrated community detection tools currently available in Python? Also, I feel that community detection is somewhat like another type of clustering. The difference is that it puts more emphasis on the edges/connectivity between vertices, whereas general clustering pays more attention to the features of the observations (vertices). So I am wondering whether it is helpful to add vertex features to the detection of communities?