Lab41 / Circulo

Community Detection Research Effort
http://lab41.github.io/Circulo/

Request for explanation and clarification of some evaluation metrics #58

Closed fa98 closed 9 years ago

fa98 commented 9 years ago

Hi, I was wondering if you could help me understand a concept. Conductance, internal density, expansion, cut ratio, and normalized cut are metrics that don't need ground truth, so after using, for example, the edge betweenness and walktrap algorithms to find communities, we could use these metrics to compare the quality of the communities each algorithm finds. But I am a little confused. Suppose I use these functions as evaluation metrics for the whole network, that is, I calculate the mean of f(S) [6] over all the communities found by a community detection algorithm, and then compare the mean output of the two algorithms for each metric. Could I conclude that the algorithm with the lower mean captures better communities? ("We consider the following metrics f(S) that capture the notion of a quality of the cluster. Lower value of score f(S) (when |S| is kept constant)" [6]) Here, though, |S| is not constant: each community has its own size, and the measure is intended for the whole network. So is my understanding right or wrong?

[6]: reference 6 of the survey: Empirical comparison of algorithms for network community detection

Thank you so much

Lab41PaulM commented 9 years ago

This is a good question that should perhaps be posed to the authors of the paper you reference. The goal of the Circulo framework is merely to provide those who ask such questions with an efficient working environment for arriving at answers. I think it would be worthwhile to conduct an experiment that determines which algorithms tend to lead to higher values of specific metrics. For example, if algorithm A is optimized for density and my notion of a good community is density, then I should use algorithm A.
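
One way to set up such an experiment is sketched below in plain Python. It is only an illustration, not Circulo's implementation: the function names (`community_scores`, `summarize`), the toy graph, and the two partitions are all hypothetical, and the formulas are the forms in which these scores are commonly stated (lower meaning more community-like), so they should be checked against the referenced paper before being relied on. Because community sizes differ, the sketch reports both an unweighted mean and a size-weighted mean per metric, so that the effect of |S| not being constant is visible rather than hidden.

```python
from statistics import mean


def _ratio(num, den):
    # Convention chosen for this sketch: a zero denominator yields 0.
    return num / den if den else 0.0


def community_scores(edges, community, n_nodes):
    """Score one community S of an undirected graph given as an edge list."""
    s = set(community)
    n_s = len(s)                                             # nodes in S
    m = len(edges)                                           # edges in the graph
    m_s = sum(1 for u, v in edges if u in s and v in s)      # edges inside S
    c_s = sum(1 for u, v in edges if (u in s) != (v in s))   # edges leaving S
    return {
        "conductance": _ratio(c_s, 2 * m_s + c_s),
        "expansion": _ratio(c_s, n_s),
        # Written as 1 - density so that lower is better, like the others.
        "internal_density": 1.0 - _ratio(m_s, n_s * (n_s - 1) / 2),
        "cut_ratio": _ratio(c_s, n_s * (n_nodes - n_s)),
        # Assumption: second term uses 2(m - m_S) + c_S in the denominator.
        "normalized_cut": _ratio(c_s, 2 * m_s + c_s)
        + _ratio(c_s, 2 * (m - m_s) + c_s),
    }


def summarize(edges, partition, n_nodes):
    """Return {metric: (unweighted mean, size-weighted mean)} for a partition."""
    per_comm = [community_scores(edges, c, n_nodes) for c in partition]
    sizes = [len(c) for c in partition]
    summary = {}
    for name in per_comm[0]:
        values = [scores[name] for scores in per_comm]
        weighted = sum(v * k for v, k in zip(values, sizes)) / sum(sizes)
        summary[name] = (mean(values), weighted)
    return summary


# Toy graph: two triangles joined by the single edge (2, 3).
EDGES = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
N_NODES = 6

# Stand-ins for the output of two different detection algorithms.
PARTITION_A = [[0, 1, 2], [3, 4, 5]]
PARTITION_B = [[0, 1], [2, 3, 4, 5]]

SUMMARY_A = summarize(EDGES, PARTITION_A, N_NODES)
SUMMARY_B = summarize(EDGES, PARTITION_B, N_NODES)

for metric in SUMMARY_A:
    print(metric, "A:", SUMMARY_A[metric], "B:", SUMMARY_B[metric])
```

On this toy graph, partition A has a mean conductance of about 0.143 under both averages, while partition B gives 0.35 unweighted and 0.30 size-weighted. The two averages agree on the ranking here, but they need not on real networks, which is one reason a lower mean alone may not settle which algorithm found better communities.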