neo4j-contrib / neo4j-graph-algorithms

Efficient Graph Algorithms for Neo4j
https://github.com/neo4j/graph-data-science/
GNU General Public License v3.0
767 stars 196 forks source link

Triangle Count coefficient that are returned are incorrect. #903

Open tomzeppenfeldt opened 5 years ago

tomzeppenfeldt commented 5 years ago

I have the impression that

CALL algo.triangleCount.stream('Person', 'KNOWS', {concurrency:4})
YIELD nodeId, triangles, coefficient

RETURN algo.asNode(nodeId).id AS name, triangles, coefficient
ORDER BY coefficient DESC

as illustrated on https://neo4j.com/docs/graph-algorithms/current/algorithms/triangle-counting-clustering-coefficient/ (table 6.25) returns the wrong results:

from https://en.wikipedia.org/wiki/Clustering_coefficient : The local clustering coefficient {\displaystyle C{i}} C{i} for a vertex {\displaystyle v{i}} v{i} is then given by the proportion of links between the vertices within its neighbourhood divided by the number of links that could possibly exist between them. For a directed graph, {\displaystyle e{ij}} e{ij} is distinct from {\displaystyle e{ji}} e{{ji}}, and therefore for each neighbourhood {\displaystyle N{i}} N{i} there are {\displaystyle k{i}(k{i}-1)} k{i}(k{i}-1) links that could exist among the vertices within the neighbourhood ( {\displaystyle k{i}} k{i} is the number of neighbours of a vertex).

For node “Michael”, 3 out of 6 possible edges connecting all vertices in their neighbourhood actually exist, so the coefficient should be 0,5 instead of 0,3