neo4j / graph-data-science

Source code for the Neo4j Graph Data Science library of graph algorithms.
https://neo4j.com/docs/graph-data-science/current/

Odd Result on Louvain Community Detection #172

Closed andyhegedus closed 2 years ago

andyhegedus commented 2 years ago

Hi,

I'm working on a graph and have a subgraph on which I ran Louvain community detection. The code to create the virtual graph is:

    CALL gds.graph.create.cypher(
      "BASE_CPC",
      "MATCH (n:patent) WHERE exists {(n:patent)-[:Classified_as]->(a:cpc) where a.subgroup in ['H01L21/0228','H01L21/0229','H01L21/28194','H01L21/3141','H01L45/1616']} RETURN id(n) AS id, labels(n) AS labels",
      "MATCH (a:patent)-[:Cites]->(b:patent) return id(a) AS source, id(b) AS target",
      {validateRelationships: FALSE}
    );

The Louvain run is done with this:

    CALL gds.louvain.write('BASE_CPC', {
      writeProperty: 'CID',
      consecutiveIds: true,
      minCommunitySize: 10
    })
    YIELD communityCount, modularity, modularities;

The important thing to note is that I am using the [:Cites] relationship to define the relationships between the nodes. Looking at the results in Bloom, I am filtering by CID to validate the clusters.

In general they make sense. However, looking at this one:

image

I notice that the community includes one node that does NOT have a [:Cites] relationship with the bulk of the nodes. I validated this by selecting all nodes and asking Bloom to reveal relationships. The coloring on the relationships represents a property key and can be ignored.
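
The same check could also be run as a query instead of visually in Bloom. This is only a minimal sketch, assuming the Louvain run above wrote CID onto the patent nodes and that no stale CID values remain from earlier runs. It lists every patent that has a CID but no [:Cites] relationship, in either direction, to another patent with the same CID:

```cypher
// Sketch under stated assumptions: CID was written by the Louvain run above,
// and only nodes from that run carry a CID value.
MATCH (n:patent)
WHERE n.CID IS NOT NULL
  AND NOT EXISTS {
    // any Cites relationship, in either direction, to a member of the same community
    MATCH (n)-[:Cites]-(m:patent)
    WHERE m.CID = n.CID
  }
RETURN n.CID AS community, id(n) AS nodeId
ORDER BY community, nodeId;
```

An empty result would mean every written node cites, or is cited by, at least one other member of its community. Any rows returned are nodes like the one in the screenshot.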

Is it possible to have clusters where individual nodes are not connected? It seems to defeat the purpose.

Andy

vnickolov commented 2 years ago

@andyhegedus thank you for raising this issue.

Can you please provide some additional information

Thank you in advance.

knutwalker commented 2 years ago

@andyhegedus we're gonna close this issue since we cannot reproduce the problem. If you have more information on the questions that @vnickolov shared above, feel free to re-open the issue.