neo4j-devtools / neo4j-bloom

A public repository for informal docs, problem reporting and content sharing related to Neo4j Bloom.
Apache License 2.0

Louvain Results Different in Bloom and Python Client #53

Closed Halkenhaeusser closed 5 months ago

Halkenhaeusser commented 2 years ago

Hi,

I have generated a graph that contains one node type and one relationship type. The nodes all have a property called "cluster". Using the Louvain algorithm, I want to show that the clusters can be identified.

Using Bloom, the nodes are correctly sorted into the 5 communities corresponding to each node's "cluster" property. Using the Python client on the same graph, 10 communities are found. When I set the seedProperty of the Python call to the cluster property, 5 communities are found.

Why are my results with default options different between Bloom and Python?

Thank you

Python Code

G, _ = gds.graph.project(
    'make_hist',
    {'meta': {'properties': 'cluster'}},
    ['has_sorensen']
)

res = gds.louvain.mutate(G, mutateProperty='community_default')

res['communityCount']
 10

res
mutateMillis                                                             0
nodePropertiesWritten                                                   50
modularity                                                        0.683894
modularities             [0.48768643773846687, 0.6816973060469418, 0.68...
ranLevels                                                                3
communityCount                                                          10
communityDistribution    {'p99': 19, 'min': 1, 'max': 19, 'mean': 5.0, ...
postProcessingMillis                                                     2
preProcessingMillis                                                      0
computeMillis                                                           59
configuration            {'maxIterations': 10, 'seedProperty': None, 'c...
Name: 0, dtype: object
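
The truncated communityDistribution above reports a minimum community size of 1, so at least one of the 10 communities is a single node. A minimal sketch for inspecting this is shown below, assuming the per-node community ids and cluster values have already been pulled into plain Python lists (in recent client versions this might be done with `gds.graph.nodeProperty.stream`, but that step is an assumption and is not shown). The function names and the sample data are hypothetical and are not taken from this issue.

```python
from collections import Counter, defaultdict


def summarize_communities(community_ids):
    """Return the community count, sizes (largest first) and singleton community ids."""
    sizes = Counter(community_ids)
    return {
        "communityCount": len(sizes),
        "sizes": sorted(sizes.values(), reverse=True),
        "singletons": sorted(c for c, n in sizes.items() if n == 1),
    }


def split_clusters(clusters, community_ids):
    """Return the clusters whose nodes ended up in more than one community."""
    mapping = defaultdict(set)
    for cluster, community in zip(clusters, community_ids):
        mapping[cluster].add(community)
    return {cl: sorted(cos) for cl, cos in mapping.items() if len(cos) > 1}


# Hypothetical data: one entry per node (NOT the graph from this issue).
example_clusters = ["a", "a", "a", "b", "b", "b", "b"]
example_communities = [0, 0, 0, 1, 1, 2, 3]

summary = summarize_communities(example_communities)
split = split_clusters(example_clusters, example_communities)
print(summary)
print(split)
```

If the extra communities turn out to be very small, that would point toward a handful of weakly connected or isolated nodes rather than a different partition of the main clusters.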

Bloom with nodes colored by Louvain community

[image: Bloom scene screenshot]

yirensum commented 2 years ago

Hi @Halkenhaeusser, apologies for the really late reply!

I'm just looking into this, and this is my first guess at what could be the problem.

When Bloom runs GDS, it essentially only evaluates an algorithm on all the nodes and rels which are currently on the scene.

From your Python output above, the number of nodes for which Louvain is evaluated is 50 (nodePropertiesWritten 50). However, there are only 44 nodes currently on the scene in Bloom.

Could you try loading all the nodes into the scene in Bloom, and running it again?
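
To illustrate why the node set matters: unseeded Louvain only ever moves a node into the community of one of its neighbours, so the number of communities can never be lower than the number of connected components. If some of the nodes missing from the scene are isolated or sit in small separate components, each of them adds at least one community. Whether that is what happens here is only a guess. The sketch below counts components with a small union-find; the function name and the toy graph are hypothetical and are not the data from this issue.

```python
def count_components(node_ids, edges):
    """Count connected components among node_ids, ignoring edges to other nodes."""
    parent = {n: n for n in node_ids}

    def find(n):
        # Walk up to the root, shortening the path along the way.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in edges:
        # Skip relationships that touch a node outside the evaluated set.
        if a in parent and b in parent:
            parent[find(a)] = find(b)

    return len({find(n) for n in node_ids})


# Hypothetical toy graph: node 6 has no relationships.
all_nodes = [1, 2, 3, 4, 5, 6]
edges = [(1, 2), (2, 3), (4, 5)]
scene_nodes = [1, 2, 3, 4, 5]  # node 6 is not on the scene

components_all = count_components(all_nodes, edges)
components_scene = count_components(scene_nodes, edges)
print(components_all, components_scene)
```

In this toy case the full node set has one more component than the scene, so an unseeded run on the full set must report at least one more community.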

Thanks!

ckanz commented 5 months ago

I will be closing this issue since it's been some time since the last reply. If you have any issues with Bloom, please feel free to raise a new issue here or on Canny https://neo4j-aura.canny.io/explore