neo4j-contrib / neo4j-graph-algorithms

Efficient Graph Algorithms for Neo4j
https://github.com/neo4j/graph-data-science/
GNU General Public License v3.0

How can I use algo to run community detection on a subgraph selected by a Cypher MATCH? #583

Open heyichang opened 6 years ago

heyichang commented 6 years ago

CALL algo.louvain( 'MATCH (n:Account) RETURN id(n) as id', 'match (n1:Account{nodeID:"BDB8106C5276AE4AB67EF8FE9F7E4282"})-[r1:follow]->(n2:Account)-[r2:follow]->(n3:Account) RETURN id(n1) as source, id(n3) as target, count(*) as weight', {graph:'cypher', iterations:5, write: true});

But this does not work:

loadMillis computeMillis writeMillis nodes iterations communityCount
171 125 85 29889 0 29889

I have 29889 nodes, and the community count is also 29889. The algorithm did not select the subgraph and run on it.

heyichang commented 6 years ago

Then I changed my Cypher:

CALL algo.louvain(
'MATCH (n:Account{nodeID:"BDB8106C5276AE4AB67EF8FE9F7E4282"}) RETURN id(n) as id union match (n1:Account{nodeID:"BDB8106C5276AE4AB67EF8FE9F7E4282"})-[r1:follow]->(n2:Account) return id(n2) as id union match (n1:Account{nodeID:"BDB8106C5276AE4AB67EF8FE9F7E4282"})-[r1:follow]-(n2:Account)-[r2:follow]-(n3:Account) return id(n3) as id ',
'match (n1:Account)-[r1:follow]->(n2:Account)-[r2:follow]->(n3:Account)
RETURN id(n1) as source, id(n3) as target',
{graph:'cypher', iterations:100, write: true});

This does not give a good result either.

heyichang commented 6 years ago

The result:

loadMillis computeMillis writeMillis nodes iterations communityCount
2748 14020 257 19705 3 19698

mknblch commented 6 years ago

Hi. In my tests, whenever the community count is equal or almost equal to the node count, something is wrong with the relationship import, either because of a wrong query or because of a bug. The algorithm assigns each node its own node ID as its community ID at startup, and if important relationships are missing it never touches those nodes again. That, in turn, results in a wrong community count (and a short computation time).
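
One way to check the relationship import is to run the relationship statement by itself and compare what it covers with what the node statement loads. The sketch below uses the statements from the first call; the column aliases are made up for illustration, and running both statements in one go assumes a client that accepts multiple statements (for example cypher-shell).

```cypher
// Rows returned by the relationship statement of the first call,
// and the number of distinct target nodes those rows reach.
MATCH (n1:Account {nodeID: "BDB8106C5276AE4AB67EF8FE9F7E4282"})-[:follow]->(:Account)-[:follow]->(n3:Account)
RETURN count(*) AS relationshipRows, count(DISTINCT n3) AS distinctTargets;

// Number of nodes loaded by the node statement of the first call.
MATCH (n:Account)
RETURN count(n) AS loadedNodes;
```

If `relationshipRows` is 0, or `distinctTargets` is much smaller than `loadedNodes`, most loaded nodes have no relationships in the projected graph and would keep their initial community, which would match a community count close to the node count.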

At first glance, your relationship query seems too specific. Could you create a test graph that reproduces the problem? You could also try relabeling your subgraph and calling the algorithm with the new label and relationship type.
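
The relabeling approach could look roughly like the sketch below. The label `SubAccount`, the relationship type `SUB_FOLLOW`, and the `community` property name are hypothetical names chosen for this example. The two-hop `follow` pattern and the `nodeID` value come from the queries above. Whether copying the direct `follow` relationships is the right projection depends on what the communities should represent, so treat this as a starting point rather than a fix.

```cypher
// Step 1: tag the start account and every account reachable within
// two outgoing `follow` hops. `SubAccount` is a hypothetical label.
MATCH (n1:Account {nodeID: "BDB8106C5276AE4AB67EF8FE9F7E4282"})-[:follow*0..2]->(n:Account)
SET n:SubAccount;

// Step 2: copy the `follow` relationships that stay inside the tagged
// subgraph. `SUB_FOLLOW` is a hypothetical relationship type.
MATCH (a:SubAccount)-[:follow]->(b:SubAccount)
MERGE (a)-[:SUB_FOLLOW]->(b);

// Step 3: run Louvain on the new label and relationship type
// instead of using a Cypher projection.
CALL algo.louvain('SubAccount', 'SUB_FOLLOW', {write: true, writeProperty: 'community'})
YIELD nodes, communityCount, iterations
RETURN nodes, communityCount, iterations;
```

With this setup, `nodes` should equal the number of tagged accounts, which makes it easier to see whether the algorithm ran on the intended subgraph.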