gusye1234 / nano-graphrag

A simple, easy-to-hack GraphRAG implementation
MIT License
1.66k stars, 160 forks

Neo4j storing takes a huge amount of time! #81

Open rongzunzzz opened 1 month ago

rongzunzzz commented 1 month ago

Has anyone else run into this when simply following the guide in docs/use_neo4j_for_graphrag.md? I am using the local Neo4j Desktop app. My terminal prints a lot of messages about querying API keys and reports having processed A LOT MORE communities than usual. I wonder why.

gusye1234 commented 1 month ago

Yeah, I encountered the same issue. nano-graphrag uses the GDS (Graph Data Science) extension of Neo4j to compute the communities, and it seems to produce a lot more communities than networkx and to be unstable even with a fixed random seed.
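To compare the two backends, it may help to count the distinct communities at each hierarchy level. The snippet below is only a minimal sketch: the function name `count_communities_per_level` and the toy assignments are hypothetical, and it assumes you have already exported, for each node, a list of community ids with one entry per level. It does not call nano-graphrag, networkx, or Neo4j.

```python
from collections import defaultdict


def count_communities_per_level(assignments):
    """Count distinct community ids at each hierarchy level.

    `assignments` maps a node id to a list of community ids,
    one per level (hypothetical export format, not a library API).
    """
    per_level = defaultdict(set)
    for levels in assignments.values():
        for level, community_id in enumerate(levels):
            per_level[level].add(community_id)
    return {level: len(ids) for level, ids in sorted(per_level.items())}


# Toy data, made up for illustration only.
networkx_like = {"a": [0, 0], "b": [0, 0], "c": [1, 0], "d": [1, 0]}
neo4j_like = {"a": [0, 0], "b": [1, 0], "c": [2, 1], "d": [3, 1]}

nx_counts = count_communities_per_level(networkx_like)
gds_counts = count_communities_per_level(neo4j_like)

# Print per-level counts and the total for each backend.
print("networkx-like:", nx_counts, "total:", sum(nx_counts.values()))
print("neo4j-like:", gds_counts, "total:", sum(gds_counts.values()))
```

If the total is much larger on the Neo4j side, and each community is processed separately (as the "Processed ... Communities" log lines suggest), that alone could explain the longer run time.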

rongzunzzz commented 1 month ago

Thanks for replying. I just wonder why this causes it to produce more communities; I thought the number of communities would be the same, or approximately the same, for the same dataset. For instance, my dataset normally yields 90 communities, but with Neo4j connected the count reached 1500 and was still climbing. That's weird.

On the other hand, I checked my local Neo4j database and visualized the graph, and I am sure that all the nodes and links were constructed. I terminated the process, though, because I could not see it ever finishing. What about you? Did your run finish?