getzep / graphiti

Build and query dynamic, temporally-aware Knowledge Graphs
https://help.getzep.com/graphiti
Apache License 2.0
1.39k stars, 75 forks

Add mmr reranking #180

Closed: prasmussen15 closed this 1 month ago

prasmussen15 commented 1 month ago

[!IMPORTANT] Add MMR reranking to search, normalize embeddings, and update community building with group ID support.

  • Search Enhancements:
    • Add MMR reranking method to search.py for edges, nodes, and communities.
    • Update search_config.py and search_config_recipes.py to include MMR reranking.
    • Add maximal_marginal_relevance() function to search_utils.py.
  • Embedding Normalization:
    • Use normalize_l2() in openai.py and voyage.py to normalize embeddings.
  • Community Building:
    • Update build_communities() in graphiti.py and community_operations.py to accept group_ids.
  • Testing:
    • Update test_graphiti_int.py to test new search configurations and community building.
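
For readers unfamiliar with the technique, the L2 normalization and MMR reranking listed above can be sketched roughly as follows. This is an illustration only, not the code from this PR: the names `unit_rows` and `mmr_order`, the `lambda_` parameter, and the NumPy-based approach are all assumptions, and the actual signatures of `normalize_l2()` and `maximal_marginal_relevance()` are not shown in this thread.

```python
import numpy as np


def unit_rows(vectors):
    """Hypothetical helper: scale each vector to unit L2 norm (zero vectors are left as-is)."""
    vectors = np.asarray(vectors, dtype=float)
    norms = np.linalg.norm(vectors, axis=-1, keepdims=True)
    return vectors / np.where(norms == 0, 1.0, norms)


def mmr_order(query, candidates, lambda_=0.5):
    """Hypothetical sketch: return candidate indices in greedy MMR order.

    Score of candidate i given the already selected set S:
        lambda_ * sim(query, i) - (1 - lambda_) * max over j in S of sim(i, j)
    Assumes `candidates` is a non-empty 2-D array, one embedding per row.
    """
    q = unit_rows(query)
    c = unit_rows(candidates)
    relevance = c @ q      # on unit vectors the dot product equals cosine similarity
    pairwise = c @ c.T     # candidate-to-candidate similarity
    remaining = list(range(len(c)))
    selected = []
    while remaining:
        def score(i):
            # Redundancy is the similarity to the closest already-selected item.
            redundancy = max((pairwise[i, j] for j in selected), default=0.0)
            return lambda_ * relevance[i] - (1 - lambda_) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: two near-duplicate candidates (10 and 12 degrees) and one distinct one (-40 degrees).
angles = np.radians([10.0, 12.0, -40.0])
candidates = np.stack([np.cos(angles), np.sin(angles)], axis=1)
query = np.array([1.0, 0.0])
print(mmr_order(query, candidates, lambda_=0.5))  # [0, 2, 1]: the near-duplicate is ranked last
```

With `lambda_=1.0` the sketch reduces to plain relevance ordering; lower values penalize candidates that closely resemble results already selected.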

This description was created by Ellipsis for 6f69a613efee88e3e2fc379933c2958ac88d8cf8. It will automatically update as commits are pushed.

danielchalef commented 1 month ago

@prasmussen15 it looks like we're normalizing vectors. Do we need to run a migration on existing installs? Would it now make sense to use dot product rather than cosine as the distance metric?

prasmussen15 commented 1 month ago

> @prasmussen15 it looks like we're normalizing vectors. Do we need to run a migration on existing installs? Would it now make sense to use dot product rather than cosine as the distance metric?

Neo4j only offers cosine similarity and not dot product, so a migration on existing installs isn't necessary for search. We could run a migration for MMR, since I use dot product there.
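
The reasoning here can be checked with a small NumPy sketch: cosine similarity ignores vector length, so embeddings stored before normalization score the same as their normalized versions, while a raw dot product matches cosine only on unit-length vectors. The vectors and variable names below are made up for illustration and are not taken from the PR.

```python
import numpy as np


def cosine(a, b):
    """Cosine similarity: dot product divided by the product of the L2 norms."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Hypothetical "legacy" embeddings stored before normalization (length 5 each).
a = np.array([3.0, 4.0])
b = np.array([4.0, 3.0])

# The same embeddings after L2 normalization.
a_unit = a / np.linalg.norm(a)
b_unit = b / np.linalg.norm(b)

legacy_cos = cosine(a, b)            # cosine on unnormalized vectors
new_cos = cosine(a_unit, b_unit)     # cosine on normalized vectors: same value
legacy_dot = float(a @ b)            # dot product on unnormalized vectors: scaled by both lengths
new_dot = float(a_unit @ b_unit)     # dot product on unit vectors: equals the cosine

print(legacy_cos, new_cos, legacy_dot, new_dot)
```

Under these assumptions, a cosine-based search is unaffected by whether stored vectors were normalized, whereas anything that scores with a raw dot product would see different values for old and new embeddings.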