Open justinzyw opened 2 months ago
To use a vector store and a graph store together in one query with LlamaIndex, you can use the VectorContextRetriever class, which allows querying the graph store with a vector store query. Here is an example of how you can structure such a query:
from llama_index.core.indices.property_graph.sub_retrievers.vector import VectorContextRetriever
from llama_index.core.schema import QueryBundle

# Initialize the graph store and vector store
graph_store = ...  # Your PropertyGraphStore instance
vector_store = ...  # Your VectorStore instance

# Create a VectorContextRetriever instance
retriever = VectorContextRetriever(
    graph_store=graph_store,
    vector_store=vector_store,  # only consulted if the graph store lacks vector query support
    similarity_top_k=4,
    path_depth=1,
    similarity_score=0.5,
)

# Create a query bundle; query_str is the required field and is the text that gets embedded
query_bundle = QueryBundle(query_str="your query text here")

# Retrieve nodes from the graph store using the vector store query
nodes_with_scores = retriever.retrieve_from_graph(query_bundle)

# Process the retrieved nodes as needed
for node_with_score in nodes_with_scores:
    print(node_with_score.node, node_with_score.score)
This method combines a vector similarity search with a graph traversal, allowing you to query the graph store using vector store queries. The VectorContextRetriever class handles the retrieval process, including embedding the query text, querying the vector store, and fetching related nodes and their relationships from the graph store [1].
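As a rough illustration of that flow, here is a minimal pure-Python sketch. It does not use LlamaIndex at all: the toy `EMBEDDINGS` and `TRIPLETS` data and the `vector_search`, `expand`, and `retrieve` helpers are hypothetical stand-ins for the vector store, the graph store, and the retriever. The embedding step is skipped by passing a query vector directly.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical toy data standing in for the two stores.
# "Vector store": node id -> embedding.
EMBEDDINGS: Dict[str, List[float]] = {
    "alice": [1.0, 0.0],
    "bob": [0.8, 0.6],
    "paris": [0.0, 1.0],
}
# "Graph store": (subject, relation, object) triplets.
TRIPLETS: List[Tuple[str, str, str]] = [
    ("alice", "KNOWS", "bob"),
    ("bob", "LIVES_IN", "paris"),
]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def vector_search(query_embedding: List[float], top_k: int) -> List[Tuple[str, float]]:
    """Step 1: rank node ids by similarity to the query and keep the best top_k."""
    scored = [(node_id, cosine(query_embedding, emb)) for node_id, emb in EMBEDDINGS.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


def expand(seed_ids: List[str]) -> List[Tuple[str, str, str]]:
    """Step 2: depth-1 expansion, returning every triplet that touches a seed node."""
    seeds = set(seed_ids)
    return [t for t in TRIPLETS if t[0] in seeds or t[2] in seeds]


def retrieve(query_embedding: List[float], top_k: int = 1) -> List[Tuple[str, str, str]]:
    """Vector search first, then graph expansion around the hits."""
    hits = vector_search(query_embedding, top_k)
    return expand([node_id for node_id, _ in hits])


result = retrieve([1.0, 0.0], top_k=1)
print(result)
```

The point of the sketch is only the ordering of the two steps: similarity search picks the seed nodes, and the graph supplies the relationships around them.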
For example, you can use the FalkorDBPropertyGraphStore class from the LlamaIndex framework to set up and query your graph store:
from llama_index.core.indices.property_graph import PropertyGraphIndex
from llama_index.graph_stores.falkordb import FalkorDBPropertyGraphStore

# Create a FalkorDBPropertyGraphStore instance
graph_store = FalkorDBPropertyGraphStore(
    url="falkordb://localhost:6379",
    database="falkor",
)

# Create the index; `documents` is the list of Document objects you loaded earlier
index = PropertyGraphIndex.from_documents(
    documents,
    property_graph_store=graph_store,
)
This setup allows you to use both vector and structured queries on your graph store [2].
@dosu, my vector store is a SimpleVectorStore, does it work for the method above?
Yes, the VectorContextRetriever class works with a SimpleVectorStore for using a vector store and a graph store together in one query. It is designed to retrieve nodes based on a query using both a vector store and a graph store. When the graph store does not support vector queries, it falls back to an optional vector store that you provide.
Here is the relevant part of the code that shows how it handles both the graph store and the vector store:
class VectorContextRetriever(BasePGRetriever):
    def __init__(
        self,
        graph_store: PropertyGraphStore,
        include_text: bool = True,
        embed_model: Optional[BaseEmbedding] = None,
        vector_store: Optional[VectorStore] = None,
        similarity_top_k: int = 4,
        path_depth: int = 1,
        similarity_score: Optional[float] = None,
        filters: Optional[MetadataFilters] = None,
        **kwargs: Any,
    ) -> None:
        self._retriever_kwargs = kwargs or {}
        self._embed_model = embed_model or Settings.embed_model
        self._similarity_top_k = similarity_top_k
        self._vector_store = vector_store
        self._path_depth = path_depth
        self._similarity_score = similarity_score
        self._filters = filters
        super().__init__(graph_store=graph_store, include_text=include_text, **kwargs)

    def retrieve_from_graph(self, query_bundle: QueryBundle) -> List[NodeWithScore]:
        vector_store_query = self._get_vector_store_query(query_bundle)
        triplets = []
        kg_ids = []
        new_scores = []
        if self._graph_store.supports_vector_queries:
            result = self._graph_store.vector_query(vector_store_query)
            if len(result) != 2:
                raise ValueError("No nodes returned by vector_query")
            kg_nodes, scores = result
            kg_ids = [node.id for node in kg_nodes]
            triplets = self._graph_store.get_rel_map(
                kg_nodes, depth=self._path_depth, ignore_rels=[KG_SOURCE_REL]
            )
        elif self._vector_store is not None:
            query_result = self._vector_store.query(vector_store_query)
            if query_result.nodes is not None and query_result.similarities is not None:
                kg_ids = self._get_kg_ids(query_result.nodes)
                scores = query_result.similarities
                kg_nodes = self._graph_store.get(ids=kg_ids)
                triplets = self._graph_store.get_rel_map(
                    kg_nodes, depth=self._path_depth, ignore_rels=[KG_SOURCE_REL]
                )
            elif query_result.ids is not None and query_result.similarities is not None:
                kg_ids = query_result.ids
                scores = query_result.similarities
                kg_nodes = self._graph_store.get(ids=kg_ids)
                triplets = self._graph_store.get_rel_map(
                    kg_nodes, depth=self._path_depth, ignore_rels=[KG_SOURCE_REL]
                )
        for triplet in triplets:
            score1 = (
                scores[kg_ids.index(triplet[0].id)] if triplet[0].id in kg_ids else 0.0
            )
            score2 = (
                scores[kg_ids.index(triplet[2].id)] if triplet[2].id in kg_ids else 0.0
            )
            new_scores.append(max(score1, score2))
        assert len(triplets) == len(new_scores)
        # filter by similarity score
        if self._similarity_score:
            filtered_data = [
                (triplet, score)
                for triplet, score in zip(triplets, new_scores)
                if score >= self._similarity_score
            ]
            # sort by score
            top_k = sorted(filtered_data, key=lambda x: x[1], reverse=True)
        else:
            # sort by score
            top_k = sorted(zip(triplets, new_scores), key=lambda x: x[1], reverse=True)
        return self._get_nodes_with_score([x[0] for x in top_k], [x[1] for x in top_k])
This code shows that the VectorContextRetriever can use a vector store (self._vector_store) if the graph store does not support vector queries, allowing it to retrieve and combine data from both stores in one query [1].
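The scoring step at the end of the quoted method can be tried in isolation. The following is a simplified sketch, not the library code: the `score_triplets` helper is hypothetical, and triplets here hold plain string ids, whereas the real method works with node objects that expose an `.id` attribute. Each triplet takes the higher similarity of its two endpoint nodes, endpoints that were not vector hits count as 0.0, and an optional threshold filters the result before sorting.

```python
from typing import List, Optional, Tuple

Triplet = Tuple[str, str, str]  # (subject id, relation, object id), simplified


def score_triplets(
    triplets: List[Triplet],
    kg_ids: List[str],
    scores: List[float],
    similarity_score: Optional[float] = None,
) -> List[Tuple[Triplet, float]]:
    """Score each triplet by its best-matching endpoint, then filter and sort."""
    scored = []
    for triplet in triplets:
        # An endpoint only has a score if it was one of the vector search hits.
        score1 = scores[kg_ids.index(triplet[0])] if triplet[0] in kg_ids else 0.0
        score2 = scores[kg_ids.index(triplet[2])] if triplet[2] in kg_ids else 0.0
        scored.append((triplet, max(score1, score2)))

    # Optional threshold, mirroring the similarity_score argument.
    if similarity_score:
        scored = [pair for pair in scored if pair[1] >= similarity_score]

    # Highest score first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Pretend the vector store returned these ids and similarities.
kg_ids = ["alice", "paris"]
scores = [0.9, 0.4]
triplets = [
    ("bob", "LIVES_IN", "paris"),
    ("alice", "KNOWS", "bob"),
    ("bob", "LIKES", "tea"),
]

ranked = score_triplets(triplets, kg_ids, scores)
filtered = score_triplets(triplets, kg_ids, scores, similarity_score=0.5)
print(ranked)
print(filtered)
```

One consequence worth noting: a triplet whose endpoints were both missed by the vector search still appears in the unfiltered result, with a score of 0.0, so setting `similarity_score` is the way to drop it.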
Question Validation
Question
I have created a vector store and a graph store separately. How can I use both of them in one query?