Open Neutel opened 1 month ago
Hey there, @Neutel! I'm here to help you with any bugs, questions, or contributions. Let's tackle this together while we wait for a human maintainer.
Yes, you can filter in a query engine using a method similar to the ismatch function with regex-like language (Lucene) available in the search endpoint. The OpenSearch vector store supports filter-context queries, and you can use the ExactMatchFilter to apply filters in your queries. Here's an example of how you can create a query engine that only searches certain footnotes using filters:
from llama_index.core import Document
from llama_index.core.vector_stores import MetadataFilters, ExactMatchFilter
import regex as re

# Split the text into paragraphs.
text_chunks = documents[0].text.split("\n\n")

# Create a document for each footnote
footnotes = [
    Document(
        text=chunk,
        id=documents[0].doc_id,
        metadata={"is_footnote": bool(re.search(r"^\s*\[\d+\]\s*", chunk))},
    )
    for chunk in text_chunks
    if bool(re.search(r"^\s*\[\d+\]\s*", chunk))
]

# Insert the footnotes into the index
for f in footnotes:
    index.insert(f)

# Create a query engine that only searches certain footnotes.
footnote_query_engine = index.as_query_engine(
    filters=MetadataFilters(
        filters=[
            ExactMatchFilter(
                key="term", value='{"metadata.is_footnote": "true"}'
            ),
            ExactMatchFilter(
                key="query_string",
                value='{"query": "content: space AND content: lisp"}',
            ),
        ]
    )
)

res = footnote_query_engine.query(
    "What did the author about space aliens and lisp?"
)
res.response
This example demonstrates how to use filters to narrow down search results based on specific metadata and query conditions [1].
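The footnote check in the example above relies on a regular expression that looks for a bracketed number at the start of a paragraph. As a minimal sketch of how that pattern behaves on its own, the snippet below uses only the standard library; the sample text and the helper name is_footnote are made up for illustration and are not part of the original example.

```python
import re

# Same pattern as in the example: optional leading whitespace,
# then a bracketed number such as "[1]".
FOOTNOTE_PATTERN = re.compile(r"^\s*\[\d+\]\s*")


def is_footnote(chunk: str) -> bool:
    """Return True when the chunk starts with a marker like "[12]"."""
    return bool(FOOTNOTE_PATTERN.search(chunk))


# Hypothetical sample text, split into paragraphs the same way as above.
sample_text = (
    "Main body paragraph.\n\n"
    "[1] First footnote.\n\n"
    "  [2] Second footnote.\n\n"
    "See [3] inline."
)
chunks = sample_text.split("\n\n")

# Only paragraphs that begin with a bracketed number are kept.
footnote_chunks = [c for c in chunks if is_footnote(c)]
print(footnote_chunks)
```

Because the pattern is anchored with ^, a bracketed number in the middle of a paragraph does not count as a footnote marker.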
To continue talking to Dosu, mention @dosu.
@dosu the field groups is a comma-separated string, so group1,group2,group3, and the filter I need is one that matches documents if a certain group is in the field, so I think ExactMatchFilter doesn't quite work
To filter documents in a query engine where a field contains a comma-separated string and you want to match if a certain group is present, you can use the MatchAny filter operator. This operator allows you to match any of the values in a list against the field. Here's how you can implement it:
conditions.append(
    FieldCondition(
        key=subfilter.key,
        match=MatchAny(any=values),
    )
)
In this code snippet, subfilter.key is the field you want to filter on, and values is a list of values you want to match against the comma-separated string field. This approach is suitable for scenarios where ExactMatchFilter does not work because it allows for matching any of the specified values within the field [1].
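Independently of which vector store filter ends up being used, the underlying check is "does this comma-separated field contain the group as a whole item". The following is only a sketch of that check in plain Python, under the assumption that the field looks like group1,group2,group3; the helper names group_pattern, has_group, and split_groups are hypothetical and do not come from LlamaIndex or OpenSearch.

```python
import re


def group_pattern(group_id: str) -> re.Pattern:
    """Build a regex that matches group_id only as a whole comma-separated item."""
    # The item must be preceded by the start of the string or a comma,
    # and followed by a comma or the end of the string.
    return re.compile(rf"(?:^|,)\s*{re.escape(group_id)}\s*(?:,|$)")


def has_group(groups_field: str, group_id: str) -> bool:
    """Return True when group_id is one of the items in groups_field."""
    return bool(group_pattern(group_id).search(groups_field))


def split_groups(groups_field: str) -> list[str]:
    """Turn "group1,group2" into ["group1", "group2"], dropping empty items."""
    return [g.strip() for g in groups_field.split(",") if g.strip()]


print(has_group("group1,group2,group3", "group2"))   # whole item present
print(has_group("group1,group22,group3", "group2"))  # only a prefix of another item
print(split_groups("group1, group2,,group3"))
```

One possible design choice is to run something like split_groups at ingestion time and store the result as list-valued metadata, so that a list-membership style filter can be applied at query time. Whether a given vector store integration supports such a filter is an assumption that would need to be checked against its documentation.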
Question Validation
Question
In the search endpoint there is the option to filter using an ismatch that uses a regex-like language (Lucene). I need to do a similar thing in the query engine but can't find a way to do so. The filter I have on the /search endpoint is search.ismatch('/,.{groupId},./', 'groups', 'full', 'any'), where groups is a comma-separated string with group ids.