Closed vatsrahul1001 closed 8 months ago
I checked the backend, and a direct hybrid query to Weaviate consistently retrieves the same results:
from airflow.providers.weaviate.hooks.weaviate import WeaviateHook

_WEAVIATE_CONN_ID = "weaviate_prod"
WEAVIATE_CLASS = "DocsDev"

weaviate_client = WeaviateHook(_WEAVIATE_CONN_ID).get_client()
question = "Can I use Astro CLI to download DAGS from astronomer registry?"

def get_hybrid(question) -> set:
    # Run one hybrid query and return the set of docLinks of the top 5 chunks.
    links = weaviate_client.query.get(WEAVIATE_CLASS, ["docLink"])\
        .with_limit(5)\
        .with_additional(["certainty", "id"])\
        .with_hybrid(
            query=question,
            # alpha=0.4
        )\
        .do()['data']['Get'][WEAVIATE_CLASS]
    return {chunk['docLink'] for chunk in links}

# Repeat the same query and check that the retrieved links never change.
links = get_hybrid(question)
for i in range(10):
    new_links = get_hybrid(question)
    assert links == new_links
I am wondering whether MultiQueryRetriever is generating different questions each time, which would result in different documents being retrieved.
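To illustrate that hypothesis, here is a minimal toy sketch, not the actual retriever code. The corpus, the `retrieve` and `multi_query_retrieve` helpers, and the two sets of rewritten questions are all made up for illustration. It only shows that a fully deterministic search can still return different links when the generated query variants differ between runs.

```python
import re

# Hypothetical toy corpus (docLink -> text); not data from the real index.
CORPUS = {
    "docs/cli/install": "install the astro cli on your machine",
    "docs/cli/deploy": "deploy dags with the astro cli",
    "docs/registry/overview": "browse dags and providers in the astronomer registry",
    "docs/registry/download": "download example dags from the astronomer registry",
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(query, k=1):
    """Deterministic search: rank by token overlap, break ties by docLink."""
    q = tokenize(query)
    ranked = sorted(
        CORPUS,
        key=lambda link: (-len(q & tokenize(CORPUS[link])), link),
    )
    return ranked[:k]


def multi_query_retrieve(variants, k=1):
    """Union of the results retrieved for each query variant."""
    links = set()
    for variant in variants:
        links.update(retrieve(variant, k))
    return links


# Two hypothetical sets of rewrites an LLM might produce for the same question.
run_a = ["how do I install the astro cli", "deploy dags with astro cli"]
run_b = ["download dags from astronomer registry", "browse the astronomer registry"]

links_a = multi_query_retrieve(run_a)
links_b = multi_query_retrieve(run_b)

# The same query always yields the same result, but the two runs do not overlap.
print(retrieve(run_a[0]) == retrieve(run_a[0]))
print(links_a.isdisjoint(links_b))
```

If this is what is happening, logging the generated questions for each run alongside the retrieved links should make it visible.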
This issue relates to the first attempt at hybrid search, which wasn't working correctly. It is resolved and no longer relevant now that hybrid search and the reranker have been implemented correctly.
While testing, we noticed that the same question returned opposite references and responses when asked multiple times.
Incorrect Slack thread
Correct Slack thread
Noticed this multiple times while testing today.