aws-samples / amazon-kendra-langchain-extensions

Samples to build Generative AI applications with LangChain and Amazon Kendra
https://aws.amazon.com/blogs/machine-learning/quickly-build-high-accuracy-generative-ai-applications-on-enterprise-data-using-amazon-kendra-langchain-and-large-language-models/
MIT No Attribution

Different results in retriever and AWS console. #41

Open Sypek opened 11 months ago

Sypek commented 11 months ago

Hi, I found that I get different results when asking the same question (e.g., "What is Amazon Sagemaker?") using:

Are there any additional settings I should configure in my code to get the same results? Moreover, while the results from the Console are accurate, the ones I get from the retriever are noticeably less accurate.

lohitaudhkhasi commented 7 months ago

Hi, I am facing a similar issue. The results from the Kendra search in the AWS Console and from the retriever are not the same.

harshtrun commented 6 months ago

Hi,

The Kendra search console uses the Query API, which is a bit different from the Retrieve API. I found the following in the AWS docs:

Retrieve API is similar to the Query (https://docs.aws.amazon.com/kendra/latest/APIReference/API_Query.html) API. However, by default, the Query API only returns excerpt passages of up to 100 token words. With the Retrieve API, you can retrieve longer passages of up to 200 token words and up to 100 semantically relevant passages. This doesn't include question-answer or FAQ type responses from your index. The passages are text excerpts that can be semantically extracted from multiple documents and multiple parts of the same document. If in extreme cases your documents produce zero passages using the Retrieve API, you can alternatively use the Query API and its types of responses.

So the results should be similar to the console's if you use the Query API instead of the Retrieve API.
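
To check this against your own index, here is a minimal sketch that sends the same question to both APIs and lists which documents each one returns. It assumes a boto3 Kendra client is passed in (created with `boto3.client("kendra")`). The helper names `query_excerpts`, `retrieve_passages`, and `compare_apis` are made up for illustration and are not part of this repo or of LangChain.

```python
from typing import Any, Dict, List


def query_excerpts(client: Any, index_id: str, text: str) -> List[Dict[str, str]]:
    """Call the Query API (the one the console search uses) and flatten the items."""
    response = client.query(IndexId=index_id, QueryText=text)
    return [
        {
            "type": item.get("Type", ""),
            "document_id": item.get("DocumentId", ""),
            # Query items carry the excerpt as a nested {"Text": ...} object.
            "text": item.get("DocumentExcerpt", {}).get("Text", ""),
        }
        for item in response.get("ResultItems", [])
    ]


def retrieve_passages(client: Any, index_id: str, text: str) -> List[Dict[str, str]]:
    """Call the Retrieve API and flatten the returned passages."""
    response = client.retrieve(IndexId=index_id, QueryText=text)
    return [
        {
            "document_id": item.get("DocumentId", ""),
            # Retrieve items carry the passage as a plain string in "Content".
            "text": item.get("Content", ""),
        }
        for item in response.get("ResultItems", [])
    ]


def compare_apis(client: Any, index_id: str, text: str) -> Dict[str, List[str]]:
    """Report which document IDs appear in only one API's results, or in both."""
    query_ids = {r["document_id"] for r in query_excerpts(client, index_id, text)}
    retrieve_ids = {r["document_id"] for r in retrieve_passages(client, index_id, text)}
    return {
        "query_only": sorted(query_ids - retrieve_ids),
        "retrieve_only": sorted(retrieve_ids - query_ids),
        "both": sorted(query_ids & retrieve_ids),
    }


# Example call (needs AWS credentials and a real index ID, so it is left commented out):
# import boto3
# kendra = boto3.client("kendra")
# print(compare_apis(kendra, "<your-index-id>", "What is Amazon Sagemaker?"))
```

If `query_only` contains items whose type is `ANSWER` or `QUESTION_ANSWER`, that alone could explain part of the gap, since the docs quoted above say Retrieve does not return question-answer or FAQ responses.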