aws-solutions / qnabot-on-aws

AWS QnABot is a multi-channel, multi-language conversational interface (chatbot) that responds to your customer's questions, answers, and feedback. The solution allows you to deploy a fully functional chatbot across multiple channels including chat, voice, SMS and Amazon Alexa.
https://aws.amazon.com/solutions/implementations/aws-qnabot
Apache License 2.0
401 stars 253 forks

QnABot won't get any content from Kendra if the Kendra content is not in English. #713

Closed Guillaume-Bourque-Levio closed 5 months ago

Guillaume-Bourque-Levio commented 7 months ago

Describe the bug

From the default QnABot client, the bot only replies with English content.

Even if the QnABot language is set to French and we have French content in Kendra, no data is returned for my French question.

As soon as I add an English data source to Kendra, I see only those answers in the chatbot.

To Reproduce

Create the default QnABot and select French as the language.

Create a Kendra index and add a web crawler with data in French.

Expected behavior

A French response is returned, matching the French content that we can see from the Kendra console if we set the language to French.

Please complete the following information about the solution:

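import boto3

kendra = boto3.client("kendra")

# text and index_id are defined elsewhere.
# Query the index, restricting results to French documents.
response = kendra.retrieve(
    PageSize=5,
    PageNumber=1,
    QueryText=text,
    IndexId=index_id,
    AttributeFilter={
        "EqualsTo": {
            "Key": "_language_code",
            "Value": {"StringValue": "fr"},
        }
    },
)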

Guillaume.

dougtoppin commented 7 months ago

@Guillaume-Bourque-Levio thanks for your report, we will investigate and get back to you

bios6 commented 7 months ago

Hi @Guillaume-Bourque-Levio ,

Please try the following and confirm. Go to the AWS Console and search for Kendra, then look for Data Management on the left-hand side of the console and click Search indexed content. Query the same question there that you are asking QnABot (in French). If it does not retrieve anything, then the issue is on the Kendra side. In that case, try setting the language of the Kendra data sources to English (which is the default); you can still pass in the French data sources and you should be able to query in French.

Thanks!
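The console check described above can also be approximated from code. The following is a minimal sketch, not part of QnABot: `count_results_by_language` is a hypothetical helper that runs the same query once per language code and counts the results. It assumes a boto3 Kendra client (or any object with a compatible `query` method) and uses the `_language_code` attribute filter shown earlier in this thread.

```python
from typing import Dict, Iterable


def count_results_by_language(
    kendra, index_id: str, text: str, codes: Iterable[str] = ("en", "fr")
) -> Dict[str, int]:
    """Run the same query once per language code and count returned items.

    `kendra` is expected to behave like boto3.client("kendra").
    This helper is hypothetical and only meant for troubleshooting.
    """
    counts = {}
    for code in codes:
        response = kendra.query(
            IndexId=index_id,
            QueryText=text,
            # Restrict the search to documents indexed with this language.
            AttributeFilter={
                "EqualsTo": {
                    "Key": "_language_code",
                    "Value": {"StringValue": code},
                }
            },
        )
        counts[code] = len(response.get("ResultItems", []))
    return counts


# Example call (requires AWS credentials and a real index ID):
#   import boto3
#   count_results_by_language(boto3.client("kendra"), "<index-id>", "ma question")
```

If the French count is non-zero while QnABot returns nothing, the index itself is probably fine and the problem lies in how the query reaches Kendra.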

Guillaume-Bourque-Levio commented 7 months ago

Hello,

If I search in the text box I get nothing, because the search only looks in the English documents and my documents are in French.

But if I change the Kendra search language to French, as you can see here, Kendra returns information from my French documents.

So from what I understand, Kendra is fine.

image

TIA

bios6 commented 7 months ago

Hello, in this case can you change KENDRA_INDEXED_DOCUMENTS_LANGUAGES to fr in the QnABot settings? Thanks!

Guillaume-Bourque-Levio commented 7 months ago

Hello, if I replace en with fr I get no results.

Can we specify more than one language in that parameter?

Best

bios6 commented 7 months ago

Hi @Guillaume-Bourque-Levio ,

I'm unable to see any issue with the Kendra integration in QnABot. I just deployed an environment with Swedish as the language in the CloudFormation parameter and a Kendra index with Swedish data, and I am able to query in Swedish (with or without the multi-language setting enabled) as well as in other languages (if I have the multi-language setting enabled). Our recommendation would be to turn on debugging in the settings and to try adjusting the score threshold for Kendra to see if that helps.

To answer your previous question, KENDRA_INDEXED_DOCUMENTS_LANGUAGES supports a comma-separated list. For more information about the QnABot settings, you can check out all parameters and descriptions here: https://github.com/aws-solutions/qnabot-on-aws/blob/main/docs/settings.md . You can also download our Implementation Guide here: https://aws.amazon.com/solutions/implementations/qnabot-on-aws/ , which contains a lot of information about this AWS solution.
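To illustrate what a comma-separated language setting could translate to on the Kendra side, here is a minimal sketch. It is not QnABot's actual implementation: `parse_language_setting` and `build_language_filter` are hypothetical helpers, and the sketch assumes that an `OrAllFilters` clause over `_language_code` is an acceptable way to match several languages in one Kendra query.

```python
from typing import Dict, List


def parse_language_setting(raw: str) -> List[str]:
    """Split a comma-separated value such as "en, fr" into language codes."""
    return [code.strip() for code in raw.split(",") if code.strip()]


def build_language_filter(codes: List[str]) -> Dict:
    """Build a Kendra AttributeFilter that matches any of the given codes.

    Hypothetical helper: returns {} when no codes are given, a single
    EqualsTo clause for one code, and an OrAllFilters wrapper otherwise.
    """
    clauses = [
        {"EqualsTo": {"Key": "_language_code", "Value": {"StringValue": code}}}
        for code in codes
    ]
    if not clauses:
        return {}
    if len(clauses) == 1:
        return clauses[0]
    return {"OrAllFilters": clauses}


# Example: the value "en,fr" yields a filter with two EqualsTo clauses.
language_filter = build_language_filter(parse_language_setting("en,fr"))
```

The resulting dictionary has the same shape as the `AttributeFilter` argument used in the `kendra.retrieve` call earlier in this thread.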

Screenshot 2024-04-16 at 8 43 51 AM Screenshot 2024-04-16 at 8 48 26 AM

Guillaume-Bourque-Levio commented 7 months ago

Hello @bios6

Could you please share the QnABot configuration parameters that allow your solution to get the Swedish information?

TIA.

bios6 commented 7 months ago

Hi @Guillaume-Bourque-Levio ,

Here are some images of how I set up my CloudFormation parameters and QnABot settings. I only recall touching 2-3 settings, mainly for debugging; most are at their defaults.

CFN Parameters:

Screenshot 2024-04-16 at 2 25 14 PM Screenshot 2024-04-16 at 2 26 44 PM

QnABot settings:

Screenshot 2024-04-16 at 2 32 06 PM Screenshot 2024-04-16 at 2 32 28 PM Screenshot 2024-04-16 at 2 33 54 PM Screenshot 2024-04-16 at 2 35 44 PM

I hope this helps!

Guillaume-Bourque-Levio commented 7 months ago

Thanks, I'm now able to get my French answers.

fhoueto-amz commented 7 months ago

@Guillaume-Bourque-Levio Can you please share what the root cause of the issue was on your side? @bios6 suggested changing KENDRA_INDEXED_DOCUMENTS_LANGUAGES a few comments earlier, but you mentioned that it was still not working then.

Guillaume-Bourque-Levio commented 7 months ago

@bios6 can you confirm that your Kendra index had no English content at all when you ran your test?

Also, was Kendra configured from the QnABot console, or did you go into the AWS console and create an index from there?

We are using the crawler v2 extension to bring data into Kendra, with the locale set to fr.

We are still having a hard time getting a working solution when there is no English content in Kendra.

Best

Guillaume-Bourque-Levio commented 7 months ago

@fhoueto-amz ,

I went into the AWS console and created a Kendra index with no particular configuration. Then I added only a data source with the French locale and French data, using the crawler v2.

From the QnABot default web client I tried to get information that I can see in Kendra, but no luck. From the Amazon Kendra search, however, I can see all my French content.

If I add the French content again with the web crawler, but with the language left at the default (en), then I can get answers from the QnABot default web client.

The variable KENDRA_INDEXED_DOCUMENTS_LANGUAGES is either not used or does not affect the way answers are given in our simple setup.

Should we let the QnABot CloudFormation stack create the Kendra index? We created the Kendra configuration outside the CloudFormation stack, but as said earlier the solution works fine if we only have English content, so this is not a missing policy.

That's what I understand right now.

Guillaume-Bourque-Levio commented 7 months ago

Maybe one more note: in Canada the locale sent by the browser is fr_CA, not fr. Could that be the issue, since the Kendra source is fr and not fr_CA?
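For reference, a locale mismatch like the one described above could be avoided by reducing the browser locale to its primary language subtag before comparing it with the Kendra language code. This is a minimal sketch under that assumption only; the thread does not confirm that locale handling was the root cause, and `to_kendra_language_code` is a hypothetical helper, not a QnABot function.

```python
def to_kendra_language_code(locale: str, default: str = "en") -> str:
    """Reduce a locale such as "fr_CA" or "fr-CA" to its primary subtag.

    Hypothetical helper. It falls back to `default` for empty input and
    deliberately drops the region, so it is not suitable for codes where
    the region matters (for example "zh-TW").
    """
    primary = locale.strip().replace("_", "-").split("-")[0].lower()
    return primary or default


# Example: both spellings of the Canadian French locale map to the same code.
codes = [to_kendra_language_code(loc) for loc in ("fr_CA", "fr-CA", "fr")]
```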

fhoueto-amz commented 7 months ago

@Guillaume-Bourque-Levio, currently you need to index in English for this to work, as you did. We are looking into the issue.

fhoueto-amz commented 5 months ago

Fixed in v6.0.0