aws-samples / aws-lex-web-ui

Sample Amazon Lex chat bot web interface

ResponseCard LexV2 input controls not visible in webui (Question) #398

Closed anindita-mahapatra closed 8 months ago

anindita-mahapatra commented 2 years ago

Hello! I see this question asked earlier, but I'm not getting past the issue, so I'm seeking some clarification and help here:

1) I have a Lex V2 bot.
2) I can test it using both text and voice on the Lex V2 test console, and it works fine.
3) It works with text on the web UI, but the input controls (the text and voice controls at the bottom) are not visible, which means I can only click on the buttons and cannot use the text/voice controls.

Are there any specific settings in the 'ui' section of lex-web-ui-loader-config.json that I should be using to control this? Once I finish with the response card, the controls appear at the bottom of the page.

Are there any specific values to be sent from the Lambda? I saw some references to appContext, but it looked like it was for Lex V1. I did not see any reference to sessionAttributes['appContext'] in the official documentation. Is that really needed? My elicitSlot message looks something like this:

    {
        'contentType': 'ImageResponseCard',
        'imageResponseCard': {
            'title': 'How can we help you?',
            'imageUrl': '...',
            'buttons': [{'text': 'A', 'value': ' ...'}, ...]
        },
        'content': 'some text'
    }

I've been struggling with this for a while, so any help would be appreciated. Thanks!
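For reference, a minimal sketch (with a hypothetical intent object and slot name) of how a full Lex V2 Lambda ElicitSlot response carrying this kind of message is typically structured; note that the plain text travels as a separate PlainText message alongside the ImageResponseCard:

    # Sketch of a Lex V2 Lambda ElicitSlot response. The intent object and
    # session attributes come from the incoming event; 'HelpTopic' is a
    # hypothetical slot name used only for illustration.
    def elicit_slot_response(intent, session_attributes):
        return {
            'sessionState': {
                'dialogAction': {
                    'type': 'ElicitSlot',
                    'slotToElicit': 'HelpTopic',
                },
                'intent': intent,
                'sessionAttributes': session_attributes,
            },
            'messages': [
                # Plain text is sent as its own message...
                {'contentType': 'PlainText', 'content': 'some text'},
                # ...and the card is a separate ImageResponseCard message.
                {
                    'contentType': 'ImageResponseCard',
                    'imageResponseCard': {
                        'title': 'How can we help you?',
                        'imageUrl': 'https://example.com/card.png',  # placeholder
                        'buttons': [{'text': 'A', 'value': 'A'}],
                    },
                },
            ],
        }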

bobpskier commented 2 years ago

@anindita-mahapatra Is your elicitSlot example content being passed back from a Lambda handler?

The lex-web-ui, while in voice mode, will not display image response cards unless they are passed using session attributes in appContext.responseCard. See https://github.com/aws-samples/aws-lex-web-ui/blob/master/lex-web-ui/README.md#response-cards. This is a limitation of how Lex processes postContent / recognizeUtterance (voice): Lex does not return the response card to the UI when in voice mode, so the only way to get it back to a web UI is to pass it as a session attribute. The appContext attribute is not in the official Lex documentation because it is only used in lex-web-ui to work around this API limitation. You can always file an enhancement request with the Lex service team, but as far as I am aware this has been the case since Lex was first released.
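A minimal sketch of what that might look like on the Lambda side, assuming the generic-attachment card format described in the linked README (field names should be verified against it):

    import json

    # Sketch: pass the card back via the appContext session attribute so the
    # web UI can render it in voice mode, where Lex itself does not return
    # response cards.
    def set_app_context_card(session_attributes):
        session_attributes['appContext'] = json.dumps({
            'responseCard': {
                'version': '1',
                'contentType': 'application/vnd.amazonaws.card.generic',
                'genericAttachments': [{
                    'title': 'How can we help you?',
                    'imageUrl': 'https://example.com/card.png',  # placeholder
                    'buttons': [{'text': 'A', 'value': 'A'}],
                }],
            },
        })
        return session_attributes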

Also, the lex-web-ui text input field will not be available once voice mode is started; instead, the interaction is entirely via voice until the user exits voice mode or the intent is fulfilled.

anindita-mahapatra commented 2 years ago

Thanks @bobpskier. I tested some more, and here are my findings:

1) Even after the speech is recognized correctly and shown in the human message bubble, there is an additional 'sorry, there is an error' message coming from the web UI (I don't see it when testing in the Lex V2 console).

The web UI then makes this Polly request:

    Request URL: https://polly.us-east-1.amazonaws.com/v1/speech
    Request Method: POST
    Status Code: 200 OK

with the request payload:

    {
        "Text": "There was an error",
        "VoiceId": "Salli",
        "OutputFormat": "ogg_vorbis",
        "TextType": "text"
    }

How can we debug this? There is no network error in the browser inspector and no error in the Lambda, so what is it unhappy about?

2) With the voice option, the bot does not wait long enough. Looking at https://github.com/aws-samples/aws-lex-web-ui/blob/master/lex-web-ui/src/config/index.js, I added "quietTimeMin": "0.5", "quietThreshold": "0.002", but it didn't seem to matter.

I'm wondering if I could request 15 minutes of your time to show you my setup and seek your guidance. Thanks a lot!

bobpskier commented 2 years ago

@anindita-mahapatra With respect to "The first speech instance was not being recognized": is this in regard to using the configuration parameter "initialSpeechInstruction"? If not, Polly synthesize permission is not needed to interact with Lex by voice.

In order to use initialSpeechInstruction for unauthenticated users, Polly synthesize permission needs to be added to the Cognito no-auth role. It is already configured in the authenticated role and would not need to be added there. Keep in mind that once you add it to the no-auth role, anyone can use your Cognito no-auth role to request Polly synthesize operations, which might not be what you want.
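If you do decide to enable this for unauthenticated users, the addition to the Cognito no-auth role would be a minimal IAM statement along these lines (a sketch, not the policy template shipped with the project):

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "polly:SynthesizeSpeech",
                "Resource": "*"
            }
        ]
    }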

The use of ['sessionState']['sessionAttributes']['appContext'] was probably incorrect, leading to issues with voice and text modes. I would have to look at the response payload to assess what might be wrong.

For 1 above, what is the data being shown? Is this a request object and a response payload? The only time the LexWebUi will make a request to Polly is for the initialSpeechInstruction. It looks like Polly returned an error. Can you supply the request payload for this POST operation?

Can you clarify what is meant in 2 by "does not wait long enough"? Is it that the user waits for, say, 7 seconds before speaking and the bot times out waiting for input? If yes, have a look at adjusting recordingTimeMin to see if that has an effect.

    // Minimum recording time in seconds.
    // Used before evaluating if the line is quiet to allow initial pauses
    // before speech
    recordingTimeMin: 2.5,
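As a sketch, assuming your deployment reads its recorder settings from lex-web-ui-loader-config.json and merges them over the defaults in src/config/index.js (check that file for the exact keys), the override could look like the following, with numeric values rather than strings; the quietTimeMin / quietThreshold values you tried belong in the same section:

    {
        "recorder": {
            "recordingTimeMin": 4,
            "quietTimeMin": 0.5,
            "quietThreshold": 0.002
        }
    }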

Normally this type of debugging and adjustment takes a minimum of 30 minutes, and it could take a couple of hours to look at the implementation and make suggestions. I'm really not available for this type of project work unless I set up an SOW, if that is of interest.

anindita-mahapatra commented 2 years ago

Hello @bobpskier, yes, I would like to consider your offer. Please email me at anin_dita@rocketmail.com so we can take this offline. Thanks!