aws-samples / aws-lex-web-ui

Sample Amazon Lex chat bot web interface

Image Response cards and Title are not displayed on Web UI for Lex V2 using voice #432

Closed bmanoj-aws closed 6 months ago

bmanoj-aws commented 2 years ago

Hi Team, the Lex bot does not display image response cards and titles when using voice, but it does display them when using text.

Attaching the piece of lambda code used in my bot and screenshots below.

attributesdata = event['sessionState']['sessionAttributes']
attributesdataloaded = json.dumps(attributesdata)
return {
    "sessionState": {
        "sessionAttributes": {
            "session_attributes": attributesdataloaded
        },
        "dialogAction": {
            "type": "ElicitSlot",
            "slotToElicit": "orderselect"
        },
        "intent": {
            "name": event['sessionState']['intent']['name'],
            "slots": event['sessionState']['intent']['slots'],
        }
    },
    "messages": [{
        "contentType": "ImageResponseCard",
        "imageResponseCard": {
            "title": "data",
            "imageUrl": "https://images.jpg"
        }
    }]
}

Lex Response With Text:

[Screenshot: withtext]

Lex Response With Voice:

[Screenshot: withvoice]

bobpskier commented 2 years ago

@bmakumar This is how the Lex service API works with respect to response cards and voice. The postContent (V1) and recognizeUtterance (V2) APIs never return responseCards, even when a Lambda provides them.

To work around this limitation in the Lex voice interface, you must return responseCards in a special session attribute.

See https://github.com/aws-samples/aws-lex-web-ui/blob/master/lex-web-ui/README.md#response-cards.

If your Lambda sets the appContext session attribute to a serialized JSON string that contains a responseCard key, LexWebUi will display that response card.

bmanoj-aws commented 2 years ago

OK, so you mean we need to add the line below to our Lambda code: response['sessionAttributes']['appContext'] = json.dumps({'responseCard': response_card}). But in Lex V2 we do not have an attribute called 'appContext'. Also, if we set ['sessionAttributes']['appContext'] as above, which field should it be mapped to? If possible, can you please share a working snippet for Lex V2 that displays response cards in the format I am using above? That would be helpful. Thanks

bobpskier commented 2 years ago

A V2 response has both an array of "messages" and a "sessionState" object, which holds "sessionAttributes". One of the sessionAttributes is called "appContext". The "appContext" value is a string containing a JSON payload. An example of a response with both "messages" and a "sessionState" containing "appContext" is presented below.

"messages": [
    {
      "content": "Ok. Do you want to specify fuel in kilograms per second or pounds per second?",
      "contentType": "PlainText"
    },
    {
      "contentType": "ImageResponseCard",
      "imageResponseCard": {
        "buttons": [
          {
            "text": "kg",
            "value": "1"
          },
          {
            "text": "LBS",
            "value": "2"
          }
        ],
        "title": "Options"
      }
    }
  ],
  "sessionState": {
    "dialogAction": {
      "slotToElicit": "qnaslot",
      "type": "ElicitSlot"
    },
    "intent": {
      "confirmationState": "None",
      "name": "QnaIntent",
      "slots": {
        "qnaslot": null
      },
      "state": "InProgress"
    },
    "originatingRequestId": "dc91b362-a804-471b-8259-e1ada9a78193",
    "sessionAttributes": {
      "appContext": "{\"altMessages\":{\"ssml\":\"<speak>Do you want to specify each fuel rate input as kilograms per second or pounds per second, say 1 for kilograms or 2 for pounds.</speak>\",\"markdown\":\"Do you want to specify each **fuel rate input** as lbs per second or kg per second?\"},\"responseCard\":{\"version\":\"1\",\"contentType\":\"application/vnd.amazonaws.card.generic\",\"genericAttachments\":[{\"title\":\"Options\",\"buttons\":[{\"text\":\"kg\",\"value\":\"1\"},{\"text\":\"LBS\",\"value\":\"2\"}]}]}}",
      "connect_nextPrompt": "",
      "qnabot_gotanswer": "true",
      "qnabot_qid": "fuel.rate.type",
      "qnabotcontext": "{\"previous\":{\"qid\":\"fuel.rate.type\",\"q\":\"qid::fuel.rate.type\"},\"navigation\":{\"next\":\"\",\"previous\":[],\"hasParent\":true},\"elicitResponse\":{\"responsebot\":\"QNANumberNoConfirm\",\"namespace\":\"qnabotcontext.fuel.type\",\"chainingConfig\":\"c6baafb209ac66cb4597b2c44b5d379b79b7f81e86625296d20db18892799bf6d3e0c85a659eec259e9b4ddcb8d26e2epJ8C26XQPGI3GLbNA7sKDNsNM1e1d4FUdt6BKkwgaGDCNrpsax9ZH0Od8jVZKT+1iajUfn7468c6HUaKlyFlBEwzRhElNfu6ysRVqoWPfK+CW4dpm3V7YtpWX8FMbbuFIcAEqPdTkB5XEzbxfXjZDg==\"},\"landername\":{\"FreeText\":\"oliver 3\",\"Sentiment\":\"NEUTRAL\",\"SentimentPositive\":0.010331206023693085,\"SentimentNegative\":0.0016033018473535776,\"SentimentNeutral\":0.987984299659729,\"SentimentMixed\":0.00008130470814649016}}",
      "replay": "true",
      "undefined": "{}",
      "userAgent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:104.0) Gecko/20100101 Firefox/104.0"
    }
  }

In this example, "appContext" is a string value serialized from the following JSON object:

{
  "altMessages": {
    "ssml": "<speak>Do you want to specify each fuel rate input as kilograms per second or pounds per second, say 1 for kilograms or 2 for pounds.</speak>",
    "markdown": "Do you want to specify each **fuel rate input** as lbs per second or kg per second?"
  },
  "responseCard": {
    "version": "1",
    "contentType": "application/vnd.amazonaws.card.generic",
    "genericAttachments": [
      {
        "title": "Options",
        "buttons": [
          {
            "text": "kg",
            "value": "1"
          },
          {
            "text": "LBS",
            "value": "2"
          }
        ]
      }
    ]
  }
}

There are multiple elements that lex-web-ui supports in "appContext".

"altMessages" allows ssml and/or markdown to be defined that will be rendered.

In addition, "appContext" also supports V1 "responseCard" definition as shown above.
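
A Lambda that already builds V2 "ImageResponseCard" messages could mirror them into "appContext" instead of maintaining two card definitions by hand. The following is a minimal Python sketch under that assumption. The function name mirror_cards_into_app_context is hypothetical (it is not part of lex-web-ui or any AWS SDK), and it copies only the title, imageUrl and buttons fields.

```python
import json


def mirror_cards_into_app_context(messages, session_attributes=None):
    """Copy V2 ImageResponseCard messages into appContext as a V1-style responseCard."""
    attachments = []
    for message in messages:
        if message.get("contentType") != "ImageResponseCard":
            continue
        card = message["imageResponseCard"]
        attachment = {"title": card.get("title", "")}
        # Only copy optional fields that are actually present on the V2 card.
        for key in ("imageUrl", "buttons"):
            if key in card:
                attachment[key] = card[key]
        attachments.append(attachment)

    attributes = dict(session_attributes or {})
    if attachments:
        # Preserve any existing appContext entries such as altMessages.
        app_context = json.loads(attributes.get("appContext") or "{}")
        app_context["responseCard"] = {
            "version": "1",
            "contentType": "application/vnd.amazonaws.card.generic",
            "genericAttachments": attachments,
        }
        # Session attribute values must be strings, so serialize the object.
        attributes["appContext"] = json.dumps(app_context)
    return attributes
```

The returned dictionary would then be used as "sessionAttributes" in the Lambda response, next to the unchanged "messages" array.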

To display responseCards in a voice response in lex-web-ui, you must add "appContext" to your session attributes. Its value must be a string produced by serializing a JSON object. That object can be as simple as:

{
  "responseCard": {
    "version": "1",
    "contentType": "application/vnd.amazonaws.card.generic",
    "genericAttachments": [
      {
        "title": "Options",
        "buttons": [
          {
            "text": "Button One",
            "value": "1"
          },
          {
            "text": "Button Two",
            "value": "2"
          }
        ]
      }
    ]
  }
}
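
For a Python Lambda on Lex V2, the object above could be built and serialized roughly as follows. This is a sketch, not a confirmed lex-web-ui recipe: build_app_context and elicit_slot_with_card are hypothetical helper names, and the event is assumed to follow the Lex V2 Lambda input format with "sessionState" and "intent" present.

```python
import json


def build_app_context(title, image_url=None, buttons=None):
    """Return the serialized appContext string for a single response card."""
    attachment = {"title": title}
    if image_url:
        attachment["imageUrl"] = image_url
    if buttons:
        attachment["buttons"] = buttons
    return json.dumps({
        "responseCard": {
            "version": "1",
            "contentType": "application/vnd.amazonaws.card.generic",
            "genericAttachments": [attachment],
        }
    })


def elicit_slot_with_card(event, slot_to_elicit, messages, app_context):
    """Build a Lex V2 ElicitSlot response that carries appContext."""
    # Keep the attributes already in the session and add appContext to them.
    attributes = dict(event["sessionState"].get("sessionAttributes") or {})
    attributes["appContext"] = app_context
    intent = event["sessionState"]["intent"]
    return {
        "sessionState": {
            "sessionAttributes": attributes,
            "dialogAction": {"type": "ElicitSlot", "slotToElicit": slot_to_elicit},
            "intent": {"name": intent["name"], "slots": intent["slots"]},
        },
        "messages": messages,
    }
```

A handler might call elicit_slot_with_card(event, "orderselect", messages, build_app_context("data", image_url="https://images.jpg")), reusing the slot name and image URL from the original question.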

Hope this helps.

bmanoj-aws commented 2 years ago

OK, so in that case appContext is something we define ourselves and add to the bot as a session attribute. Can you confirm which of the fields below, apart from appContext and userAgent, are mandatory? Thanks.

"sessionAttributes": { "appContext": "{\"altMessages\":{\"ssml\":\"Do you want to specify each fuel rate input as kilograms per second or pounds per second, say 1 for kilograms or 2 for pounds.\",\"markdown\":\"Do you want to specify each fuel rate input as lbs per second or kg per second?\"},\"responseCard\":{\"version\":\"1\",\"contentType\":\"application/vnd.amazonaws.card.generic\",\"genericAttachments\":[{\"title\":\"Options\",\"buttons\":[{\"text\":\"kg\",\"value\":\"1\"},{\"text\":\"LBS\",\"value\":\"2\"}]}]}}", "connect_nextPrompt": "", "qnabot_gotanswer": "true", "qnabot_qid": "fuel.rate.type", "qnabotcontext": "{\"previous\":{\"qid\":\"fuel.rate.type\",\"q\":\"qid::fuel.rate.type\"},\"navigation\":{\"next\":\"\",\"previous\":[],\"hasParent\":true},\"elicitResponse\":{\"responsebot\":\"QNANumberNoConfirm\",\"namespace\":\"qnabotcontext.fuel.type\",\"chainingConfig\":\"c6baafb209ac66cb4597b2c44b5d379b79b7f81e86625296d20db18892799bf6d3e0c85a659eec259e9b4ddcb8d26e2epJ8C26XQPGI3GLbNA7sKDNsNM1e1d4FUdt6BKkwgaGDCNrpsax9ZH0Od8jVZKT+1iajUfn7468c6HUaKlyFlBEwzRhElNfu6ysRVqoWPfK+CW4dpm3V7YtpWX8FMbbuFIcAEqPdTkB5XEzbxfXjZDg==\"},\"landername\":{\"FreeText\":\"oliver 3\",\"Sentiment\":\"NEUTRAL\",\"SentimentPositive\":0.010331206023693085,\"SentimentNegative\":0.0016033018473535776,\"SentimentNeutral\":0.987984299659729,\"SentimentMixed\":0.00008130470814649016}}", "replay": "true", "undefined": "{}", "userAgent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:104.0) Gecko/20100101 Firefox/104.0" }

bmanoj-aws commented 2 years ago

I added a couple of fields to the session attributes and tested it, and observed the following. When I invoke the Lex web UI via chat, I get the right response, but when I invoke it via voice, it displays the data multiple times. Attaching the code and screenshots below. Thanks.

return{
        "sessionState": {
            "sessionAttributes":{
                "appContext": "{\"responseCard\":{\"version\":\"1\",\"contentType\":\"application/vnd.amazonaws.card.generic\",\"genericAttachments\":[{\"title\": \"msg\",\"imageUrl\":\"https://imageslexdisplay.jpg\"}]}}",
                "userAgent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:104.0) Gecko/20100101 Firefox/104.0"
                },
            "dialogAction": {
                "type": "ElicitSlot",
                "slotToElicit": "orderselect"
            },
            "intent": {
                "name": event['sessionState']['intent']['name'],
                "slots": event['sessionState']['intent']['slots'],
            }
        },
        "messages": msg
    }

Lex Response with Chat:

[Screenshot: withtext_responsecard]

Lex Response with Voice:

[Screenshot: withvoice_responsecard]

bmanoj-aws commented 2 years ago

@bobpskier I was able to get the response cards via voice as well, but Lex does not read out the title.

Is there an option to enable that would make it read out the title as well?

Attaching the screenshot and code below. Let me know if any other information is required. Thanks!

for i in emptylist1:
    temp = {"title": i['messagecontent'], "imageUrl": "https://imageslexdisplay.jpg"}
    msg1.append(temp)

# json.dumps serializes the list; concatenating a str with the list msg1 would raise a TypeError
sessiondata = json.dumps({"responseCard": {"version": "1", "contentType": "application/vnd.amazonaws.card.generic", "genericAttachments": msg1}})

return {
    "sessionState": {
        "sessionAttributes": {
            "appContext": sessiondata,
            "userAgent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:104.0) Gecko/20100101 Firefox/104.0"
        },
        "dialogAction": {
            "type": "ElicitSlot",
            "slotToElicit": "orderselect"
        },
        "intent": {
            "name": event['sessionState']['intent']['name'],
            "slots": event['sessionState']['intent']['slots'],
        }
    },
    "messages": msg
}

[Screenshot: image]

bobpskier commented 2 years ago

@bmakumar The Lex service generates the audio response and as I mentioned the Lex service does not process responseCards when in voice mode. Anything in the response card is ignored by Lex itself in this scenario including the title. There is no option to enable this.

If you want the title spoken in audio, you'll need to add this in the returned messages when in voice mode. The input request in your Lambda can be used to check for the inputMode. If it is set to 'Speech' then append the titles to your message in the response. See https://docs.aws.amazon.com/lexv2/latest/dg/lambda.html#lambda-input-format.

if (request.inputMode === 'Speech') { ...... }
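
In a Python Lambda, the same check could look roughly like the sketch below. It assumes the Lex V2 input event exposes "inputMode" at the top level, as described in the linked Lambda input format. The helper name add_spoken_titles is hypothetical. It appends the titles to the first PlainText message, or adds a new PlainText message when none exists; SSML messages are left untouched.

```python
import copy


def add_spoken_titles(event, messages, titles):
    """Return messages with card titles added as text when the request came in by voice."""
    if event.get("inputMode") != "Speech" or not titles:
        return messages
    spoken = " ".join(titles)
    # Work on a copy so the caller's message list is not modified.
    updated = copy.deepcopy(messages)
    for message in updated:
        if message.get("contentType") == "PlainText":
            message["content"] = message["content"] + " " + spoken
            return updated
    # No PlainText message to extend, so add one ahead of any cards.
    return [{"contentType": "PlainText", "content": spoken}] + updated
```

The result would be passed as "messages" in the Lambda response, so that Lex includes the titles in the audio it generates while the card itself still travels in "appContext".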
bmanoj-aws commented 2 years ago

@bobpskier Thanks for the above response, I will try that.

Also, I am trying to display the response cards but it does not seem to work. Attaching the code below.

sessiondata ="{\"altMessages\":{\"ssml\":\"Is this the correct location: "+ location+" ?\",},\"responseCard\":{\"version\":\"1\",\"contentType\":\"application/vnd.amazonaws.card.generic\",\"genericAttachments\":[{\"title\":\"\",\"buttons\":[{\"text\":\"yes\",\"value\":\"yes\"},{\"text\":\"no\",\"value\":\"no\"}]}]}}",

return {
      "sessionState": {
        "sessionAttributes":{
                "appContext": sessiondata,
                "userAgent": data,
                "connect_nextPrompt": "Here are a few choices from the company catalog.",
                },
        "dialogAction": {
            "type": "ElicitSlot",
            "slotToElicit": "default_delivery_location"
        },
        "intent": {
            "name": event['sessionState']['intent']['name'],
            "slots": event['sessionState']['intent']['slots']
        }
    },

        "messages":[

            {
              "contentType": "SSML",
              "content": "<speak>Is this the correct location: <say-as interpret-as='digits'>"+ location+"</say-as> ?</speak>"
            },

            # {
            #   "contentType": "PlainText",
            #   "content": "Type yes or no."
            # },
            {
                "contentType": "ImageResponseCard",
                "imageResponseCard": {
                    "title": "Type yes or no",
                    "buttons": [
                        {
                            "text": "yes",
                            "value": "yes"
                        },
                        {
                            "text": "no",
                            "value": "no"
                        }
                ]
            }
      }
        ]
    }
bobpskier commented 2 years ago

@bmakumar I'm going to assume you are using nodejs/javascript. I would create sessiondata as a regular javascript object.

let sessiondata = {
    "altMessages": {"ssml": "Is this the correct location: " + location + " ?",},
    "responseCard": {
        "version": "1",
        "contentType": "application/vnd.amazonaws.card.generic",
        "genericAttachments": [{
            "title": "Options",
            "buttons": [{"text": "yes", "value": "yes"}, {"text": "no", "value": "no"}]
        }]
    }
};

Then when you assign sessiondata to appContext you would stringify the object.

"appContext": JSON.stringify(sessiondata)

I believe title is a required field in genericAttachments, which is why it has a non-empty string above. LexWebUi can be configured to hide the title.

bmanoj-aws commented 2 years ago

@bobpskier We are using Python, and we are already building the session data as a string, as shown in the code shared above.

sessiondata ="{\"altMessages\":{\"ssml\":\"Is this the correct location: "+ location+" ?\",},\"responseCard\":{\"version\":\"1\",\"contentType\":\"application/vnd.amazonaws.card.generic\",\"genericAttachments\":[{\"title\":\"\",\"buttons\":[{\"text\":\"yes\",\"value\":\"yes\"},{\"text\":\"no\",\"value\":\"no\"}]}]}}",

bmanoj-aws commented 2 years ago

@bobpskier Is there any help you can provide on the above issue?

Thanks !

bobpskier commented 2 years ago

@bmakumar I believe your sessiondata assignment above is not valid. It is much easier to create a valid Python dictionary such as:

session_data = {"altMessages": {"ssml": "Is this the correct location: " + location + " ?", },
                    "responseCard": {"version": "1", "contentType": "application/vnd.amazonaws.card.generic",
                                     "genericAttachments": [{"title": "", "buttons": [{"text": "yes", "value": "yes"},
                                                                                      {"text": "no", "value": "no"}]}]}}

And then make the assignment

     return {
        "sessionState": {
            "sessionAttributes": {
                "appContext": json.dumps(session_data),

Notice the use of json.dumps() to create a stringified value of the session_data dictionary. If you look at the return value for your function using json.dumps() you'll see a valid stringified version of appContext. Notice the escaped double quotes used throughout the string value for appContext.

 {"sessionState": {"sessionAttributes": {"appContext": "{\"altMessages\": {\"ssml\": \"Is this the correct location: test location ?\"}, \"responseCard\": {\"version\": \"1\", \"contentType\": \"application/vnd.amazonaws.card.generic\", \"genericAttachments\": [{\"title\": \"\", \"buttons\": [{\"text\": \"yes\", \"value\": \"yes\"}, {\"text\": \"no\", \"value\": \"no\"}]}]}}", "userAgent": "userAgentData", "connect_nextPrompt": "Here are a few choices from the company catalog."}, "dialogAction": {"type": "ElicitSlot", "slotToElicit": "default_delivery_location"}, "intent": {"name": "intentName", "slots": "slotInfo"}}, "messages": [{"contentType": "SSML", "content": "<speak>Is this the correct location: <say-as interpret-as='digits'>test location</say-as> ?</speak>"}, {"contentType": "ImageResponseCard", "imageResponseCard": {"title": "Type yes or no", "buttons": [{"text": "yes", "value": "yes"}, {"text": "no", "value": "no"}]}}]}
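
To illustrate why the hand-built string is fragile, the sketch below reproduces two details visible in the earlier sessiondata assignment: a trailing comma after the string literal and a trailing comma inside the JSON text. It uses only Python's standard json module. The variable names are illustrative, and the assumption is that whatever consumes "appContext" expects strictly valid JSON.

```python
import json

location = "test location"

# A trailing comma after a string literal makes Python build a one-element tuple,
# so the session attribute would no longer be a plain string.
as_tuple = "{\"title\": \"" + location + "\"}",
kind = type(as_tuple).__name__

# A trailing comma inside the JSON text itself is rejected by a strict parser.
hand_built = "{\"altMessages\":{\"ssml\":\"Is this the correct location: " + location + " ?\",}}"
try:
    json.loads(hand_built)
    hand_built_is_valid = True
except json.JSONDecodeError:
    hand_built_is_valid = False

# Serializing a dictionary avoids both problems and round-trips cleanly.
serialized = json.dumps({"altMessages": {"ssml": "Is this the correct location: " + location + " ?"}})
round_trip = json.loads(serialized)
```

Running json.loads on a candidate "appContext" value before returning it from the Lambda is a cheap way to catch this kind of mistake during development.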