BotBuilderCommunity / botbuilder-community-dotnet

Part of the Bot Builder Community Project. Repository for extensions for the Bot Builder .NET SDK, including middleware, dialogs, recognizers and more.
MIT License

How to handle names (locations, cities...) with the alexa adapter? #239

Closed C0d1ngJammer closed 4 years ago

C0d1ngJammer commented 4 years ago

Hello there, my bot is using phrases like "Get the next bus arriving at <'station'>" or "Connection from <'station1'> to <'station2'>". When defining the phrase as follows:

{
    "interactionModel": {
        "languageModel": {
            "invocationName": "<YOUR SKILL INVOCATION NAME>",
            "intents": [
                {
                    "name": "GetUserIntent",
                    "slots": [
                        {
                            "name": "phrase",
                            "type": "phrase"
                        }
                    ],
                    "samples": [
                        "{phrase}"
                    ]
                },
                {
                    "name": "AMAZON.StopIntent",
                    "samples": []
                }
            ],
            "types": [
                {
                    "name": "phrase",
                    "values": [
                        {
                            "name": {
                                "value": "<EXAMPLE PHRASE>"
                            }
                        },
                        {
                            "name": {
                                "value": "<EXAMPLE PHRASE>"
                            }
                        },
                        {
                            "name": {
                                "value": "<EXAMPLE PHRASE>"
                            }
                        }
                    ]
                }
            ]
        }
    }
}

(from the Amazon Alexa Adapter on GitHub)

Alexa, of course, doesn't understand the stations. How do I define the stations in this case when using the "phrase" slot? I know that I can achieve that with a custom intent (where the stations are listed), but then the phrase is empty and I would have to rewrite the "AlexaRequestMapper" class ("feature/adopt-alexadotnet" branch) to fit my needs, which might be the wrong way to go.

The model below does not work, of course; in other words, it does not pass the data via the "phrase" slot:

{
    "interactionModel": {
        "languageModel": {
            "invocationName": "mein test",
            "intents": [
                {
                    "name": "GetUserIntent",
                    "slots": [
                        {
                            "name": "phrase",
                            "type": "phrase"
                        },
                        {
                            "name": "stations",
                            "type": "Station"
                        }
                    ],
                    "samples": [
                        "{phrase}"
                    ]
                },
                {
                    "name": "AMAZON.StopIntent",
                    "samples": []
                }
            ],
            "types": [
                {
                    "name": "phrase",
                    "values": [
                        {
                            "name": {
                                "value": "Widdumhof"
                            }
                        },
                        {
                            "name": {
                                "value": "Miedelsbach-Steinenberg"
                            }
                        }
                    ]
                },
                {
                    "name": "Station",
                    "values": [
                        {
                            "name": {
                                "value": "Widdumhof"
                            }
                        },
                        {
                            "name": {
                                "value": "Miedelsbach-Steinenberg"
                            }
                        }

                    ]
                }
            ]
        }
    }
}

How am I supposed to handle this kind of problem? Do you have any tips? Thanks a lot.

-C0d1ngJammer

JaredLLewis commented 4 years ago

If you'd like to stay within the realm of Alexa, you may want to consider adding middleware to transform the intent request from Alexa into a message activity (see this link for a code example). You will need to add a new, well-defined intent with good utterances (so the NLP doesn't confuse it with the GetIntent).

If you are going to add a lot of complex intents, you may want to stay away from Alexa's intent management system and opt for something else like LUIS that is more updated and fleshed out. You can send your phrases via API.

garypretty commented 4 years ago

Hi @C0d1ngJammer

Really sorry for the delay in coming back to you on this issue. In the first instance I would suggest that you add the stations entity (either as a list entity or a simple entity) into a LUIS model and run the utterances through that. This will allow you to continue to use the default setup for the adapter and still extract your stations on incoming activities.

If this isn't working, maybe because Alexa is simply misunderstanding the words (so by the time they get to you they are not the actual station names), then you might need to consider adding your own custom slot as you suggested. You would not need to totally rewrite the AlexaRequestMapper though. To do this you can override the method RequestToActivity (https://github.com/BotBuilderCommunity/botbuilder-community-dotnet/blob/9f841f495a1faba02ec7a4c3505e7ffb65d96562/libraries/Bot.Builder.Community.Adapters.Alexa/AlexaAdapter.cs#L163-L166) on your AlexaAdapterWithErrorHandler class.

This method just calls the RequestToActivity method on the request mapper (https://github.com/BotBuilderCommunity/botbuilder-community-dotnet/blob/9f841f495a1faba02ec7a4c3505e7ffb65d96562/libraries/Bot.Builder.Community.Adapters.Alexa.Core/AlexaRequestMapper.cs#L39-L69), so you have the opportunity to change the default functionality that way.

Please let me know how you get on with this and I or @NickEricson would be happy to assist further.

C0d1ngJammer commented 4 years ago

Hello, thank you for your responses. @JaredLLewis: How can I stay away from Alexa's intent management system? If Alexa doesn't understand the user and sends the wrong phrase to my application, LUIS isn't able to handle the request.

@garypretty: Let me break down the main thing I am trying to achieve here:

I am trying to create a bot that can handle multiple channels with the same core logic. The core bot application always uses LUIS to understand the intent of the user. Right now I want to connect it to Alexa, which is why I am using the Alexa Community Adapter ("feature/adopt-alexadotnet" branch).

The problem here is the voice recognition. Alexa needs proper intents to understand what the user is saying, e.g. station names. But then I would duplicate my LUIS functionality. Let's say I decide to duplicate my LUIS logic for Alexa (with a tool to convert LUIS intents to Alexa intents). With that, I would need to remove the "phrase" logic from the Alexa community adapter and tell my application not to use the core bot's LUIS functionality.

Right now, I am trying to mirror the LUIS intents as Alexa intents only so that Alexa understands the words/names the user is saying, not to get the intent, because my core bot should use the "phrase" slot, pass it to LUIS and then return the intents. I thought this would be easier and readily possible, but if I use intents (e.g. stations) in Alexa, the "phrase" is empty.

Should I try to remove the "phrase" logic from the Alexa community adapter and then tell my application that the intent request is handled by Alexa and not by LUIS?

I am not sure if this is the best way to do that. Any help appreciated - Thanks!

-C0d1ngJammer

NickEricson commented 4 years ago

@C0d1ngJammer - I want to be sure I understand.

When the users of your skill say a phrase in/to Alexa it does a Speech-to-Text conversion and sends it to your bot. In your bot you use Luis to do intent recognition (same as done for all the other channels you support). However, without adding intents into the Alexa portal that Speech-to-Text conversion is not working. The incorrect text is being forwarded to your bot.

Do you have a specific example of this incorrect conversion and the additions to the Alexa skill that make it able to properly convert the speech to text?

C0d1ngJammer commented 4 years ago

@NickEricson That is exactly what I am trying to achieve. The example of this incorrect conversion is the last JSON example (from the Alexa skill) in my question.

If the user says e.g. "I want from Widdumhof to Miedelsbach-Steinenberg", Alexa doesn't recognize "Widdumhof" and "Miedelsbach-Steinenberg" as station names. Most of the time, when the user asks the same question again, Alexa understands it differently.

I need a way to add the station names as a dictionary for Alexa to look up...

Thanks a lot. -C0d1ngJammer

garypretty commented 4 years ago

@C0d1ngJammer can you provide a couple of examples of what Alexa transforms the text into when you say it doesn't recognize the station names please? I.e. The activity.Text that is received by your bot.

garypretty commented 4 years ago

Hi @C0d1ngJammer - just following up again to see if you can provide any examples for us as per my message above?

C0d1ngJammer commented 4 years ago

Hello @garypretty, sorry, I was on vacation :).

Here are the examples as requested. The invocation name is "github test" and the skill uses the JSON above.

1. The user wants to go from station "oberhof" to station "wallgraben".
   He asks Alexa (spoken in German): frage github test verbindung von oberhof nach wallgraben (ask github test connection from oberhof to wallgraben)
   Alexa understands (the phrase that is then sent to the Microsoft bot): frage github test verbindung von oberhoch nach ball graben (ask github test connection from oberhoch to ball graben)

2. The user wants to go from station "filderbahnstraße" to station "wallgraben".
   He asks Alexa (spoken in German): frage github test verbindung von filderbahnstraße nach wallgraben (ask github test connection from filderbahnstraße to wallgraben)
   Alexa understands (the phrase that is then sent to the Microsoft bot): frage github test verbindung von tille bahn straße nach ball graben (ask github test connection from tille bahn straße to ball graben)

As you can see, in this case I can't do anything with the phrase or the station names. When I set up proper intents in Alexa (as I have in LUIS), e.g. connection from {station} to {station}, then the phrase is empty (I cannot get the spoken phrase, only the intent).

-C0d1ngJammer

garypretty commented 4 years ago

Hi @C0d1ngJammer - thanks for the examples. As I suspected, this is a limitation on the use of a single slot to grab all of the text spoken by the user. I did some tests and one thing you could do is to use custom slots for Alexa, as you stated, and add some additional code to your adapter to transform them into message activities, so that you don't need to change your core bot code.

For example, within your AlexaAdapterWithErrorHandler.cs class, you can override the RequestToActivity method as shown below. When you use custom slots, the adapter will transform the incoming request into an Event activity. The code below will allow the adapter to do that transform, but will then check the original request to see if it is a known custom intent that you have created. If it is, here you can set the activity to be a Message and define the text you want to send to your bot, using the slot values identified by Alexa. Obviously, you would need to alter the code for your needs.

        public override Activity RequestToActivity(SkillRequest request)
        {
            var activity = base.RequestToActivity(request);

            if(activity.Type == ActivityTypes.Event && request.Request is IntentRequest intentRequest)
            {
                switch (intentRequest.Intent.Name)
                {
                    case "UserJourney":
                        activity.Type = ActivityTypes.Message;
                        activity.Text = $"frage github test verbindung von {intentRequest.Intent.Slots?["station"]?.Value}";
                        break;
                    default:
                        break;
                }
            }

            return activity;
        }
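
A journey needs a start and an end station, so the custom intent would likely carry two slots rather than the single `station` slot read above. Below is a minimal sketch of building the message text from both. The slot names `stationStart` and `stationEnd` and the helper `JourneyText` are hypothetical, and a plain dictionary stands in for `intentRequest.Intent.Slots` so the snippet compiles without the adapter packages.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

public static class JourneyText
{
    // Hypothetical slot names; use whatever your Alexa intent actually defines.
    private const string StartSlot = "stationStart";
    private const string EndSlot = "stationEnd";

    // Returns the text to assign to activity.Text,
    // or null when either station was not recognized by Alexa.
    public static string? FromSlots(IReadOnlyDictionary<string, string?> slotValues)
    {
        if (!slotValues.TryGetValue(StartSlot, out var start) || string.IsNullOrWhiteSpace(start))
        {
            return null;
        }

        if (!slotValues.TryGetValue(EndSlot, out var end) || string.IsNullOrWhiteSpace(end))
        {
            return null;
        }

        // Same wording the user would have spoken, so the bot's
        // language understanding sees a normal utterance.
        return $"verbindung von {start} nach {end}";
    }
}
```

Inside the override, a null result could be treated as "leave the activity as an Event", so the bot only receives a Message when both stations were actually filled.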

Does that make sense for you?

C0d1ngJammer commented 4 years ago

@garypretty Thank you. Now that I know for certain that there is no "workaround" for this problem (caused by the structure of Amazon Alexa and my usage), I'll go as you suggested. I am going to mirror my LUIS intents to Amazon Alexa intents, then I'll tell my core bot not to use the LUIS intent when a request is received from Alexa. I am converting all requests (Alexa or not) with a custom converter class which outputs the intents. This should work and I have some ideas.

The "RequestToActivity" code helped me a lot!

Again, thank you for your patience -C0d1ngJammer

garypretty commented 4 years ago

@C0d1ngJammer thanks. I am pleased we managed to help. Please let us know how you get on with this - I am keen to know.

C0d1ngJammer commented 4 years ago

Hello @garypretty

Here is my current (working prototype) solution: First I created a tool which clones my LUIS intents to Amazon Alexa so they are the "same". (By the way, I'll have to pay luis.ai to be able to add more than 50 stations.)

In your example you are overriding the method RequestToActivity in AlexaAdapter. The problem with that is that I would have to manage my intents twice in code, and each request would also be processed twice.

The solution I came up with works nicely and I am happy with it. It looks as follows: I modified the method RequestToActivity in the class AlexaRequestMapper.

Before the change: When using intents, "phrase" was not contained as an intent slot and the skillRequest was turned into an event activity (RequestToEventActivity), which is then not processed by the bot.
After the change: It can now handle phrase and non-phrase requests. (Maybe another Alexa option such as "useBuildinIntent" would be much better.)

If we receive an Alexa intent, it is passed to RequestToMessageAlexaActivity where I am basically...

public Activity RequestToActivity(SkillRequest skillRequest)
{
    if (skillRequest.Request == null)
    {
        throw new ValidationException("Bad Request. Skill request missing Request property.");
    }

    switch (skillRequest.Request)
    {
        case IntentRequest intentRequest:
            if (intentRequest.Intent.Slots != null && intentRequest.Intent.Slots.ContainsKey(_options.DefaultIntentSlotName))
            {
                return RequestToMessageActivity(skillRequest, intentRequest);
            }
            else
            {
                if (intentRequest.Intent.Name == "AMAZON.StopIntent")
                {
                    return RequestToEndOfConversationActivity(skillRequest);
                }

                //Handle Alexa intents without having to use "phrase". Returns an AlexaActivity, which inherits from Activity
                //Might need an option: "use alexa Intents"
                return RequestToMessageAlexaActivity(skillRequest, intentRequest);
                //old code: produces an event activity, which does not get processed by the bot
                //return RequestToEventActivity(skillRequest);
            }
        case LaunchRequest launchRequest:
            return RequestToConversationUpdateActivity(skillRequest);
        case SessionEndedRequest sessionEndedRequest:
            return RequestToEndOfConversationActivity(skillRequest);
        default:
            return RequestToEventActivity(skillRequest);
    }
}

...returning an AlexaActivity (a custom class with an "IntentRequest" property). I am copying the activity to my new AlexaActivity class, which inherits from Activity. The activity.Text is only a placeholder because we can't receive the spoken text in this case.

private AlexaActivity RequestToMessageAlexaActivity(SkillRequest skillRequest, IntentRequest intentRequest)
{
    var activity = Activity.CreateMessageActivity() as Activity;

    activity = SetGeneralActivityProperties(activity, skillRequest);
    activity.Text = "<Intent is getting passed>";
    activity.Locale = intentRequest.Locale;
    //Copying properties from activity to alexaActivity so we can pass the intentRequest. Better method?
    return CopyPropertiesTo(activity, new AlexaActivity(intentRequest));
}

The following method is called everywhere I want to handle the intents. With this method I am casting the incoming activity to AlexaActivity when the activity is coming from Alexa. Then I use the AlexaActivity and convert it to a LuisResponse (my own class), which I can then use to process my intent core logic.

When the activity is not from Alexa, I am grabbing the spoken text and sending it to luis.ai, receiving the luisResponse, which I can then use to process my intent core logic.

        protected static async Task<IntentProgressResponse> ProcessIntent(WaterfallStepContext stepContext)
        {
            //If the request is coming from Alexa, Activity.Text is empty but we have our custom intentActivity
            if (stepContext.Context.Activity.ChannelId == "alexa")
            {
                var intentActivity = (AlexaActivity)stepContext.Context.Activity;
                //Slots can be null (the mapper above checks for that too), and a slot's Value is null when the user did not fill it
                var entities = intentActivity.IntentRequest.Intent.Slots
                    ?.Where(a => a.Value?.Value != null)
                    .Select(i => new EntityDto()
                    {
                        Entity = i.Value.Value,
                        Type = i.Value.Name
                    }).ToList() ?? new List<EntityDto>();

                var intent = new IntentDto()
                {
                    Intent = intentActivity.IntentRequest.Intent.Name,
                    //not ready yet... I am faking the score here too
                    Score = 1
                };

                var luisResponse = new LuisResponseDto()
                {
                    Entities = entities,
                    Intents = new List<IntentDto>() { intent },
                    TopScoringIntent = intent,
                    //we don't get anything - we are not using the phrase
                    Query = "<None because we are using alexa intents>"
                };
                //This just uses the fake luisResponse, processes it and returns the affected intents (it is not processed by luis.ai itself!)
                return _luisHandler.GetProgress(luisResponse);
            }
            else
            {
                //Passing the activity text, which is then processed by luis.ai. Afterwards it uses the luisResponse, processes it and returns the affected intents
                return await _luisHandler.GetProgress(stepContext.Context.Activity.Text.ToLowerInvariant());
            }
        }
    }

And it is working very well. As I said, this is a prototype and I think it should be handled by the community.Alexa.core logic by default (e.g. with an option "useBuildinIntent"). It definitely needs optimization, but I hope it shows you what I did.

I can provide more information, if wanted.

-C0d1ngJammer

NickEricson commented 4 years ago

Hi @C0d1ngJammer

It is great to hear you got things working. I'm curious what the values of the Entity Values and Names look like for you. For example if someone says, "frage github test verbindung von oberhof nach wallgraben" what do you get in the IntentRequest?

There may be a way we could map these into the Entities property of the base Activity or maybe into a Semantic Action.

Entities: https://github.com/microsoft/botframework-sdk/blob/master/specs/botframework-activity/botframework-activity.md#entities
Semantic Action: https://github.com/microsoft/botframework-sdk/blob/master/specs/botframework-activity/botframework-activity.md#semantic-action

garypretty commented 4 years ago

I would also be super curious to know what your conversion tool looks like and if this might be something we could share with the community? Sounds super useful.

C0d1ngJammer commented 4 years ago

Hello, sorry for my late response, but I am just doing this for fun in my leftover spare time. @NickEricson the original Alexa IntentRequest as JSON looks as follows:

"request":{
      "type":"IntentRequest",
      "requestId":"amzn1.echo-api.request.bd5ffbb4-567e-4587-96b4-ab57617cd3f5",
      "locale":"de-DE",
      "timestamp":"2020-06-24T19:48:44Z",
      "intent":{
         "name":"TravelInformation",
         "confirmationStatus":"NONE",
         "slots":{
            "stationEnd":{
               "name":"stationEnd",
               "value":"wallgraben",
               "resolutions":{
                  "resolutionsPerAuthority":[
                     {
                        "authority":"amzn1.er-authority.echo-sdk.amzn1.ask.skill.5c02e761-882a-4924-9af5-c42d99c08c62.station",
                        "status":{
                           "code":"ER_SUCCESS_MATCH"
                        },
                        "values":[
                           {
                              "value":{
                                 "name":"Wallgraben",
                                 "id":"de:08111:355"
                              }
                           }
                        ]
                     }
                  ]
               },
               "confirmationStatus":"NONE",
               "source":"USER",
               "_tokens":[

               ]
            },
            "stationStart":{
               "name":"stationStart",
               "value":"oberhof",
               "resolutions":{
                  "resolutionsPerAuthority":[
                     {
                        "authority":"amzn1.er-authority.echo-sdk.amzn1.ask.skill.5c02e761-886a-4924-9af5-c49d99c08c67.station",
                        "status":{
                           "code":"ER_SUCCESS_MATCH"
                        },
                        "values":[
                           {
                              "value":{
                                 "name":"Oberhof",
                                 "id":"de:08116:4070"
                              }
                           }
                        ]
                     }
                  ]
               },
               "confirmationStatus":"NONE",
               "source":"USER",
               "_tokens":[

               ]
            },
            "vehicle":{
               "name":"vehicle",
               "confirmationStatus":"NONE",
               "_tokens":[

               ]
            }
         }
      }
   }
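
To make the shape of that payload easier to work with, here is a minimal sketch that pulls the spoken value, the canonical name and the id out of each filled slot, using only System.Text.Json (so it assumes .NET 5 or later). The names `SlotReader` and `ResolvedSlot` are hypothetical and not part of the adapter, and the input is assumed to be the inner `request` object shown above.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text.Json;

// One filled slot: what the user said plus the catalog entry Alexa resolved it to (if any).
public sealed record ResolvedSlot(string Name, string Spoken, string? Canonical, string? Id);

public static class SlotReader
{
    // Safe property lookup that also tolerates non-object elements.
    private static bool TryGet(JsonElement element, string name, out JsonElement value)
    {
        value = default;
        return element.ValueKind == JsonValueKind.Object && element.TryGetProperty(name, out value);
    }

    // Takes the JSON of the "request" object and returns every slot that carries a value.
    public static List<ResolvedSlot> Read(string requestJson)
    {
        var result = new List<ResolvedSlot>();
        using var doc = JsonDocument.Parse(requestJson);

        if (!TryGet(doc.RootElement, "intent", out var intent) ||
            !TryGet(intent, "slots", out var slots) ||
            slots.ValueKind != JsonValueKind.Object)
        {
            return result;
        }

        foreach (var slot in slots.EnumerateObject())
        {
            // Unfilled slots (like "vehicle" above) have no "value" property.
            if (!TryGet(slot.Value, "value", out var spoken))
            {
                continue;
            }

            string? canonical = null;
            string? id = null;

            if (TryGet(slot.Value, "resolutions", out var resolutions) &&
                TryGet(resolutions, "resolutionsPerAuthority", out var authorities) &&
                authorities.ValueKind == JsonValueKind.Array)
            {
                foreach (var authority in authorities.EnumerateArray())
                {
                    // Only trust authorities that report a successful match.
                    if (!TryGet(authority, "status", out var status) ||
                        !TryGet(status, "code", out var code) ||
                        code.GetString() != "ER_SUCCESS_MATCH")
                    {
                        continue;
                    }

                    if (TryGet(authority, "values", out var values) &&
                        values.ValueKind == JsonValueKind.Array &&
                        values.GetArrayLength() > 0 &&
                        TryGet(values[0], "value", out var match))
                    {
                        canonical = TryGet(match, "name", out var n) ? n.GetString() : null;
                        id = TryGet(match, "id", out var i) ? i.GetString() : null;
                        break;
                    }
                }
            }

            result.Add(new ResolvedSlot(slot.Name, spoken.GetString() ?? "", canonical, id));
        }

        return result;
    }
}
```

For the request above this would yield two entries, `stationEnd` and `stationStart`, each with a canonical name and id, while `vehicle` is skipped. Whether the canonical name or the id is the better value to hand to the core bot depends on what the rest of the bot expects.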

@garypretty the tool currently only uses the LUIS API to connect and gather/update some data: https://westus.dev.cognitive.microsoft.com/docs/services/5890b47c39e2bb17b84a55ff/operations/5890b47c39e2bb052c5b9c26 Right now I am using the data output by the luis.ai API to create a JSON file which I then manually upload to Alexa. Amazon does not seem to have any API for Alexa itself (I didn't find anything). The "mapping" is done in my code and I wouldn't really describe it as mapping. I paused development of the tool because I soon have to migrate to Azure (1.10.2020 deadline from Microsoft). And I don't have a credit card to test the API with Azure... so I am stuck there until I find a solution or they implement PayPal :(.
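
The tool itself is not shown in this thread, so the following is only a rough sketch of the last step it describes: turning a list of station names and sample utterances into an Alexa interaction model JSON of the same shape as the models at the top of this issue. `AlexaModelBuilder` and its parameters are hypothetical, and the input is assumed to have already been fetched from LUIS by other code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

public static class AlexaModelBuilder
{
    // Builds an interaction model with one custom intent whose slots all share one custom type.
    public static string Build(
        string invocationName,
        string intentName,
        IEnumerable<string> samples,      // e.g. "verbindung von {stationStart} nach {stationEnd}"
        IEnumerable<string> slotNames,    // e.g. "stationStart", "stationEnd"
        string typeName,                  // e.g. "Station"
        IEnumerable<string> typeValues)   // the station names
    {
        var model = new
        {
            interactionModel = new
            {
                languageModel = new
                {
                    invocationName,
                    intents = new object[]
                    {
                        new
                        {
                            name = intentName,
                            slots = slotNames.Select(s => new { name = s, type = typeName }).ToArray(),
                            samples = samples.ToArray()
                        },
                        new { name = "AMAZON.StopIntent", samples = Array.Empty<string>() }
                    },
                    types = new[]
                    {
                        new
                        {
                            name = typeName,
                            values = typeValues.Select(v => new { name = new { value = v } }).ToArray()
                        }
                    }
                }
            }
        };

        // Non-ASCII characters such as "ß" are escaped by the default encoder; the result is still valid JSON.
        return JsonSerializer.Serialize(model, new JsonSerializerOptions { WriteIndented = true });
    }
}
```

Each slot in a sample utterance gets its own name here (`stationStart`, `stationEnd`), matching the slot names that appear in the IntentRequest above.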

Currently I am experimenting with a generic architecture for my bot framework... it's still early development.

-C0d1ngJammer

C0d1ngJammer commented 4 years ago

Little update: I now have access to Azure (using a credit card). And there seems to be an API for Alexa: https://developer.amazon.com/de-DE/docs/alexa/smapi/reference-based-catalog-management.html. The application is close to being a prototype :)... but I still have to merge my code with the latest Alexa commits and find a bug with the ChoicePrompt...