microsoft / BotFramework-Composer

Dialog creation and management for Microsoft Bot Framework Applications
https://docs.microsoft.com/en-us/composer/
MIT License
869 stars 372 forks source link

Removing local skills and then adding remote skills breaks Enterprise Assistant Template intents #8836

Open johnkm516 opened 2 years ago

johnkm516 commented 2 years ago

I am using the Enterprise Assistant template which includes the core bot, and the Calendar, People skills.

I first created and tested the bot successfully locally. I published all the skills and the core bot to Azure, and everything worked fine locally. However when testing externally the core bot couldn't connect to the skills properly because the skills were not configured to connect to the remote skills that I published to Azure.

Following this tutorial : https://docs.microsoft.com/en-us/composer/how-to-connect-to-a-skill?tabs=v2x

I deleted the local skills. The moment I did that I got errors in Composer on skills not being found (which I resolved by connecting remote skills), and warnings saying :

" line 55:0 - line 55:57 Orchestrator does not support ML entities. "

Despite adding remote skills to the root bot project and modifying the triggers to connect to the remote skills, Composer fails to detect any of the intent definitions. Trying to build the bot locally or publish results in an immediate error :

"[ERROR] line 105:0 - line 151:34: Features can only be added to intents that have a definition. Invalid feature definition found for intent GetDepartment"

I have not touched intents at all, all I did was follow the instructions. What am I doing wrong here?

lauren-mills commented 2 years ago

Hi @johnkm516, after you tested your Enterprise Assistant template bot locally, did you follow these steps from the Enterprise Assistant tutorial to create your skill manifests? https://docs.microsoft.com/en-us/composer/templates/enterprise-assistant-tutorial#next-steps

I'm not sure what's happening with the Orchestrator errors, but I'm trying to repro.

lauren-mills commented 2 years ago

After you connected to your remote skills, did you delete the duplicate onIntent trigger that was created?

johnkm516 commented 2 years ago

@lauren-mills That was indeed the issue! Didn't notice that composer added new triggers when adding the remote skills. Now I've successfully tested both my skills on Azure through the "Test through web chat" feature, and both skills work correctly (logging in with authentication with my AAD and all).

However when I test through emulator the root bot I don't get any replies from the remote skill, and looking through the logs in emulator it seems the bot is unable to connect to LUIS. I am launching emulator directly from the Composer, and running ngrok in cmd. Ngrok successfully receives messages.

The series of events : image

The LUIS trace : image

The skills work fine when tested on Azure Portal. How can I get the root bot to connect to the skills properly?

lauren-mills commented 2 years ago

When you test a remote skill with a local bot, you need to configure the Skill Host Endpoint in your bot settings to use an ngrok address that redirects to your localhost port where your root bot is running.

The host endpoint is where the hosted skill will send its response messages, so that is the recommended way of having it pass callback to a local root bot.

johnkm516 commented 2 years ago

@lauren-mills It is configured correctly. I am watching the cmd terminal with ngrok running and see 200 OK status of all the POST requests. My config for the root bot is also configured correctly, I even added any missing fields such as endpoint key, endpoint, etc. LUIS is obviously working correctly because the onintent for People or Calendar skill works fine, but the moment when I ask a question like, "Who is John?" through the root bot, the skills return unknown intent with the LUIS trace showing "Please add your LUIS service to enable reassigning".

The skills also work fine individually, when I go to azure portal and test in web chat the People skill or Calendar skill it works fine, so LUIS is correctly configured on the individual skills as well.

lauren-mills commented 2 years ago

Can you send a screenshot of your "Connect to a skill" action that is calling the people skill?

johnkm516 commented 2 years ago

@lauren-mills I have deleted the entire resource group, the entire project, and started from scratch, in order to test and check if the root bot works properly when everything is running locally. My original post states "I first created and tested the bot successfully locally." but my memory is wrong. I tested the skills successfully and didn't test the root bot locally before publishing (and I basically forgot about it when the bot wouldn't work when published on Azure).

I followed all the steps in the tutorial except one difference : I manually updated LUIS and QnA of the skills to the root bot's LUIS and QnA resource instead of using the LUIS and QnA resources that composer created for both skills. This has to be source of the issue, otherwise I followed all the other steps in the tutorial properly. Also, when provisioning resources, I went straight to the "Publish" tab after downloading a fresh enterprise assistant template, and provisioned the resources by adding a new publishing profile -> "Create new resources".

Here are the exact steps I took :

1) Download new Enterprise Assistant Template with a new project.

2) Go to "Publish" Tab.

3) Create a new publishing profile for the root bot, and choose "Create new resources" to provision all the resources required for the bot.

4) Repeat step 3) for the People skill and the Calendar skill.

5) Steps 3) and 4) provisions new QnA and LUIS resources specific to each skill. I want all the skills and root bot to share the same LUIS and QnA resource. So I update the QnA and LUIS resources of the skills to be the same as the root bot's LUIS and QnA resource in the next steps.

6) For the publishing profile of both the skills, I copy and pasted the "qna": {} and "luis": { } blocks from the root bot's json in "import resources" of the root bot's publishing profile. I also updated the "luisResource": field in each of the skills' publishing configs to the root bot's LUIS resource name.

7) For the configuration of both skills, I went into advanced settings view (json) and copy and pasted the qna and luis blocks there as well from the root bot. After copy and pasting, I changed the "name": field to "Calendar" and "People" respectively.

8) I then deleted unneeded QnA and LUIS resources.

9) I follow the tutorials for each skill to set authentication and Microsoft Graph permissions.

After making these changes I run all three bots. This is before publishing to Azure, running everything locally, without changing any of the "Connect to a skill" actions. When I test the Calendar bot and People bot individually on emulator, everything works fine. When I run the root bot and try to connect to the skills, both skills show the same "Please add your LUIS service to enable reassigning" response and consequently show unknown intent because the skills aren't properly receiving the LUIS response.

I don't understand what I'm doing wrong here. The reason why I wanted to use the same LUIS and QnA resources across the skills and root bot is so I don't have to switch resources every time I want to look at training LUIS / QnA for different skills / bots, and because I see no practical reason to separate the resources. But doing this causes this weird issue where the two individual skills work fine when running without the root bot, and the root bot also works fine when recognizing LUIS intents for the root bot, but the skills all return unknown intent because calling LUIS for the skills through the root bot returns the "Please add your LUIS service to enable reassigning" response.

I hope you can reproduce my issue using the steps outlined above.

EDIT :

Correction : After testing locally, I found that it's the skills that work fine with LUIS, but that the root bot's LUIS isn't working properly. When I prompt something like "Who is John" on the root bot, it's not LUIS that recognizes the intent and dispatches to the People or Calendar skill, it's the Orchestrator snapshot. Any LUIS requests on the root bot results in "Please add your LUIS service to enable reassigning" response. This is puzzling considering the fact that I literally changed nothing on the configuration of the root bot, on the contrary I copy and pasted from the root bot's config in the steps I outlined above.

All LUIS fields in the publishing profile of the root bot are identical to the skills, same as the root bot's config other than the name. Again, I didn't modify the root bot's config / publishing profile at all, I instead copied from the root bot's config to the skills.

EDIT 2 :

Ok I fixed the issue, at least running the root bot locally. The problem was that I never published the root bot (I didn't think that was required as I was testing everything locally). But it seems the skills and the root bot all has to be published at least once in order for composer to assign to the proper LUIS App ID's internally (which is weird since simply building the bots uploads new LUIS apps to the LUIS portal). Either this is a bug, or I think the docs should reflect that here : https://docs.microsoft.com/en-us/composer/templates/enterprise-assistant-tutorial It specifically says here that before you publish to Azure, to "run and test your bot locally". Now that the bot + skills are working fine locally, I am going to attempt to create skill manifests and attach the root bot to remote skills.

johnkm516 commented 2 years ago

@lauren-mills I have found the error / bug.

The problem is this : If you connect to a remote skill, Composer creates an orchestrator snapshot based on the root bot's .lu and .qna source files alone. This presents a critical issue where when you build the bot (whether it's for running locally or when publishing the bot to Azure), composer regenerates the orchestrator snapshot (.blu) file in the /generated folder based on the root bot's .lu and .qna files, which means the .lu file and .qna files of skills are ignored. I would have never found this difference if I hadn't, after recreating resource groups and redownloading this template like a dozen times, decided to zip up the project files as backup after successfully testing the root bot + skills locally.

The .blu file (orchestrator snapshot) when building the bot locally : Size : 245kb Contents : HansaeAssistant.en-us.txt

The .blu file (orchestrator snapshot) when building the bot while connected to remote skills : Size : 8kb Contents :
HansaeAssistant.en-us.txt

As you can see, the generated snapshot of the locally tested bot without connecting to remote skills has the correct snapshot including all the utterances in the Root bot, Calendar skill, and People skill. The generated snapshot when connecting the root bot to remote skills only has utterances of the Root bot, which isn't very many.

Therefore even without building / running the root bot locally before publishing to Azure, when composer builds the bot whether it's for publishing or for running locally, it regenerates a wrong snapshot and uploads it to Azure, ensuring the root bot will never actually route to the skills properly. This means internally, orchestrator in the root bot also doesn't route to the proper LUIS service properly, which is resulting in the "Please add your LUIS service to enable reassigning" response despite all the configuration being set up properly : image

I first ran the root bot that was connected to remote skills, which resulted in the above LUIS error and a series of unknown intents -basically any utterance that wasn't in the root bot was being routed to QnA (which seems to be the default case). Then, I copy and pasted the 245kb orchestrator snapshot from my backup zipped project into the /generated folder while the bot was still running, and voila! : image

Keep in mind the orchestrator snapshot has to be replaced right after building and running the bot, because after starting a conversation the orchestrator snapshot seems to be held in memory and not reloaded every time the user inputs (obviously).

I can't believe the amount of time I wasted recreating resources and downloading this template over and over like a dozen times only for it to not have been my fault to begin with. The design decision to separate remote skills with local skills as if they are separate entities is questionable. Remote skills and local skills are essentially the same thing, except one is a published version (production) and the other a local version (dev build) of the same project. What happens if you want to modify the skills? The docs guide the user into deleting the local skill and adding the remote one, but this part I think is also wrong if the user would ever want to customize the skills later. I ended up not deleting local skills, so that the "Connect To A Skill" option shows both local and remote endpoints : image

and then deleting duplicate skill keys generated by composer in the advanced view of the root bot config :

"skill": {
   // Having both the local skills and remote skills results in composer creating duplicate Calendar and People entries here
    "Calendar": {
      "endpointUrl": "<local skill messaging endpoint>",
      "msAppId": "<skill MS App ID>"
    },
    "People": {
      "endpointUrl": "<local skill messaging endpoint>",
      "msAppId": "<skill MS App ID>"
    }
  }

Regardless of my opinion above, because Composer treats remote skills and local skills as separate entities (and doesn't download .lu and .qna source files from remote skills), when it builds the root bot it regenerates Orchestrator snapshot without any of the skills in it resulting the bot to break. I can fix this issue by just overwriting the snapshot with my backup snapshot that I built with local skills on the Azure server via Kudu (which I did and now the root bot works beautifully on Azure), but this seems like a high priority issue.

johnkm516 commented 2 years ago

@lauren-mills Sorry for the continuous posts. I'm trying to find out if there's a "correct" way to stop the template from breaking when adding remote skills to the root bot.

After digging through the project files, I uploaded the project that worked successfully locally (before adding any remote skills) onto a private git repository, and then began tracking changes as I started adding remote skills in Composer.

Composer automatically generates and changes the root bot's .lu file in /language-understanding/en-us/ as I'm making changes, not even when I build the project.

The moment I add the remote skill, because the name of the skill is the same on the manifest, composer automatically deletes the utterances in this file under intents with the same name. So if the original .lu file in the root bot looked something like this :

# Feedback
- i want to give feedback
- i have a problem
- something is  broken
- this is a bad experience
- fill out survey

# Calendar
- cancel my 3 o'clock
- cancel my 5:00
- cancel my event with chris and hyunjin with a subject of breakfast

After adding the remote skill # Calendar itself is deleted until only intents defined in the root bot is left. Perhaps this is by design, because when you try to add the remote skill Composer does prompt the user to "Add or edit phrases to trigger Calendar skill".

Since Composer automatically overwrites the .lu in the root bot file for intents with the same name, as a workaround I decided to edit the skill name in the manifest to "Calendar (Prod)" and People (Prod)" respectively. This way Composer doesn't overwrite the intents in the root bot's .lu file, and instead appends the file with new intents, "Calendar (Prod)" and People (Prod)". After adding the renamed remote skills, I delete the appended data, and then modify the "Connect To A Skill" action to connect to "Calendar" and "People" and switch the skill endpoint to the Manifest Endpoint :

image

This way, as long as I don't try to add the remote skill again the root bot's .lu file isn't modified, and I shouldn't have to add the remote skill again after doing it once.

But then this means if I modify utterances in either of these skills I have to manually update the root bot's .lu file. I could just let Composer overwrite the original intent utterances in the root bot's .lu file when importing the remote skill (I would have to modify the utterances to remote lines showing ML entities), but now there's a new problem : Composer doesn't recognize utterances for the Calendar skill due to the way the Calendar skill is structured.

The People skill contains a single .lu file in language-understanding/en-us .

The Calendar skill contains a set of .lu files separated into folders, with references to these files in the main .lu file :

image

 > General Intent Imports
[Cancel](cancel.en-us.lu)
[Help](help.en-us.lu)
[Logout](logout.en-us.lu)

> Calendar Intent Imports
[CancelEvent](cancelEvent.en-us.lu)
[CreateEvent](createEvent.en-us.lu)
[GetAvailabilityBreaks](getAvailabilityBreaks.en-us.lu)
[GetAvailabilityFirst](getAvailabilityFirst.en-us.lu)
[GetAvailabilityLast](getAvailabilityLast.en-us.lu)
[GetEventAttendees](getEventAttendees.en-us.lu)
[GetEventDateTime](getEventDateTime.en-us.lu)
[GetEventLocation](getEventLocation.en-us.lu)
[GetEvents](getEvents.en-us.lu)
[RespondAccept](respondAccept.en-us.lu)
[RespondDecline](respondDecline.en-us.lu)
[RespondTentativelyAccept](respondTentativelyAccept.en-us.lu)
[SetAttendeesAdd](setAttendeesAdd.en-us.lu)
[SetAttendeesRemove](setAttendeesRemove.en-us.lu)
[SetDateTime](setDateTime.en-us.lu)
[SetDescription](setDescription.en-us.lu)
[SetDuration](setDuration.en-us.lu)
[SetLocation](setLocation.en-us.lu)
[SetOnlineMeetingAdd](setOnlineMeetingAdd.en-us.lu)
[SetOnlineMeetingRemove](setOnlineMeetingRemove.en-us.lu)
[SetTitle](setTitle.en-us.lu)
[UpdateEvent](updateEvent.en-us.lu)

# None
- 1 hour
- 3pm to 4pm
- first
- john and sarah
- last
- monday
- project review
- rainbows and butterflies
- standup
- team standup
- today
- tomorrow
- tuesday at 4pm
- tuesday to wednesday
- tyler, tonya, and abby
- yes
- no

Composer doesn't take into account these references when importing entities so this must be done manually. None of these details are mentioned in the Enterprise Template tutorials, and I don't think it's intended design for Composer to be overwriting utterances in the .lu file in the root bot when doing so breaks the root bot's connection to skills.

Suggestions :

1) Composer shouldn't automatically overwrite .lu files when linking skills with Orchestrator. If the intended design was to have Composer automatically generate utterances from linked skills it needs to be doing that properly before Composer starts making automatic changes.

2) Importing skills results in "Orchestrator does not support ML entities.". Composer should automatically remove ML entity definitions and make other automatic changes to utterances before overwriting the root bot's .lu file, because the bot won't build due to these errors.

3) Composer should also detect sets of .lu files (like how Calendar skill's utterances are structured) properly when importing skills. Currently Composer shows zero utterances when importing the Calendar skill.

lauren-mills commented 2 years ago

I do think this is a bug or rather a mis-designed aspect of remote skills that it doesn't properly handle the skill's LU. I will review your messages and see if I can come up with any consistent work arounds for this, or more details so we can get them added to the backlog.

One note on the Enterprise Templates as I had a hand in building those, they were not built to rely on remote skills, but local skills. When you use the Calendar and People skills included in the template out of the box, you should not run into these particular issues.

carlosscastro commented 2 years ago

@lauren-mills any update on this issue?

lauren-mills commented 2 years ago

This appears to be more of a design issue with how Composer tries to automatically "do the right thing" when adding remote skills with Orchestrator, but actually ends up breaking John's scenario. This should be added to the backlog as a scenario that should be reviewed and revisited.

As for the Enterprise Template specifically, my advice for moving forward is to not add Calendar and People as remote skills, but to keep them as skills in the same bot project as they were originally designed as that does not introduce this issue. Alternatively, the intents that are added to Orchestrator are controlled by the remote skill manifest. If you do not want those intents added, excluding them from the manifest should prevent that behavior.