microsoft / BotBuilder-Samples

Welcome to the Bot Framework samples repository. Here you will find task-focused samples in C#, JavaScript/TypeScript, and Python to help you get started with the Bot Framework SDK!
https://github.com/Microsoft/botframework
MIT License
4.35k stars 4.87k forks

[New Sample] Request for a speech-focused sample #1981

Open stevengum opened 4 years ago

stevengum commented 4 years ago

Is your feature request related to a problem? Please describe.

We currently enable speech out of the box for the C# Echo Bot and Core Bot samples and generators. These samples are our "getting-started" samples and don't delve into the nuances of the protocol with speech.

Now that Direct Line Speech is GA, we should have a speech-focused sample.

Describe the solution you'd like

A sample focused on speech (which may be through a headless device) should be created.

Features of the sample:


FYI @ryanlengel, @darrenj, @lauren-mills, @gabog who have experience in designing headless device solutions. Are there any "gotchas" that should be discussed in a speech-focused sample?

[enhancement]

gabog commented 4 years ago

Hi @stevengum, here are some notes from my end:

On the understanding side.

This is all I can think of so far. Will update this post if I can think of anything else.

darrenj commented 4 years ago

Gabo has most things covered.

# NewUserIntroCard
[Activity
    Text = Some text
    Speak = Speech friendly response
]

stevengum commented 4 years ago
  • Sample will need to have the steps required to enable WebSockets (for Direct Line Speech) on the App Service. We do this automatically as part of VA and the ARM template.

The C# Core Bot and Echo Bot ARM templates were updated to support WebSockets, and Startup.cs was updated to include the necessary app.UseWebSockets(); call. So we should be set with regard to enabling WebSocket usage from the bot and on the App Service; we just need to mirror this work in the speech-focused sample.

There is work to be done on the Resource Provider to enable creation of the DLS channel via ARM templates and the Azure CLI, which I believe @DDEfromOR is working on.

  • There is a test harness for speech, you can see this and some other instructions here

We do need to update the Core and Echo bot READMEs to mention the test DLS client and the Speech SDKs.

  • The speech track (Speak property) doesn't need to match the Text property, and in some cases it will be very different based on the channel. A channel with a screen could say "here are your appointments for today"; a channel without one would probably read the appointments out loud.

For DLS, the current behavior is that the Speak property needs to be set; the channel does not use the Text property from the Activity for speech generation.
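As a rough illustration of the point above, here is a minimal sketch that always populates the speech track alongside the display text. It uses plain dictionaries rather than the Bot Framework SDK; the keys follow the Activity schema's JSON field names, while `build_message` and its fallback behavior are hypothetical helpers, not part of any sample.

```python
from typing import Optional


def build_message(text: str, speak: Optional[str] = None) -> dict:
    """Return a message activity (as a plain dict) with both text and speak set.

    Hypothetical helper: a real bot would build an Activity through the SDK.
    """
    return {
        "type": "message",
        "text": text,
        # Assumption: fall back to the display text so the speech track is
        # never left empty on a channel that only reads the speak field.
        "speak": speak if speak is not None else text,
        "inputHint": "acceptingInput",
    }


# Screen-oriented text with a different, speech-friendly track.
screen_reply = build_message(
    text="Here are your appointments for today.",
    speak="You have two appointments today: a stand-up at 9 and a review at 2.",
)

# No explicit speech track: the text is reused.
fallback_reply = build_message("Hello!")
```

Falling back to the text is only one possible policy; a speech-focused sample might prefer to fail loudly when `speak` is missing so that unspoken replies are caught during development.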

  • Enumerations of suggested actions: in some cases we created logic to read suggested actions out loud in the form of "You can say X, Y or Z" or "You can say X, Y and Z". The default would be to read the list without "and" or "or", which sounds very weird.
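The enumeration logic described above might look something like the following sketch. `speak_suggested_actions` is a hypothetical name, and the phrasing simply mirrors the "You can say X, Y or Z" form quoted in the thread; it is not taken from the VA code.

```python
from typing import Sequence


def speak_suggested_actions(titles: Sequence[str], conjunction: str = "or") -> str:
    """Join suggested-action titles into one spoken sentence.

    Hypothetical helper: the conjunction ("or" / "and") goes before the last
    item so the list does not sound like a flat run of words when read aloud.
    """
    items = [t.strip() for t in titles if t.strip()]
    if not items:
        return ""
    if len(items) == 1:
        return f"You can say {items[0]}."
    head = ", ".join(items[:-1])
    return f"You can say {head} {conjunction} {items[-1]}."


print(speak_suggested_actions(["X", "Y", "Z"]))         # You can say X, Y or Z.
print(speak_suggested_actions(["X", "Y", "Z"], "and"))  # You can say X, Y and Z.
```

The resulting string would typically be appended to the Speak property of the activity that carries the suggested actions, leaving the Text property untouched for channels with a screen.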

For non-headless devices/UIs (headful? heady?) it is important to preserve any use of GUI as applicable. However, if possible I think that building for one channel (DLS, Web Chat with Speech, or Cortana) and then generalizing is the better approach. We've seen this approach with MS Teams which has a lot more

  • Not sure if this changed in Web Chat lately, but Adaptive Cards have a Speak property that is not used by Web Chat and may be confusing to some devs.

@compulim?

ryanisgrig commented 4 years ago

For reference we have a tutorial on enabling DLS with the VA at https://microsoft.github.io/botframework-solutions/clients-and-channels/tutorials/enable-speech/1-intro/

Most of the steps are about turning on the resources the bot needs to work, but there is also some guidance on how to change the voice with SSML.
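For readers unfamiliar with SSML, here is a small sketch of wrapping a reply in a `<voice>` element so that it could be used as the Speak value. The `to_ssml` helper is hypothetical, and the default voice name is an assumption chosen for illustration; the tutorial linked above is the place to check which voices and settings actually apply.

```python
from xml.sax.saxutils import escape, quoteattr


def to_ssml(text: str, voice: str = "en-US-JennyNeural", lang: str = "en-US") -> str:
    """Wrap plain text in a minimal SSML document that selects a voice.

    Hypothetical helper; the default voice name is an illustrative assumption.
    """
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(lang)}>"
        # escape() keeps characters such as & and < from breaking the markup.
        f"<voice name={quoteattr(voice)}>{escape(text)}</voice>"
        "</speak>"
    )


ssml = to_ssml("Tom & Jerry start at 9.")
```

Escaping the text matters in practice because bot replies often contain user-supplied or data-driven strings, and an unescaped `&` makes the SSML invalid.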

johnataylor commented 4 years ago

We agreed to postpone major new samples until after we target dotnet 3.1

cleemullins commented 4 years ago

@johnataylor What are we doing with this? Can Monica, Michael, Ashely, or Eric drive this one?