microsoft / botframework-sdk

Bot Framework provides the most comprehensive experience for building conversation applications.
MIT License
7.49k stars 2.44k forks source link

DCR: Simplify driver code for running a dialog + State improvements + Add MVC support for C# SDK #5235

Closed yochay closed 5 years ago

yochay commented 5 years ago

The following DCRs suggest several changes that are somewhat related to each other, which is the reason they are listed together. These are mostly based on existing samples Sample #30 and Sample #42 that were used to collect feedback. Content for this DCR also provided by @johnataylor and @darrenj, and is based on feedback from number of developers across MS.

background

Integration layer

The current integration layer in the BF SDK works well, but has certain limitations (for example designed to support a single bot per app) and is somewhat complex and relatively new (compared to MVC / WebAPI). Therefore, DCR#1 suggests simplifying the integration layer and use common programming paradigms that have been used (and still are) and are familiar to our developers ecosystem. For C# that means adding support for MVC/ Web API.

Dialog Stack

The current dialog system is the primary method (tools/system) the SDK offers developer for building ‘conversation’. It is complex, includes multiple terms, which sometime overlap, and is not easy to use. For example, one needs to call ContinueDialog before calling Begin– see following code.

var dialogContext = await dialogs.CreateContextAsync(turnContext);
var results = await dialogContext.ContinueDialogAsync();
                if (results.Status == DialogTurnStatus.Empty)
                {
                    await dialogContext.BeginDialogAsync("root");
                }

To reduce some of the current complexity, DCR#2 suggests a way to encapsulate everything needed to run a Dialog stack into one Run method. That DCR coupled with the MVC update further abstracts dialog execution complexity.

State

Accessors

V4 state implementation was, at least, partially designed to protect developers against middleware performing modifications that could side-affect other middleware. StateAccessors where introduced to provide a way to safely create namespaces and apply policies for given objects. To date there no customers scenarios that require such level of isolation or policy for access control. Furthermore, given the changes in focus around the middleware in the SDK and putting this more in the hands of the developer the complexity accessors introduces is not needed in the scenarios encountered to date.

The current state implementation, using StateAccessors is very hard for Developers to understand and adds significant extra layers/latency. Overwhelming developer feedback, across multiple teams in MS and outside, suggests current accessors are too complex. A simpler model would reduce the burden on new developers on our platform.

Scale out and race conditions

As per this issue there are race-conditions with our current state model. Bots written using our State implementation today that end up receiving multiple messages (in quick succession) run into problems around State too.

Furthermore, the SDK implementation allows multiple replies from the bot to a single utterance from the customer. In case where the bot functionality requires state-changes accompanied with multiple request/ replies; failure to handle such state changes properly, can (and will) result in an inconsistent bot behavior.
image

To this end, when a give bot scales out (runs on number of compute instances) such issues are compounded. With high-volume scale-out scenarios especially those involving devices with differing connectivity the race-condition type of scenario will increase in regularity causing “conflicts” with state versioning. As a Bot developer one should be able to handle such potential conflicts gracefully – Developers need to understand the state of their bot and perform/validate any ‘merge’ or state store conflict, which might be required if the state has been changed through another request.

The suggest - DCR#3, suggests a clear signature (and seperation) between executing logic (changing state) and managing state (read/write to state store) that can be seamlessly integrated with the dialog stack as shown in C# sample @42. There are several customers already using this strategy including (CDL, Analog, CCI, TestBot, etc.) This chance will also enable developers to buffer responses, which is a fundamental requirement in some cases, like with speech integration. The suggested DCR will enable (via config vs. code) to buffer response and control outbound Activity features (like InputHint ) appropriately as some channels require.

DCRs

DCR#1 – Create Bot related MVC/ Web API Controllers

Based on C# - samples 30

Note: Suggested names, are not final and open for discussion

Note: open question whether we need to support both MVC and Web API stack. Therefore, there might be another abstraction layer for some of the core functionality.

Class hierarchy and functionality

An abstract class BotController a subclass of MVC Controller providing basic inbound Activity processing

An abstract class BotActivityController a subclass of BotControllerBase providing specific (all?) relevant inbound Activity types (aka Activity Framework message types / events)

A concrete class BotDialogController that subclass BotActivityController. Can pass to it any dialog implementation (\<Dialog>) and store (either IStore or \<Store>). The BotDialogController combines storage (DCR#3) with the Dialog Run method (CDR#2) to further establish a clear separation between executing logic (changing state) and managing state (read/write to state store) This class is combination of C# samples#42 (state) with C# samples#30 (MVC):

DCR#1:

DCR#2 – create a Run method that takes care of all driver code related to running a dialog.

Based on sample #42 The goal of this DCR is to reduce complexity and concept count that is currently required from developers to fully understand the dialog stack and how to use dialog in the SDK. This DCR suggests a Run function that encapsulates everything you need to run a Dialog stack. For example. Combining this with the suggested MVC DCR will enable a BotDialogController that handles BF protocol activates, separate business logic from state and seamlessly enables dialog integration.

All that a developer needs to do is create his dialog implementation, example waterfall- and pass it to the BotDialogController , the rest is taken care by the changes in DCR#1 and DCR#2.

DCR#3 – State

This DCR suggest separating between executing logic (manipulating state) to reading and saving state from store. Such separation is required when a bot needs to scale-out as well as handling when the bot needs to handle multiple and rapid activities/ events.

The bot functionality will look like this:

Task<(JObject newState, IEnumerable<Activity> outbound, object result)> ProcessAsync(JObject oldState, Activity inbound);

Note the typing – objects that leave memory are network typed, either JObject or cosmetically Activity – otherwise it’s object – because the result is simply a handback to the bot in-memory parent.

An example of this, as part of sample #42.

This change composes with the current Dialog stack, such that:

  1. A Dialog stack can be encapsulated with this function.
  2. And, by implication, a Dialog can call a nested Dialog that looks like this function.

A developer is free to use the DialogSet/PropertyAccessor/DialogContext/Dialog/Prompt/Waterfall abstraction to implement this function. But they don’t have to. For example, the CDL project uses a simply implementing of this function directly. QnA team is going to create a “QnADialog” that follows similar pattern.

There is an abstract class that glues this function seamlessly into the DialogSet/* world (being an OO model it’s an abstract class and you override a function that has this signature.)

This will be combined with DCR#1, into the BotDilaogController.

DCRs Blast radius

All these DCR are additive to the SDK and do not introduce any breaking changes to existing functionality.

Generators

Docs and samples

Tracking Status

Dotnet SDK

Javascript SDK

Samples

Docs

Tools

[dcr] [enhancement]

cleemullins commented 5 years ago

Some additional key points here Re: MVC

There are several upsides to the MVC pattern that don't seem to be described in the DCR.

  1. No need for developers to touch the startup.cs (and related) classes. The stock out-of-box implementations will generally be sufficient for samples, templates, and most bots.

  2. Multiple Endpoints. Adding multiple endpoints ("/bot1/', "/bot2") is easy in the MVC world, as each controller does this very naturally. In the system we have today this is difficult as the ".AddBot<>()" is generally viewed as a singleton.

  3. Easier Access to the HTTP primitives. By working in the Controller space directly, the HTTP primitives (headers, tokens, etc) are much more easily examined and manipulated. In the IBot code we have today, bot developers are largley (by design) isolated from the HTTP stack.

  4. Multi-tenant bots. With access to endpoints and HTTP primitivies, authoring of multi-tenant bots becomes much easier. This means complex bots that work with Slack, Teams, and other multi-tenant services are become much more practical.

  5. Easier to tool. The MVC pattern of "Right-Click, Add Controller" should be fairly straightforward to use for bots via a Visual Studio Extension.

  6. Debugging. This is a big one, but having the "obvious" place to put a breakpoint for both inbound and outbound data makes things much easier to debug.

I'm a big fan of the custom controller. This is (mostly) what the V3 SDK does, and what we originally had in the very early versions of the V4 C# SDK.

I don't think we need the WebAPI equivalent.

cleemullins commented 5 years ago

Some additional points, re: State

The pattern of "State in / Process / State Out" is one of the most accepted patterns in computer science. This is a pattern that scales - both horizontally and vertically. Importantly, this allows state to be processed in such a way to avoid race conditions, corner cases, and scenarios where a user is told "Done" but the actual state save failed.

As I understand, this DCR gives us the ability to:

  1. Rationally retry Saves if they fail the first time.
  2. Not send "it worked" to the user unless state actually saved.
  3. Encapsulate all of the Dialog system in a simple Input/Output function, which means we can start down the state machine path and enable countless other scenarios.

Additionally this is a necessary component to bots that horizontally scale.

yochay commented 5 years ago

addressing feedback from @cleemullins

  1. True to some degree as developers will still need to define (control) storage, middleware, generic error handling, etc. However, the out of the box getting started is much (much) simpler and easy to extend as needed.
  2. both points 2 and 4 are similar and are in the DCR 1. With that said, I will make it clearer

Agree with everything else.

As for Web API vs. MVC, we should consider this one:

drub0y commented 5 years ago

This seems awfully overloaded for a single issue. Wouldn't it be easier if each were their own issue where the details can be hashed out and discussed in isolation from one another? For example, if I start arguing against MVC here we could have a back and forth of like who knows how many comments and each of them would be sufficiently long in their own right. Imagine the interleaving of comments on each of the proposed changes???

drub0y commented 5 years ago

MVC Controllers for Bots Feedback

I see a move towards adopting an MVC controller base class approach as a way to build bots as having the following issues:

Bots != HTTP

The implementation of a bot should ever be coupled to HTTP. It runs counter to the adapter model. One of the SDK's primary goals should be shield the bot developer from protocols, otherwise the abstraction will become leaky.

Adopting MVC controllers as the model for bot development tie those bots to the HTTP protocol.

For example, if the MVC controller approach is adopted and the Bot Framework Service releases support for a WebSockets based protocol, what would be the migration path for bots based on the MVC controller and coupled to HTTP?

Another example, "in-process" channels are being considered. If one of the in-process channel adapters uses a different protocol or even just a different serialization format for its data, what would the migration path be for bots based on MVC controller and coupled to HTTP.

Counter arguments

The current integration layer in the BF SDK works well, but has certain limitations (for example designed to support a single bot per app) and is somewhat complex and relatively new (compared to MVC / WebAPI). - @yochay/DCR

Can you provide details/specifics re: how the current implementation is "somewhat complex" so we can discuss?

The goal of the current implementation is to be idiomatic to what the [ASP].NET Core community expects by using patterns that community already knows and understands. More details below.

  1. No need for developers to touch the startup.cs (and related) classes. The stock out-of-box implementations will generally be sufficient for samples, templates, and most bots. - @cleemullins

[ASP].NET Core developers are used to having to "touch" the Startup.cs; some even choose to work in Program.cs with the HostBuilder APIs directly. Can you provide more details/specifics on your concerns?

Next, if we are supplying samples/templates, what's the difference between the Startup.cs file containing AddBot<TBot> and UseBotFramework vs AddMvc and UseMvc? The developer doesn't even have to think about it until they're ready. They simply run what comes with the template and then work their way back through it when they're ready, just like they don with [ASP].NET Core today.

It's idiomatic to the platform

The APIs in the Bot Builder .NET Core integration package are designed to be idiomatic with the other APIs a .NET Core developer works with. Specifically the AddXXX/UseXXX patterns are 200 level patterns that .NET Core developers learn as soon as they move beyond "hello world". Even MVC itself requires an understanding of having to call AddMvc and UseMvc for your controllers to light up.

Consider several examples of the APIs that with the default ASP.NET Core App meta-package:

Functionality API Set
Serving Static Files UseDefaultFiles
UseStaticFiles
Exception Handling UseDeveloperExceptionPage
UseExceptionHandler
CORS Add/UseCors
Authentication Add/UseAuthentication
Specializations of UseAuthentication...
UseMicrosoftAuthentication
UseGoogleAuthentication
* ...and more...
Identity Add/UseIdentity
Entity Framework AddEntityFrameworkSqlServer
AddEntityFrameworkSqlLite
AddDbContext
AddDbContextPool
Application Insights AddApplicationInsightsTelemetry
UseApplicationInsightsExceptionTelemetry
UseApplicationInsightsRequestTelemetry

These are just a few of the APIs that are available when you reference the ASP.NET app meta-package.

SDKs that target the .NET Core platform should be following these same patterns so that they "feel" natural and are easily discoverable to those who understand that platform’s patterns.

It's where the ecosystem is going

For example, other SDKs are adopting idiomatic .NET Core hosting to reduce friction in adoption, such as the Azure WebJobs/Functions SDK and the Orleans SDK for .NET. They have made the move from a proprietary hosting architecture to simply utilizing .NET Core's Generic Host.

Additional overhead of MVC

Leveraging MVC increases the runtime surface area drastically. This has performance and security implications.

  1. Multiple Endpoints. Adding multiple endpoints ("/bot1/', "/bot2") is easy in the MVC world, as each controller does this very naturally. In the system we have today this is difficult as the ".AddBot<>()" is generally viewed as a singleton.

First, can you provide more details on how the AddBot<TBot> approach blocks adding support for multiple bots? The original implementation of the integration supported multiple bots, but it was removed because it was seen as unnecessary.

It would take a medium level of effort to support multiple bots while supporting backwards complexity.

Regarding controlling the endpoints, I agree MVC will automatically discover all controllers and expose them using a routing convention vs having to actually introduce each bot explicitly with AddBot<TBot>. This is a fair observation, but, again, if this is desirable functionality it's totally possible to implement. We would need to define the default conventions for, for example, when multiple bots are registered without an explicit path configuration, but that's just a discussion to have around what the default behavior should be. For example, in the original implementation that supported multiple bots, the bots were exposed at /bots/<BotClassName> endpoints by default which mimics how MVC controllers are exposed that don't explicitly define a route (e.g. /api/<ControllerName>).

  1. Easier Access to the HTTP primitives. By working in the Controller space directly, the HTTP primitives (headers, tokens, etc) are much more easily examined and manipulated. In the IBot code we have today, bot developers are largley (by design) isolated from the HTTP stack. - @cleemullins

Can you provide more specifics/details re: why a bot developer would need access to HTTP primitives? Again, hosting a bot on a channel that uses WebSockets or any other transport would require a rewrite.

If a bot developers needs this, they could take IHttpContextAccessor as a dependency to their IBot constructor and have full access today without the need for a controller based approach. This is a familiar pattern for anyone who works with the ASP.NET Core platform.

This would be a decision on the developer's part as it would tie their implementation to that particular transport/integration/hosting model which has many implications, +but nothing about the existing implementation blocks this_.

  1. Multi-tenant bots. With access to endpoints and HTTP primitivies, authoring of multi-tenant bots becomes much easier. This means complex bots that work with Slack, Teams, and other multi-tenant services are become much more practical. - @cleemullins

Can you provide more details/specifics here? Are you making the case for multiple "in-process" channels/adapters over the same bot implementation instead of proxying those all through the Bot Framework Service?

If that's the case, nothing about the existing integration blocks this. While the current implementation is focused on only exposing things via the Bot Framework Service, there's nothing preventing expanding on support to other adapter implementations.

  1. Easier to tool. The MVC pattern of "Right-Click, Add Controller" should be fairly straightforward to use for bots via a Visual Studio Extension. - @cleemullins

How does this differ from supporting "Right Click, Add Class" that implements the IBot interface? Could the same benefit be achieved with tooling via the VSIX to add a similar menu item for "Right-Click, Add Bot". If people are starting from a template they'll have their first bot scaffolded out for them any way.

We should also keep in mind that not everyone is going to be using Visual Studio; VS Code seems to be more and more prevalent. VS integration needs to be weighed against that.

  1. Debugging. This is a big one, but having the "obvious" place to put a breakpoint for both inbound and outbound data makes things much easier to debug. - @cleemullins

Can you provide more details/specifics? A developer can put a breakpoint in the OnTurnAsync implementation of their IBot class and have the exact same capability as putting a breakpoint in the controller approach where you would just be overriding OnTurnAsync.

Testability Argument

From the MVC controller base class approach sample mentioned above, how will those bots be unit tested? Unit testing needs to be simple without any proprietary testing infrastructure; developers have a set of favorite testing libraries they are already familiar with and shouldn't have to learn something new.

Testing an IBot based Bot

You mock ITurnContext and you set up the Activity property with a valid Activity instance, instance your IBot class, call OnTurnAsync with the mock and you're done. You can validate calls to SendActivityAsync et. al. via your mock, etc.

Testing a BotControllerBase based Bot

In the sample referred to the BotControllerBase::OnTurnAsync is protected, meaning you can't just invoke that from your unit tests. Instead you would have to call BotControllerBase::PostAsync. That method does not use model binding to deserialize the incoming Activity, instead it does all the reading off the JSON directly from the stream via the Request property.

Similarly PostAsync does not utilize the IActionResult<T> pattern so it's hard to validate the result of the call to that method. For example, the result of an Invoke activity would be difficult to test because the PostAsync serializes it to the Response::Body stream.

These things mean that to even test your OnTurnAsync you would have to fake the Request property such that it contained a UTF encoded stream of JSON data. You would also have to fake the Response property with at the very least a NullStream for the Body.

This seems quite untenable. The .NET community has long since migrated away from this approach even for "actual" controllers starting in the WebAPI 2.x timeframe. Controller actions moved more towards being "plain old methods" with the runtime wrapping them which made them completely testable without having to mock the entire runtime because you can simply invoke them like any other method. To accomplish this we rely on things like model binding and returning IActionResult<T> so that the controller actions are abstracted as much as possible from the runtime.

Finally, how would you validate what activities are sent via the ITurnContext/BotAdapter? The creation of the BotAdapter has been rolled into being a responsibility of the BotControllerBase via the CreateAdapter method (which is also marked protected), so there's no perceivale way to mock/fake that from a unit test. It seems like it would require subclassing the actual bot controller class (e.g. MyBotController) to override the CreateAdapter call for the purposes of testing which, aside from being an abnormal practice, seems to indicate a broader issue with the design regarding SRP.

Stevenic commented 5 years ago

I've added a PR which adds the simplified driver code for the JavaScript side of things.

sgellock commented 5 years ago

@yochay moving to 4.4 as I believe the 4.3 work is being tracked by other issues

sgellock commented 5 years ago

@yochay a number of these are done. We can consider the "memory" DCR but that would be a 4.5 consideration. I'm going to close this issue. If you want to call out the memory DCR specifically, can you open a new issue?