Closed mdrichardson closed 6 months ago
Thanks for writing this up @mdrichardson .
Regarding when we change. I think this is case-by-case. The principle is: favor cleanly structured, clear regular mocha tests over clever-clever record and playback proprietary frameworks. A unit test suite should read as a (flat) set of asserts about the expected behavior.
And it would be great to have more overnight functional tests against the real systems. This is basically the approach we started with the Teams integration. I like the idea of additional triggers for functional test runs - for example if there is a change in the Azure package then run the functional tests for that PR - if there isn't don't - knowing we'll find anything in the overnight if there is an indirect problem.
I favor separate functional tests over the modal idea.
Its very important to embrace the underlying test framework and try to build as little in the way of additional mechanism as possible. For example, I opened an issue on the adaptive expression tests. These were not actually record and playback, but they still introduced an additional mechanism over the basic mocha structure. That is not good, we should always endeavor to use the underlying test framework in as straightforward a manner as possible.
@johnataylor
Regarding when we change. I think this is case-by-case.
Can you expand on what you mean by this? My "plan" was to figure out which libraries/features we want this to apply to, then do it in Cosmos, first. Once that one is implemented perfectly, implement it for everything else the same way.
I favor separate functional tests over the modal idea.
Makes sense. Let's say that we have some tests (like Cosmos) that currently has some functional tests already written, but that are either mocked or skipped in CI. Does it make more sense to:
/FunctionTests
where they're harder to find, but we avoid maintaining duplicate code?I definitely favor #2 for tests that are skipped in CI, anyway. I'm less sure about moving mocked tests from other libraries; I think maybe this is where the case-by-case basis comes in.
I'm happy to make this all happen, whichever way we go. Ideally, we decide the case-by-case items ahead of time so that all of the adjustments go in one at a time, but all within one cycle (vs. having to re-convene between every PR).
@johnataylor @cleemullins @mrivera-ms @stevengum
Do we have a direction we want to go with this? It's related to this issue which is tagged for R11, but I've been holding off on tackling it since any new tests will likely use some implementation of nock
.
Let's loop @joshgummersall in on this discussion. At the moment it seems like a spike/implementation will be done in botbuilder-js first, so he might have some valuable insight/pointers on JS/TS-specific implementations.
A valuable first pass might be setting something up for fetching tokens from AAD (for AppCredentials). There's only one HTTP request per AppCredentials subclass and it's a building block for the Conversations, Attachments and Token Service functional tests.
I'll defer to @johnataylor's and @joshgummersall's input, but option 2 or some variant of it sound reasonable to me.
As an aside, we'll have to update the code coverage reports/pipelines to include functional tests.
I think I'm definitely in favor of moving to mocks over nocks where we can as long as we have some backup functional testing.
I think my preference, in order, is:
sandbox.verify()
)This is all with the implicit understanding that we should have functional, end-to-end tests that exercise the real API endpoints to protect against drift.
@joshgummersall That all makes sense to me. It seems like the best order tackle this might be something like:
I think that all sounds reasonable. I'll have a small PR with some common testing/Sinon/sandboxing functionality in the next day or two so perhaps hold off on rewriting anything with Sinon for now.
Test Adjustments
This proposal covers two related issues:
Table of Contents
Current Situation
Record and Playback
Many of the functional tests in the JavaScript SDK rely on Record and Playback style of testing instead of the more traditional mocks. This is problematic because:
Functional Tests
Additionally, where we have mocks or Record and Playback tests (in all SDKs), there is potential the tests to become stale. I've fixed a couple of small issues where this has occurred with both Cosmos and Connector.
Proposed Solution
Converting the JavaScript tests from Record and Playback to mocks is "easy" enough and doesn't need much design guidance--just confirmation that it should be done.
Relevant Libraries and Features
However, we have the following libraries/features that have tests with mocks that could be added to nightly functional tests:
There are two ways that we can approach this:
AppId
,AccessKey
, etc), or[Option 1]: Programmatic
For the items that we want nightly functional tests for, we need to design the tests such that running them live can be easily enabled. This could be addressed with the following design (this discussion will be based around .NET, but will apply in JS/Python to the extent possible):
WillRunLive
parameter, which is lazily set by determining whether or not certain environment variables are present.AppId
oraccess key
.WillRunLive
is determined, there will be logger output that specifies which way the test is running (Live vs. Mocked vs. Skipped)WillRunLive
:Assert.Ignore()
Pros:
Cons:
[Option 2]: Separate Files
We could fairly easily break mocked tests out into separate files for nightly functional tests with mostly a copy/paste and then removal of mocks.
Pros:
Cons:
Triggering Functional Tests
Nightly
Yes.
PRs
We could optionally add automation to trigger live functional tests on the relevant feature/library when certain
GitHub
tags get added. We could have these be added, either when:Decisions to Make