mitodl / mit-open

BSD 3-Clause "New" or "Revised" License

POCs for end-to-end testing frameworks #401

Closed jonkafton closed 7 months ago

jonkafton commented 8 months ago

Description/Context

We're looking at introducing an E2E testing framework, tooling for local test development and CI to run the tests.

The goal is to establish which framework best suits our purposes by running POCs on popular options and assessing viability.

Rationale

The React app currently contains unit tests. These are great for quickly offering some confidence during CI, though do not describe the product feature set and often suffer from being overly simple or testing too small a unit, i.e. they are unlikely to catch unintentional changes to behavior. A passing unit test suite does not guarantee that the application works to the degree a comprehensive E2E/acceptance test suite might.

As unit tests live alongside their components, they limit non-functional code change (refactoring), since changing the component requires work on the unit test. We are more interested that the application functions and less so in how it functions (black box testing). We should not export otherwise private functions for the unit tests, as it is then not clear which are public to the app: the testable unit entrypoints are the exports.

The other “end” of the E2E test depends on the ease of deploying server-side components in isolation in CI. The tests may depend on running components in a pre-live environment where this is not practical (e.g. cloud-native services). Tests will also often depend on specific data being present to produce the expected UI and should clean out any data added during the test run (unless covered by some retention policy). An option is for the tests to spin up an ephemeral database bootstrapped with any necessary data to both produce the initial state and avoid the extra overhead of cleaning up test data. We should also consider which tests run and when. For example, we may want to run a subset of read-only tests to sanity check new deployments to production environments.
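As a rough illustration of the "which tests run and when" point, here is a minimal, framework-independent sketch of selecting a read-only subset for production. Everything in it is hypothetical: the `E2ETest` shape, the `mutatesData` flag, the environment names and the test titles are made up for the example and do not exist in the repo. Real runners have their own filtering mechanisms (Playwright, for example, can filter tests by title with `--grep`).

```typescript
// Hypothetical model of an E2E suite; all names and titles are illustrative only.
type Environment = "local" | "ci" | "rc" | "production";

interface E2ETest {
  title: string;
  // True when the test creates, updates, or deletes data on the backend.
  mutatesData: boolean;
}

const suite: E2ETest[] = [
  { title: "visitor can view the homepage", mutatesData: false },
  { title: "signed-in user can save an item", mutatesData: true },
  { title: "visitor can run a search", mutatesData: false },
];

// Production only gets read-only tests; every other environment runs the full suite.
function selectTests(tests: E2ETest[], env: Environment): E2ETest[] {
  if (env === "production") {
    return tests.filter((t) => !t.mutatesData);
  }
  return tests;
}

const prodTitles = selectTests(suite, "production").map((t) => t.title);
console.log(prodTitles);
```

The same flag could equally be expressed as a tag in the test title or as a separate directory of specs; the point is only that the read-only subset is declared next to the tests rather than maintained as a separate list.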

The terms “E2E” and “acceptance” testing are often loosely defined. We can use E2E in that we are testing the browser app within the context of its supporting backend and integration points, and acceptance in that we are functionally testing the application’s requirements-driven feature set. We are writing functional tests for acceptance that run end to end. In placing backend services under test we encourage cross-functional, feature-based development.

Plan/Design

Popular frameworks include Cypress, Playwright and Testing Library.

Requirements:

Out of scope for POC but to keep in mind:

pdpinch commented 8 months ago

What will we have when this ticket is closed? A proposal? a decision? An implementation?

jonkafton commented 8 months ago

Aiming for us to have selected a test framework and in doing so have a working test suite with some initial assertions to build on. Output will include an evaluation summary to put back to the team. A subsequent ticket will cover the full implementation (baseline specs and CI).

ChristopherChudzicki commented 8 months ago

My 2c is: Either Cypress or Playwright would be fine. I like Playwright's APIs a lot better, but it gives you less control over the browser.

It seems to me that the much harder question is how to handle data so that e2e tests can be run locally and also against rc/prod. IMO, that's worth thinking about in this POC issue. One idea:

Playwright vs Cypress

Regarding Testing Library: This isn't really an e2e testing option. Testing Library is a collection of testing-tool packages (assertions, DOM queries) that provide similar, high-level interfaces to test frontend code. We currently use this in our frontend + mocked-backend unit/integration tests. See below.

Retries and Selectors: Both Cypress and Playwright can auto retry assertions and wait for selections to appear. Both support querying DOM via user-facing attributes (role, state, text). Playwright has these Testing Library-esque queries builtin, and for Cypress, similar functionality comes from @testing-library/cypress
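To make "auto retry assertions" concrete, here is a toy sketch of the retry-until-pass idea. It is not how Cypress or Playwright implement it; `expectEventually`, its default timings and the simulated `page` object are all assumptions made up for this example.

```typescript
// Toy illustration of retrying an assertion until it passes or a timeout expires.
// Resolves with the number of attempts made; rejects with the last failure on timeout.
async function expectEventually(
  assertion: () => void,
  timeoutMs = 1000,
  intervalMs = 20,
): Promise<number> {
  const deadline = Date.now() + timeoutMs;
  let attempts = 0;
  for (;;) {
    attempts += 1;
    try {
      assertion();
      return attempts; // assertion passed
    } catch (error) {
      if (Date.now() >= deadline) throw error; // give up and surface the last failure
      await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
    }
  }
}

// Simulated page whose content "renders" after a short delay.
const page = { text: "Loading..." };
setTimeout(() => {
  page.text = "MIT Open";
}, 60);

const retryResult = expectEventually(() => {
  if (page.text !== "MIT Open") {
    throw new Error(`unexpected text: ${page.text}`);
  }
});

retryResult.then((attempts) => console.log(`passed after ${attempts} attempts`));
```

The test author writes a single assertion and the waiting is handled by the helper, which is what removes most of the explicit sleeps and race conditions that older E2E suites needed.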

Debugging: Both have --debug modes that show the browser. Playwright can, I believe, generate tests from the GUI, though I don't know how good they are. Cypress might have something similar. Being an MS project, Playwright also has very good integration with VS Code.

Cypress

I personally find the developer experience poor (at least... annoying) because Cypress tests are very asynchronous but you can't use async / await. Most cypress objects have .then methods, but they aren't real promises, so you can't await them.

Parallelization: It used to be true that in order to run tests in parallel, you had to use Cypress's cloud CI service. I'm having trouble telling if this is still true.

Playwright

The developer experience seems very good to me.

Browser Support: I believe you get much less control over which browser version you use with Playwright than with Cypress. I don't think you can specify specific versions of browsers, at least not easily[^1]. Playwright ships bundled with its own browsers. In general, when you update Playwright, you change which browsers you test with (to newer versions). References:

This seems like an odd design to me, and I don't really understand the choice.

[^1]: I believe Playwright uses standard Chromium / Chrome, but patches Firefox and Webkit. So I think you can make it work with any specific version of Chromium above some number, but only with bundled Firefox/Webkit.

Existing Tests, Goals

We already use Testing Library[^2] in our JSDom-powered "unit" tests. Because Testing library's APIs delegate to React for render and are designed to interact with components similar to how a user would, I have pretty high confidence in our existing tests to check behavior and regressions.

[^2]: Specifically, we use: @testing-library/react, a thin wrapper around React's own test renderer. (Which I believe is roughly the same as React's real renderer, but throws errors more aggressively.) And two framework-agnostic libraries, @testing-library/dom for querying the DOM and @testing-library/user-event for emulating user interactions in JSDom.

This:

The React app currently contains unit tests. These are great for quickly offering some confidence during CI, though do not describe the product feature set...

is a reasonable characterization of Enzyme tests, but IMO is much less true for unit/integration tests written with Testing Library. For example, this test, if written in Playwright, would look almost identical. (Even down to method names, since Playwright adopted some Testing Library methodology.)

Realistic Data: Additionally, although we mock our backend APIs in the frontend tests, we should be confident that the data is realistic since it is constructed to match our OpenAPI schema.

That said, there are certainly limitations to our current method:

  1. Because our tests are run with JSDOM (a node-based browser emulator) rather than a real browser, we can't fully test:
    • anything visual
    • anything layout-based, like infinite scroll or dragging
    • components like CKEditor, that use the contenteditable APIs which aren't supported in JSDom.
  2. We can't test on RC/Prod with real data, nor can we fully test things like the auth flow (create account, verify, login, etc).

The other virtue of e2e tests I see is that, being independent of the application, they should be very resistant to "artificial failures" from refactoring.

jonkafton commented 8 months ago

Existing Tests, Goals

We already use Testing Library[^2] in our JSDom-powered "unit" tests. Because Testing library's APIs delegate to React for render and are designed to interact with components similar to how a user would, I have pretty high confidence in our existing tests to check behavior and regressions.

Let's take a step back, then, and think about the cost-benefit of the proposed pure E2E acceptance testing against our existing unit tests, given that these go quite some way towards emulating a user's viewpoint and masking implementation.

The ideal for testing a user facing application is that user centric feature requirements are asserted by the tests and that the test report describes the product spec, tying the project requirements gathering through to delivery.

The cost to automate a human tester is typically higher than testing against source code, so it's a reasonable concession to run unit tests where they give a good degree of confidence for a much smaller development cost and where they run quickly and give early and accurate feedback. Historically (Selenium, WebDriverIO), E2E tests have been prone to false negatives (brittle element selection, race conditions around load/render/assert), time-consuming to write and debug, and slow to run due to limitations in communicating with the browser (browser drivers, JSON Wire Protocol). These issues have largely been addressed with the newer generation of test runners. Playwright uses the Chrome DevTools Protocol for Chromium browsers (Chrome, MS Edge). Cypress instruments browsers at the application layer to run tests in the same process as the code being tested.

Our unit tests are hybrid in that they are framed in terms of the user: for a large part they emulate the browser environment and user interactions, and so tick boxes towards acceptance testing. @testing-library/react intentionally does not expose React component instances, props or state, so it meets many of the criteria for not being bound to implementation detail. They do however lack the key benefits of E2E tests that @ChristopherChudzicki mentions above:

The question then is, if we are to realize these benefits, do E2E tests supersede the current approach of unit testing, or when should we write one or the other? I would say yes (assuming the unit test suite is repurposed towards testing any pure functions and heavier logic), with the conditions that:

mbertrand commented 8 months ago

One thing I'm curious about is how well the various E2E frameworks integrate with Github CI/CD. Seems like both Cypress and Playwright have plugins for doing so.

jonkafton commented 8 months ago

One thing I'm curious about is how well the various E2E frameworks integrate with Github CI/CD. Seems like both Cypress and Playwright have plugins for doing so.

Yes, both provide Docker images with browsers and system dependencies pre-installed (Cypress, Playwright), plus Cypress has a custom action referenced in your link above - I don't foresee any issues pointing the tests to hosted environments to validate deployments. I'll write up an issue to cover pre-deployment testing where we'll want to run the application locally to CI - additional challenges there such as bootstrapping test data and orchestrating the containers (e.g. to close with code coverage output).

jonkafton commented 7 months ago

We can wrap up our framework selection, firstly as we have a team preference for Playwright and it is in use in other OL projects. Additionally, it has quickly emerged as part of a newer generation of E2E testing solutions, primarily having overcome a key pain point of Cypress: that all normal JavaScript commands, assignment and control flow logic must be mediated by Cypress to be visible to it. By mapping commands to an internal queue, Cypress cleverly bridges JavaScript control flow to element selection with seamless wait and retry. There’s convenience in an asynchronous command sequence being internally produced from simple object chaining syntax: the developer is relieved of any promise or callback handling, though the penalty is that the execution sequence is not idiomatic to the language and as a result can be unintuitive and unpredictable. To provide these capabilities, Cypress ships an architecture that runs tests directly in the browser. This approach was certainly a welcome improvement on the earlier Selenium WebDriver based solutions, though it is something of a deal breaker relative to Playwright’s approach of natural JavaScript (it also supports Python, C#, Java) and native automation through DevTools protocols.
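The queueing behaviour described above can be sketched with a toy model. This is not Cypress's implementation or API; `ToyRunner`, `ToyChain` and `fakePage` are invented names that only illustrate why queued commands and plain JavaScript statements run at different times, and why a `.then` method on a chain object is not the same thing as a promise.

```typescript
// Toy model of a deferred command queue, for illustration only.
type Command = () => void;

class ToyChain<T> {
  constructor(private queue: Command[], private read: () => T) {}

  // Looks like a promise's `then`, but it only schedules the callback on the queue.
  then(callback: (value: T) => void): void {
    this.queue.push(() => callback(this.read()));
  }
}

class ToyRunner {
  private queue: Command[] = [];

  // Calling `get` executes nothing yet; it returns a chain bound to the shared queue.
  get<T>(read: () => T): ToyChain<T> {
    return new ToyChain(this.queue, read);
  }

  // The runner drains the queue only after the test body has returned.
  run(): void {
    while (this.queue.length > 0) {
      const command = this.queue.shift()!;
      command();
    }
  }
}

const runner = new ToyRunner();
const fakePage = { title: "MIT Open" };
const events: string[] = [];

// "Test body": the callback below is queued, not executed.
runner.get(() => fakePage.title).then((value) => {
  events.push(`command saw "${value}"`);
});
events.push("test body finished");

runner.run();
console.log(events);
```

In this model the plain `events.push` in the test body runs before the queued callback, which is the kind of ordering surprise that makes ordinary assignments and control flow awkward when commands are deferred.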

Testing Library is not a candidate as it’s not a full fledged testing framework and does not include a test runner. Instead it provides integrations for various test runners and client frameworks, providing methods for querying elements and making assertions. We are using it for unit testing React components against an emulated DOM. It provides a Cypress plugin that extends Cypress commands with selection methods that follow its guiding principles of isolating tests from implementation detail. Its author writes a good article aligned with our key aim of E2E testing that the system functions as opposed to how it functions. This involves testing according to how users and assistive technologies perceive the page rather than relying on selector paths or code hooks for tests to find elements.
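As a small illustration of why querying by role and accessible name is more robust than selector paths, the sketch below searches a toy accessibility-style tree. The `UiNode` type and this `findByRole` function are hypothetical stand-ins written for the example (their signatures differ from the real Testing Library and Playwright queries), and the class names are made up.

```typescript
// Toy accessibility-style tree; a real framework derives this from the rendered DOM.
interface UiNode {
  role: string; // e.g. "link", "button", "navigation"
  name: string; // accessible name, roughly what a user reads or hears
  className?: string; // implementation detail that may change in a refactor
  children?: UiNode[];
}

// Depth-first search for the first node with the given role and accessible name.
function findByRole(root: UiNode, role: string, name: string): UiNode | undefined {
  if (root.role === role && root.name === name) return root;
  for (const child of root.children ?? []) {
    const match = findByRole(child, role, name);
    if (match) return match;
  }
  return undefined;
}

const beforeRefactor: UiNode = {
  role: "banner",
  name: "",
  children: [
    {
      role: "navigation",
      name: "Main",
      className: "nav-v1",
      children: [{ role: "link", name: "MIT Open", className: "logo-link" }],
    },
  ],
};

// After a refactor: class names and nesting changed, the user-facing role and name did not.
const afterRefactor: UiNode = {
  role: "banner",
  name: "",
  children: [
    { role: "link", name: "MIT Open", className: "header__home" },
    { role: "navigation", name: "Main", className: "nav-v2" },
  ],
};

const foundBefore = findByRole(beforeRefactor, "link", "MIT Open");
const foundAfter = findByRole(afterRefactor, "link", "MIT Open");
console.log(foundBefore?.className, foundAfter?.className);
```

A selector tied to the class name or the nesting under the navigation element would break across this refactor, while the role-and-name query finds the link in both trees.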

Some observations:

cy.findByRole("link", { name: "MIT Open" })

Smaller:

jonkafton commented 7 months ago

Branches with setup and basic initial homepage test:
Cypress: https://github.com/mitodl/mit-open/compare/jk/401-evaluate-e2e-cypress
Playwright: https://github.com/mitodl/mit-open/compare/jk/401-evaluate-e2e-playwright

jonkafton commented 7 months ago

It seems to me that the much harder question is how to handle data so that e2e tests can be run locally and also against rc/prod. IMO, that's worth thinking about in this POC issue.

I've written up an issue here @ChristopherChudzicki that covers this, https://github.com/mitodl/mit-open/issues/418.