GoogleCloudPlatform / emblem

Emblem Giving is a sample application that demonstrates a serverless architecture with continuous delivery, and trouble recovery. :diamond_shape_with_a_dot_inside:
Apache License 2.0
239 stars 61 forks source link

Decide on website testing standards #175

Open ace-n opened 2 years ago

ace-n commented 2 years ago

Testing options

Unit testing

Unit testing would involve testing very small pieces of logic (< individual page view handlers) individually. All external dependencies (e.g. Flask components) would be mocked.

✅ Easy to implement ✅ Low false-fail risk ❌ High false-pass risk ❌ Low per-test code coverage (i.e. large numbers of tests required for comprehensive code coverage) ❌ Lots of tests to write and maintain

Integration testing

Integration testing would involve testing medium-sized pieces of logic (~ individual page view handlers) individually. Some external dependencies would be mocked, but only those that are not part of the website themselves (such as the Content API).

✅ Fewer tests to write and maintain (since each test covers a larger section of code) 💭 Medium-difficulty to implement 💭 Medium false-pass risk 💭 Medium false-fail risk ❌ Less granular upon failure

End-to-end testing

End-to-end testing would involve testing large pieces of logic (> individual page view handlers) including the Content API. Wherever possible, external dependencies would not be mocked.

✅ Low false-pass-risk ❌ Hardest to implement ❌ Not very granular upon failure ❌ High false-fail risk ❌ Can be difficult to maintain

Recommendations

Ace:

Overall, I think we should focus on having good integration test coverage. We may want to layer a few E2E tests on top of that (maybe via Cloud Build?), but those are arguably less important than barebones integration tests.

Charlie:

I agree for integration tests for the application. Libraries should have unit tests unless the amount of mocking involved would basically obscure potential failures.

Dina:

Seems to me like we should integration tests for PR's and nightly e2e tests. We'll want to maintain some confidence that the whole system works together on GCP as expected.

Next steps

This has been documented (internally) at go/emblem-website-testing-plan

engelke commented 2 years ago

I agree for integration tests for the application. Libraries, though, should have unit tests unless the amount of mocking involved would basically obscure potential failures.

dinagraves commented 2 years ago

image

Seems to me like we should integration tests for PR's and nightly e2e tests. We'll want to maintain some confidence that the whole system works together on GCP as expected.

ace-n commented 2 years ago

Current consensus

grayside commented 2 years ago

On Testing Types:

I prefer to see our testing be fast enough that anything that is clearly pass/fail can be run on every PR. Nightly or weekly for one of these scenarios:

Regarding GitHub Actions, this assumes we have a standing environment. Our test strategy needs to include both a standing environment and a from-scratch environment.

In addition to a decision record, testing expectations should be documented in https://github.com/GoogleCloudPlatform/emblem/blob/main/CONTRIBUTING.md and information for maintainers on using test results should go in https://github.com/GoogleCloudPlatform/emblem/blob/main/MAINTAINERS.md

There was some testing discussion in the original design doc, any decisions we make supercede that but it's possible there's something useful there.

ace-n commented 2 years ago

Unit tests prevent logical regressions ASAP. A good candidate for this would be any tricky logic, especially where it can cascade and break other tests in mysterious ways

I'm not aware of any such logic in the website (yet). If memory serves, much of it is "straightforward CRUD".

We should mostly write integration tests. Our bias in this app is that we're confirming all the Cloud services are working as expected.

Yep, this is the consensus.

We need a core set of end-to-end tests that can double as deployment smoke tests. (Prevent deployment/trigger canary revert, notify & suggest rollback if after canary).

Ditto - @dinagraves suggested this, and I think it's worth adding.

I prefer to see our testing be fast enough that anything that is clearly pass/fail can be run on every PR.

I agree with this for GitHub Actions-based integration tests. I don't know how I feel about this for E2E tests with Cloud Run itself.

Regarding GitHub Actions, this assumes we have a standing environment. Our test strategy needs to include both a standing environment and a from-scratch environment.

If we're running E2E tests in GitHub Actions, yes - but personally I think those should be relegated to nightly status (to reduce flakiness that might block a PR).

Then (or otherwise - if we decide to support E2E tests on GitHub), we can do a from-scratch environment with all of our nightly tests.

grayside commented 2 years ago
  1. E2E tests must be part of the rollout, so we'll be running them from Cloud Build at least on canary builds.
  2. If our E2E tests are flaky on anything except a rare circumstance, then our tests have bugs that need to be fixed. We can add retries and other behaviors within the test. Let's aim for best quality practices. We often have good PRs that lead to unstable deployments. That has been corrosive to our velocity.
ace-n commented 1 year ago

Added a link to go/emblem-website-testing-plan, where this info is documented.