Open JoeTice opened 2 years ago
PR - https://github.com/department-of-veterans-affairs/vets-website/pull/22109
We created a new BigQuery table called vets_website_e2e_allow_list and populated it with all Cypress .spec files.
We created a new vets-website GitHub workflow called .github/workflows/e2e-stress-test.yml that currently:
allow=FALSE
If we pass an optional env variable, SHOULD_STRESS_TEST=TRUE, into run-cypress-tests.js, it runs the tests n times. We updated the timeouts in GitHub Actions to 20 hours and set the job to run the Cypress tests 40 times. We're testing this here - https://github.com/department-of-veterans-affairs/vets-website/actions/runs/3002697203
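For anyone following along, here is a rough sketch of how an optional SHOULD_STRESS_TEST flag could drive the number of runs. It is not the actual code in run-cypress-tests.js: `getRunCount`, `runSuite`, the `runOnce` callback, and the default of 40 are hypothetical names and values chosen for illustration.

```javascript
// Sketch only: getRunCount, runSuite and runOnce are hypothetical,
// not the real contents of run-cypress-tests.js.

// Decide how many times to run the suite. Anything other than the
// string "TRUE" (case-insensitive) falls back to a single run.
function getRunCount(env, stressTestRuns = 40) {
  const flag = String(env.SHOULD_STRESS_TEST || '').toUpperCase();
  return flag === 'TRUE' ? stressTestRuns : 1;
}

// Run the suite a bounded number of times. The loop bound is computed
// once, up front, so the loop always terminates.
function runSuite(runOnce, env = process.env, stressTestRuns = 40) {
  const runs = getRunCount(env, stressTestRuns);
  const results = [];
  for (let i = 1; i <= runs; i += 1) {
    results.push(runOnce(i)); // runOnce stands in for one Cypress run
  }
  return results;
}
```

Computing the run count once, before entering the loop, keeps the loop condition a simple integer comparison.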
Todo:
#vfs-all-teams
Yesterday, we discovered a bug in the work we had done the day before. The ternary operator we used to determine the number of runs for the loop was not working. We didn't catch this earlier because none of the runs finished due to timeouts; in reality, the loop was running infinitely. We found a workaround and have corrected the issue.
Also yesterday, we started on the functionality for determining which tests passed and which failed, in preparation for updating their enabled/disabled status in BigQuery. We created a new script in the dashboard data repo and, using the artifacts generated from the vets-website run (whether CI or the new workflow), we were able to produce one combined results JSON file. We then worked on a solution to parse these results, but spent a good bit of time stuck because 12 of the 304 tests were not accounted for in our parsed results. Later, Holden figured out the hiccup and we ended up with a much more reliable solution.
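To illustrate the parsing step, here is a minimal sketch of how combined results could be reduced to a per-spec allow flag while also reporting specs that never appear in the results (the kind of gap we hit with the 12 missing tests). The input shape (`spec`, `tests`, `title`, `state`) and the `summarizeSpecs` name are assumptions for illustration, not the actual artifact format or the script in the dashboard data repo.

```javascript
// Sketch only: the result shape below is assumed, not the real artifact format.
// runs: array of runs; each run is an array of
//   { spec: string, tests: [{ title: string, state: 'passed' | 'failed' }] }
// expectedSpecs: every spec path we expect to see in the results.
function summarizeSpecs(runs, expectedSpecs) {
  const summary = new Map();

  for (const run of runs) {
    for (const { spec, tests } of run) {
      const entry = summary.get(spec) || { allow: true, failedTitles: [] };
      for (const test of tests) {
        if (test.state === 'failed') {
          // A single failure in any run marks the spec as disallowed.
          entry.allow = false;
          if (!entry.failedTitles.includes(test.title)) {
            entry.failedTitles.push(test.title);
          }
        }
      }
      summary.set(spec, entry);
    }
  }

  // Specs we expected but never saw; surfacing them makes gaps visible.
  const missing = expectedSpecs.filter(spec => !summary.has(spec));
  return { summary, missing };
}
```

Returning `missing` alongside the summary means a mismatch between the expected spec count and the parsed count shows up explicitly rather than silently.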
Our to-do list looks the same as the day prior's update, as the items we ended up addressing yesterday were unforeseen.
Curt and I have working, sans notifications.
On Monday we plan on tackling some/all of the remaining items:
Then there's the idea of listing flaky tests on a page in the platform docs; list the spec name, the titles of the tests that failed in the spec, and how long it has been disabled by the Allow List.
We'd like to add a field to the product directory regarding flaky tests. Talk to Joe and Peter about it.
One outstanding question: what should we do if the long test (the test that takes significantly longer than the rest) needs to be stress-tested in CI? It could take a very long time to run. One option: if that test is detected, we run it in multiple Cypress instances. Currently, we run tests that need to be stress-tested in 1 Cypress instance.
TBD --
va.gov-team
I created a support request with operations re the VA_VSP_BOT_GITHUB_TOKEN token - https://dsva.slack.com/archives/CBU0KDSB1/p1663599892158159
va.gov-team issues are created now
We'll do manual testing on this feature tomorrow.
Notes from Peter:
Peter Hill 11:25 AM
Made test case stubs for the following scenarios:
- VFS Team Creates an E2E Test Spec for a New Product
- VFS Team Adds an E2E Test Spec to an Existing Product
- VFS Team Adds a Test to an Existing Test Spec
- VFS Team Removes a Test from an Existing Test Spec
- VFS Team Removes a Test Spec for an Existing Product
- VFS Team Changes a Test in an Existing Spec
I've rerun the E2E Stress Test workflow multiple times (looping tests 2x) and haven't seen that BigQuery error again - https://github.com/department-of-veterans-affairs/vets-website/actions/runs/3091114869
Here's the PR that 'fixed' it - https://github.com/department-of-veterans-affairs/qa-standards-dashboard-data/pull/177
The reason for the failures we were seeing: Passing an async function as a callback to #forEach in JavaScript doesn't work as expected.
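For context, here is a small sketch of that pitfall. `forEach` ignores the promises returned by an async callback, so the surrounding function can return before any of the work has finished. The `saveRow` helper is a hypothetical stand-in for an async write (for example, a BigQuery update); it is not the code from the PR above, and the two alternatives shown are common patterns, not necessarily the fix that PR used.

```javascript
// Hypothetical stand-in for an async write; pushes into `sink` after a delay.
function saveRow(row, sink) {
  return new Promise(resolve => {
    setTimeout(() => {
      sink.push(row);
      resolve();
    }, 5);
  });
}

// Broken: forEach discards the promises, so this resolves before any write lands.
async function saveAllWithForEach(rows, sink) {
  rows.forEach(async row => {
    await saveRow(row, sink);
  });
}

// Option 1: for...of awaits each write in order.
async function saveAllSequential(rows, sink) {
  for (const row of rows) {
    await saveRow(row, sink);
  }
}

// Option 2: start all writes, then wait for every one of them.
async function saveAllParallel(rows, sink) {
  await Promise.all(rows.map(row => saveRow(row, sink)));
}
```

The sequential version is the safer default when the writes hit a rate-limited service; the parallel version is faster when the order of writes does not matter.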
List of manual testing cases - https://docs.google.com/spreadsheets/d/1unNnzRcbY1AkMAZLuiO46KeiIJdBngGO-Vdcb9z0ZMU/edit#gid=0
Just pushed a commit to log everything for manual testing - https://github.com/department-of-veterans-affairs/vets-website/pull/22109/commits/fdbaa48dbdc0781fe1dc3ee00800979842c91959
Run - https://github.com/department-of-veterans-affairs/vets-website/actions/runs/3093299194
Manual testing is complete. Phew!
Issue Description
Given the information researched in Explore a GHA workflow to vet new/changed Cypress tests for flakiness, create a proof of concept of the proposed solution.
Tasks
[x] Use the proposed solution in this ticket to create a proof of concept
[x] Test the proof of concept for desired behaviors
[x] Report results of POC
[x] Discuss the results and POC with the team
[x] Create script that will run our test suite X times and accept an optional param of a single spec to run.
[x] Create a BQ table to store our spec list with disallow flags
[x] Create a workflow to run the full suite using this script on a schedule, adjust workflow timeout to allow for it to run as necessary
[x] Update BQ table every time this script runs with the appropriate disallow flags
[x] Alter test selection to check against the disallow list
[x] Create differ to check PRs for changes to banned specs, trigger re-check on that spec only using existing script
[x] Create mechanism to notify teams that a test has been quarantined and needs fixing.
[x] Perform manual acceptance testing
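As a rough illustration of the "check against the disallow list" and "differ" tasks above, the sketch below filters a spec list against allow-list rows and picks out changed files that are currently disallowed, so only those would be re-checked. The row shape (`spec_path`, `allow`) and the function names are assumptions for illustration; the real vets_website_e2e_allow_list columns and the real selection code may differ.

```javascript
// Sketch only: the row shape { spec_path, allow } is assumed, not the real schema.
function disallowedSet(allowList) {
  return new Set(
    allowList.filter(row => row.allow === false).map(row => row.spec_path),
  );
}

// Test selection: drop specs that are currently disallowed.
// Specs missing from the allow list (e.g. brand-new specs) are kept here;
// that default is an assumption of this sketch.
function selectSpecs(allSpecs, allowList) {
  const blocked = disallowedSet(allowList);
  return allSpecs.filter(spec => !blocked.has(spec));
}

// Differ: of the files changed in a PR, return the disallowed specs,
// i.e. the ones that would need a fresh stress-test run.
function specsToRecheck(changedFiles, allowList) {
  const blocked = disallowedSet(allowList);
  return changedFiles.filter(file => blocked.has(file));
}
```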
Acceptance Criteria