ppitonak commented 6 years ago

Status quo

Every commit to this repository starts this Jenkins job which builds Docker image and publishes it to the Docker registry
after successful build, multiple Jenkins jobs run (e.g. smoke test, login, on two clusters), see downstream jobs
additionally, multiple Jenkins jobs testing various OSIO components run sequentially, last one resets the environment
there is similar setup for both prod and prod-preview
some parts (e.g. login or creating a new project) are tested multiple times
there are multiple boosters/quickstarts (4 at the moment), which can be configured to accomplish one of pre-defined missions (5 at the moment) and the generated project can be configured to run with one of three types of pipeline, there will be probably more versions of every quickstart (community/productized bits, major versions of Vert.x/Node.js etc.)
1. any of these parameters might (and probably will) change
2. we need to test all boosters in prod, ideally with every component (launcher, build/pipeline, analytics, che) which is not possible to do with e2e tests
service delivery team monitors the result of e2e test
1. they now only see pass/fail
2. they would like to break the results down into smaller parts so that we can identify problems with greater granularity (e.g. find out that login is slower but the rest of platform is ok)
3. we would like to have confidence that all parts of OSIO are behaving as expected

Proposed solution

leave Jenkins job building master as is
implement virtual pipeline for one booster which would test everything - create space, create project, analytics, verify that it runs in che, change code, promote build etc.
1. clean environment after run
2. run immediately after cleanup is done
3. create similar "pipelines" for other cluster(s), for prod-preview, for completely different flows (e.g. private repos, collaboration in team)
4. DON'T create similar "pipelines" for other quickstarts
aggregate the results of "pipeline" and send them to Zabbix together so that all data end up in one chart (e.g. time to login, time to create project, time to finished build, time to start che workspace, total time of whole workflow)

Advantages of proposed solution

login/create project would be tested only once
it would simulate user's workflow more closely
zabbix chart would contain useful data
successful run of the "pipeline" would mean that all components of OSIO are in good shape
the "pipeline" would run faster and as often as possible
we need fewer test accounts
quickstarts are properly tested according to test pyramid, not by e2e tests
probably all test logic can be re-used, it's a like a huge refactoring

ldimaggi commented 6 years ago

I think that the way to define and present this task is to first define our end goal. In other words, first define where we want to be, and the define the steps that we will take to reach that goal.

Our goal ought to be:

A single test that verifies the operation of critical OSIO components and their integrations following a non-destructive "happy path". The operations verified by this test must include creating/executing build pipelines, and creating/usein Che workspaces.
The test must run continuously and serve as a "health check" for prod-preview and production.
The test must run on all production clusters and verify consistent results across all clusters.
The test must run in less than 5 minutes so that it can be used to verify pull requests.
The test must be run for all pull requests before they are merged into production in order to ensure that critical OSIO components and their integrations are not adversely impacted by any changes made by the pull requests.
The test must provide a clean test environment for itself and must "clean up" before it terminates.
The test must provide fine-grained information (inc. Jenkins build logs) on any failures that it experiences.
The test must provide fine-grained throughput reporting on the actions the test performs (login, space creation, etc.)
The test must report automatically to Zabbix.
The test's implementation must be modular, require minimal updates in response to OSIO UI changes, well documented, and easy to extend.
The test must be robust and be able to run 100% of the time unless OSIO product bugs are encountered.

ppitonak commented 6 years ago

The test must run in less than 5 minutes so that it can be used to verify pull requests.

The test must be run for all pull requests before they are merged into production in order to ensure

The test must be run for all pull requests before they are merged into production in order to ensure that critical OSIO components and their integrations are not adversely impacted by any changes made by the pull requests.

IMHO this is extremely difficult/impossible to achieve and shouldn't be our goal at this point

The test must provide a clean test environment for itself and must "clean up" before it terminates.

Clean environment is (?) a prerequisite for tests, not it's goal.

The test must provide fine-grained information (inc. Jenkins build logs) on any failures that it experiences.

This statement looks like an ideal goal of every test in the world and is not specific to this refactoring.

ldimaggi commented 6 years ago

The test must be run for all pull requests before they are merged into production in order to ensure that critical OSIO components and their integrations are not adversely impacted by any changes made by the pull requests. IMHO this is extremely difficult/impossible to achieve and shouldn't be our goal at this point

This has been a goal since August 2017 - it would be great if we could finally make progress on this.

ldimaggi commented 6 years ago

The test must provide a clean test environment for itself and must "clean up" before it terminates. Clean environment is (?) a prerequisite for tests, not it's goal.

Agreed! Let's drop this item.

ldimaggi commented 6 years ago

The test must provide fine-grained information (inc. Jenkins build logs) on any failures that it experiences. This statement looks like an ideal goal of every test in the world and is not specific to this refactoring.

This item is actually a deliverable that multiple people have requested. Solving https://github.com/openshiftio/openshift.io/issues/1790 would help this a great deal!

ppitonak commented 6 years ago

The test must be run for all pull requests before they are merged into production in order to ensure that critical OSIO components and their integrations are not adversely impacted by any changes made by the pull requests.

IMHO this is extremely difficult/impossible to achieve and shouldn't be our goal at this point

This has been a goal since August 2017 - it would be great if we could finally make progress on this.

I disagree with this. E2E tests by their nature will never be fast and complex enough at the same time. If we want to run them for each pull request, we would slow down the progress on PR for couple of hours. Do we want to trade the speed of development for more stable production environment? IMHO this is a question for broader team.

The test must provide fine-grained information (inc. Jenkins build logs) on any failures that it experiences.

This statement looks like an ideal goal of every test in the world and is not specific to this refactoring.

This item is actually a deliverable that multiple people have requested. Solving openshiftio/openshift.io#1790 would help this a great deal!

I'm not saying that it is a bad goal or that it isn't a goal at all. I'm just saying that high-quality test report is an implicit expectation of every test so it doesn't need to be stated explicitly.

RickJWagner commented 6 years ago

As part of the re-architecture effort, please make an allowance to gather metrics at various points in the process. These metrics should be stored (for both passing and failing tests) in a location where they can be examined. A reasonable amount of historical data should be maintained. The purpose of this is to allow us to judge current performance of OSIO, especially compared to past days/weeks. This will be useful when we introduce changes to the environment and when we get support tickets complaining about performance. Thank you for considering this idea.

ppitonak commented 6 years ago

What we have today, a smoketest suite that

logs in
creates a space
creates a new project from booster
opens Che (very basic, no interaction with project itself)
interacts with pipeline (build, stage, promote, run)
verifies deployment (pods are running, it consumes some memory/CPU etc.)
verifies dashboard (analytics report etc.)

What is still missing:

[x] result still reported to the Zabbix as a binary value (#474 and #609)
[ ] some interaction with Che #736
[x] logout (open for discussion)
[ ] verification of the deployed application (in both stage and run environment, both initial state and after change in GitHub repo) #588
[x] resetting the environment after test (depends on #581)
[ ] testing the new/upcoming components (see #586)
[x] renaming quickstart_deployments.spec.ts to something better #543

ldimaggi commented 6 years ago

I think that the minimum that we can do with Che is to open a workspace and verify that the project has been populated into the workspace. Thx!

ppitonak commented 6 years ago

We talkedl with Che QE team today and we agreed that we would do a proof of concept for using their test suite in e2e Jenkins job. Jenkins job would look like this:

Protractor - log in, create space, create project, create Che workspace, interact with pipeline, verify deployment, verify dashboard
Maven - build the app, start the app, edit source file, commit & push
Protractor - interact with pipeline and deployments, verify dashboard

pros

we don't need to write the same test again
we don't need to create page objects in different language

cons

it's a new approach and we don't know what roadblocks we should expect

@ldimaggi @pmacik what part of this is already covered by your tests?

rhopp commented 6 years ago

Issue created on che-functional-tests side: https://github.com/redhat-developer/che-functional-tests/issues/211

ldimaggi commented 6 years ago

Question - we will be running the Che (Java) tests as the OSIO E2E tests? Is this the proposed plan?

ppitonak commented 6 years ago

@ldimaggi yes, that's the plan for PoC

ppitonak commented 6 years ago

Most of the work is done, additional extension of smoke test will be tracked in linked issues.

fabric8io / fabric8-test

Re-architecture end-to-end tests #494

Status quo

Proposed solution

Advantages of proposed solution