n3wscott closed this issue 5 years ago
/kind proposal
It would be great if we could have a common framework for all the testing (e2e, conformance, perf, stress), but to me there are some other things we need to consider:
The bash script is used by Prow to set up the test environment. In the script, we use ko apply -f config/... to install all the dependencies. This is also the standard way for a user to install a new provisioner or eventing-source, as mentioned in docs like the README for GCP PubSub Channels. Having these steps in our bash script is important since it guarantees the installation step is not broken.
And of course there are some steps shared by all testing types, for example:
So, would it be possible or better to decouple the testing framework into different components, so that we can use whatever we need based on the actual demand? For example, we could have a TopologySetupController that reconciles and creates all the channels, brokers, and services we provide. And we could also have a ValidationService that receives both actual and expected results and does the validation for us.
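As a rough illustration of the ValidationService idea, here is a minimal sketch of what such a service could look like. The Result payload, the /expected, /actual, and /validate endpoints, and the in-memory matching are all assumptions for illustration, not part of the proposal.

```go
// validationservice is a minimal sketch of the ValidationService idea:
// tests POST expected and actual results, and the service reports whether
// they match. All names and endpoints here are hypothetical.
package main

import (
	"encoding/json"
	"net/http"
	"reflect"
	"sync"
)

// Result is a hypothetical payload a test or topology could report.
type Result struct {
	TestID string            `json:"testID"`
	Events map[string]string `json:"events"` // event ID -> payload
}

type validator struct {
	mu       sync.Mutex
	expected map[string]Result
	actual   map[string]Result
}

// record stores a reported Result in either the expected or actual set.
func (v *validator) record(store map[string]Result, w http.ResponseWriter, r *http.Request) {
	var res Result
	if err := json.NewDecoder(r.Body).Decode(&res); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	v.mu.Lock()
	store[res.TestID] = res
	v.mu.Unlock()
	w.WriteHeader(http.StatusAccepted)
}

// validate compares the expected and actual results for one test ID.
func (v *validator) validate(w http.ResponseWriter, r *http.Request) {
	id := r.URL.Query().Get("testID")
	v.mu.Lock()
	defer v.mu.Unlock()
	if reflect.DeepEqual(v.expected[id], v.actual[id]) {
		w.Write([]byte("PASS"))
		return
	}
	w.Write([]byte("FAIL"))
}

func main() {
	v := &validator{expected: map[string]Result{}, actual: map[string]Result{}}
	http.HandleFunc("/expected", func(w http.ResponseWriter, r *http.Request) { v.record(v.expected, w, r) })
	http.HandleFunc("/actual", func(w http.ResponseWriter, r *http.Request) { v.record(v.actual, w, r) })
	http.HandleFunc("/validate", v.validate)
	http.ListenAndServe(":8080", nil)
}
```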
And it seems this proposal is also related to the Serving perf testing that @srinivashegde86 is working on. Please comment below if you have any thoughts on this.
A few questions can be resolved orthogonally to the test infra discussion.
/milestone v0.7.0
related to (pre-req) #939
/close
@akashrv: Closing this issue.
Objective
Knative Eventing needs a reproducible way to perform performance testing. We would run these tests after releases and on nightly builds to detect performance changes. The set of tests should be something anyone with a Kubernetes cluster can run. Ideally, the solution could be used by any Knative project.
Non-goal: perf testing in minikube.
Non-goal: testing Eventing Sources (though this should not block doing so in the future).
Background
At the moment we have e2e tests that are fragile to run and whose results are hard to understand. These e2e tests are produced by a combination of bash cluster setup and a Go test script that is in charge of resource creation, waiting, test running, and clean-up. The debug method assumes the consumer of the e2e test event will output the expected result to the pod logs, and we do a simple grep over those logs. Debugging these tests is challenging.
Serving currently has some performance tests that are script-based: load is produced by a load generator and the traffic is directed at another cluster. This works well for Serving because its usage model is straightforward when compared to Eventing. Serving will most likely be invoked from an external entity, so measuring perf through the various ingress methods works. Eventing, on the other hand, only really cares about cluster-local traffic: most traffic that goes onto the eventing mesh will be bridged via a source app and then forwarded as an event to a Channel, Broker, or Service.
Eventing presents challenges in performance test setup because of the topology required for each test. For performance tests, perhaps there is a more developer-friendly way to write and run these tests, and a more test-friendly way to retain test result history.
Requirements and Scale
Design Ideas
In looking at this problem space, there is a tool that is really good at creating replicable results in a cluster: Kubernetes. We should create a custom Kubernetes controller designed to run perf tests, plus a custom resource definition that holds the generic parameters we know we need. In this way we can just post the list of tests to a cluster, and the controller will do what is needed for setup, the test, and teardown.
The perf CR will run as a job. The controller can observe the running jobs and allow only one to be reconciled and run at a time.
The CRD spec will contain the following:
These two images are separated to allow the test to be decoupled from the setup. For example: a loopback test only needs to care that the events it sends get delivered back to the running container. The same test could be used for various topologies: a single channel, many channels, broker, broker + channel, etc.
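The exact spec fields are not spelled out above, but a minimal sketch of the Go API types for such a CRD might look like the following. The type name PerfTest and the field names (SetupImage, TestImage, Timeout, Complete, Results) are assumptions for illustration, not the proposal's actual field list.

```go
// Package v1alpha1 sketches hypothetical API types for a perf-test CRD.
// The names below (PerfTest, SetupImage, TestImage, ...) are illustrative
// assumptions, not the actual proposal's field list.
package v1alpha1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// PerfTest describes a single performance test run by the controller.
type PerfTest struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   PerfTestSpec   `json:"spec"`
	Status PerfTestStatus `json:"status,omitempty"`
}

// PerfTestSpec separates the topology setup from the test itself,
// so one test image can be reused across many topologies.
type PerfTestSpec struct {
	// SetupImage creates the topology under test (channels, brokers, ...).
	SetupImage string `json:"setupImage"`
	// TestImage generates load and measures results against the topology.
	TestImage string `json:"testImage"`
	// Timeout bounds how long the job may run before it is failed.
	Timeout *metav1.Duration `json:"timeout,omitempty"`
}

// PerfTestStatus is written by the controller once the job finishes.
type PerfTestStatus struct {
	// Complete is true once the job has finished (pass or fail).
	Complete bool `json:"complete"`
	// Results carries a short, machine-readable summary of the run;
	// full artifacts would be uploaded to external storage.
	Results string `json:"results,omitempty"`
}
```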
Once a Perf Job has run, the status will be marked as complete, with the result of the test written to the status. Testing artifacts will be uploaded to the various buckets and dashboards we require. The test could be written to save its results to a file, and we could pair it with a pod that uploads those results.
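For the "save results to a file" idea, a tiny sketch follows, assuming a shared volume mounted at /results and a hypothetical JSON result format; the path, field names, and placeholder numbers are not part of the proposal.

```go
// writeresults is a sketch of a test container persisting its results to a
// shared volume so a sidecar or uploader pod can ship them to a bucket.
// The /results path, field names, and values are assumptions for illustration.
package main

import (
	"encoding/json"
	"log"
	"os"
	"time"
)

type perfResult struct {
	Test       string        `json:"test"`
	Sent       int           `json:"sent"`
	Received   int           `json:"received"`
	P99Latency time.Duration `json:"p99Latency"`
}

func main() {
	// Placeholder numbers; a real test would fill these in from its run.
	res := perfResult{
		Test:       "loopback-single-channel",
		Sent:       10000,
		Received:   10000,
		P99Latency: 42 * time.Millisecond,
	}

	f, err := os.Create("/results/loopback-single-channel.json")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	if err := json.NewEncoder(f).Encode(res); err != nil {
		log.Fatal(err)
	}
}
```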
To allow Serving or Build to also use this tool, an option could be to have two clusters: one with this controller installed, and another being targeted for the test.
Alternatives Considered
We could continue writing bash+golang scripts. The Go script could be pushed into a single pod and run, then watched from outside the cluster. We could continue to grep the logs for known keys; one key could be perf results. I lean away from this because writing controllers is the tool I use in Knative, and it is interesting to me to be able to use that same tool to write tests.
Write pure Go tests which interact with the cluster.
Use a tool like Sonobuoy to run the tests.
I researched load generators for eventing and was not able to find anything that would do what we want.
Related Work
knperf: I wrote a very quick POC that uses a controller to create a job and then watch the job until it is done. It does not have the two-image idea. It does not upload results. It does not parse the logs.
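The knperf flow described above (create a job, then watch it until it finishes) boils down to something like the sketch below. The helper names buildPerfJob and jobFinished are mine, and the controller wiring around them is omitted.

```go
// perfjob sketches the create-a-Job-and-watch-it flow used by the knperf POC.
// buildPerfJob and jobFinished are hypothetical helpers a reconcile loop could
// call; the surrounding controller wiring is omitted.
package perfjob

import (
	batchv1 "k8s.io/api/batch/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// buildPerfJob builds the Job that runs a single perf test image.
func buildPerfJob(name, namespace, testImage string) *batchv1.Job {
	return &batchv1.Job{
		ObjectMeta: metav1.ObjectMeta{Name: name, Namespace: namespace},
		Spec: batchv1.JobSpec{
			Template: corev1.PodTemplateSpec{
				Spec: corev1.PodSpec{
					RestartPolicy: corev1.RestartPolicyNever,
					Containers: []corev1.Container{{
						Name:  "perf-test",
						Image: testImage,
					}},
				},
			},
		},
	}
}

// jobFinished reports whether the Job has completed or failed, which is when
// the controller would record the result and clean up.
func jobFinished(job *batchv1.Job) (finished bool, succeeded bool) {
	for _, c := range job.Status.Conditions {
		if (c.Type == batchv1.JobComplete || c.Type == batchv1.JobFailed) && c.Status == corev1.ConditionTrue {
			return true, c.Type == batchv1.JobComplete
		}
	}
	return false, false
}
```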
https://github.com/GoogleCloudPlatform/distributed-load-testing-using-kubernetes (https://medium.com/google-cloud/google-kubernetes-engine-load-testing-and-auto-scaling-with-locust-ceefc088c5b3)
https://github.com/heptio/sonobuoy - looks like you can make your own plugin. But this is another form of re-inventing CRDs.