xref #1155
Relatively fat helper library infrastructure driven by Go tests.
Ad-hoc testing. That is, no structured way to write tests. Tests are driven from the Kubernetes API.
No support for capturing debug information.
Tests are initiated from the shell script ./test/e2e/run.sh, which deploys to kind and starts testing:
Test runner is build/run-e2e-suite.sh. This runs the "e2e" task from the "nginx-ingress-controller:e2e" container image with "kubectl run".
Uses kustomize to configure deployment types.
Ultimately, all this scaffolding runs a stand-alone Go binary that contains a Ginkgo test suite (built with "ginkgo build").
Terrible operator UX, inherited from Ginkgo. No way to list tests or explore what the test suite will do. Running in a bad dev environment just pukes unreadable failure messages everywhere.
Need to enable Docker experimental features for the "buildx" subcommand.
Unable to get the test environment tooling to work on macOS. It's not clear why the e2e run script deploys to kind while the "dev-env" make target uses minikube.
Tests are written in Ginkgo, with a library of local helpers. The framework.Framework struct contains common APIs, helpers and test expectations. It also gathers diagnostics on test failure.
Test framework hides boilerplate deployments for the echo server, GRPC server and other useful paraphernalia.
Test checks include scraping config from inside the NGINX pods, which seems pretty dubious, but is perhaps required for the pass-through config approach.
Ginkgo lets tests be relatively cleanly formed and have a predictable structure.
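For reference, the structure is roughly the following. This is a generic Ginkgo/Gomega sketch, not ingress-nginx's actual helpers: the proxy URL is assumed, and the fixture plumbing that framework.Framework provides is elided.

```go
package e2e_test

import (
	"net/http"
	"testing"

	. "github.com/onsi/ginkgo"
	. "github.com/onsi/gomega"
)

// Entry point: hand the suite to the Go test runner.
func TestE2E(t *testing.T) {
	RegisterFailHandler(Fail)
	RunSpecs(t, "ingress e2e suite")
}

var _ = Describe("Ingress with a rewrite annotation", func() {
	// Assumed address of the ingress controller under test.
	const proxyURL = "http://127.0.0.1:8080"

	BeforeEach(func() {
		// In the real suites, a framework helper deploys an echo backend and
		// the Ingress fixture here; that plumbing is elided in this sketch.
	})

	It("serves the rewritten path", func() {
		resp, err := http.Get(proxyURL + "/app/healthz")
		Expect(err).NotTo(HaveOccurred())
		defer resp.Body.Close()
		Expect(resp.StatusCode).To(Equal(http.StatusOK))
	})
})
```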
Tests for ingress, knative and Gloo gateway scenarios in test/kube2e.
Very small number of tests, consisting of some shell scripts wrapped around ginkgo test code.
No significant diagnostics.
https://github.com/3scale/kourier
https://github.com/datawire/ambassador
Uses KAT (Kubernetes Acceptance Test) framework. Written in Python and driven by py.test.
https://github.com/datawire/ambassador/blob/master/docs/kat/tutorial.rs, https://docs.pytest.org/en/latest/
Test cases consist of the following components:
Initialisation: Tests are Python subclasses so they can set themselves up in the constructor.
Manifests: A chunk of YAML to apply to the cluster. The test library has a set of common default YAML constants for tests to use. Manifests are python string templates that are expanded at application time by the test.
Config: The config method also emits a chunk of YAML, but its purpose is to configure a deployed Ambassador.
Requirements: A list of Kubernetes resources that need to be ready before the actual test can start.
Queries: This method returns a list of HTTP Query objects. The harness performs the specified HTTP requests and tracks the results, which are available to the check. Since queries are objects, they can specify arbitrary parameters (TLS, SNI, URL, timeout, etc.). Queries can be grouped into phases, with a harness delay between them.
Checks: The check method is used to run arbitrary unstructured tests against the test state. Typically using the Python "assert" keyword. Checks run after queries.
In most cases, the test methods return generators. Maybe this is just conventionally Pythonic, but it is an interesting approach to keeping the test driver simple while giving the test flexibility.
Tests can be parameterized and composed (by aggregation). So, given tests named A and B, it is possible to compose a new test, C, consisting of A(1), A(2), B(3).
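For illustration only, here is the shape of a KAT test case transliterated into Go. KAT itself is Python and driven by pytest; every type and method name below is invented just to show the components listed above.

```go
package kat

import "fmt"

// Query describes one HTTP request the harness should perform.
type Query struct {
	URL            string
	ExpectedStatus int
}

// Result is what the harness recorded for a Query.
type Result struct {
	Query  Query
	Status int
}

// TestCase mirrors the KAT components: manifests, config, queries and a
// final check over the collected results.
type TestCase interface {
	Manifests() string            // YAML applied to the cluster
	Config() string               // YAML used to configure the deployed Ambassador
	Queries() []Query             // requests the harness performs
	Check(results []Result) error // assertions over the recorded results
}

// PlainHTTP is a sample test case.
type PlainHTTP struct{ Host string }

func (t PlainHTTP) Manifests() string { return "" /* backend Deployment/Service YAML */ }
func (t PlainHTTP) Config() string    { return "" /* Mapping for t.Host */ }

func (t PlainHTTP) Queries() []Query {
	return []Query{{URL: "http://" + t.Host + "/echo/", ExpectedStatus: 200}}
}

func (t PlainHTTP) Check(results []Result) error {
	for _, r := range results {
		if r.Status != r.Query.ExpectedStatus {
			return fmt.Errorf("%s: got %d, want %d", r.Query.URL, r.Status, r.Query.ExpectedStatus)
		}
	}
	return nil
}
```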
Uses httpbin as a backend Service. There are additional kat-client and kat-server commands, which are packaged into container images but are not used in tests AFAICT. Interesting that kat-server serves both HTTP and GRPC.
Developer instructions are not especially clear and the user experience is weak. Without Unix build system and Python experience, it will be very hard to run the tests.
$ make pytest DEV_KUBECONFIG=/Users/jpeach/.kube/config DEV_REGISTRY=docker.io/jpeach
https://gist.github.com/jpeach/fd53248a9b76cbf54fcac7b655975542
Poor user experience for debugging test failures. You get a Python RuntimeError exception on kubectl exiting with non-zero status and get to pick up the pieces.
Since pytest is the test driver, the project assumes a lot of pytest knowledge from its users, which is a hurdle.
Most relevant example for testing core Kubernetes API extension points.
https://kubernetes-csi.github.io/docs/functional-testing.html
Probably better to think of this as running "checks" rather than "tests", but I'll forget and use the terms interchangeably.
Tests are grouped and uniquely numbered (in the spec). Seems pretty helpful to have a unique ID for tests. Could be used to link testable statements from the docs.
kube-bench has to run on the host it is checking (i.e. on a master or node host). It doesn't embed the check config, which needs to be distributed along with the binary.
Skip checks by editing the YAML definition. Seems oriented towards people forking the repo and committing site changes.
The check config YAML is unmarshalled to an internal controls.Controls type, which is an uncomfortable agglomeration of data format and API.
Actual checks are defined in YAML. The check itself is a shell command that is specified in the "audit" parameter. The output of this shell command is fed into the subsequent tests, which are a series of string matchers defined in YAML. It is actually surprisingly clunky, though the YAML is relatively readable.
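The essence of one of these checks, sketched in Go for illustration: run the audit shell command, then apply string matchers to its output. The audit command and the expected flag below are hard-coded examples; kube-bench reads the real ones from its YAML config.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

func main() {
	// "audit": the shell command whose output is tested.
	audit := "ps -ef | grep kube-apiserver | grep -v grep"

	out, err := exec.Command("sh", "-c", audit).CombinedOutput()
	if err != nil {
		fmt.Println("WARN: audit command failed:", err)
	}

	// "tests": string matchers over the audit output.
	if strings.Contains(string(out), "--anonymous-auth=false") {
		fmt.Println("PASS")
	} else {
		fmt.Println("FAIL: --anonymous-auth should be set to false")
	}
}
```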
The whole policies suite is marked "manual", because there's no real capability to inspect the Kubernetes API directly. Also, some of the policies aren't testable (e.g. minimize access to foo).
There are many harnesses that run end-to-end tests against Kubernetes clusters. This note collects my thoughts about how a Kubernetes test harness should work.
Some projects write tests directly in Go code. The tests are driven by the Go test runner and rely on a suite of internal helper APIs to reduce the amount of boilerplate code. There are three primary problems with this approach:
Problem (1) reduces the audience of people who are likely to build tests. Problem (2) increases the barriers to entry further, since contributors have to learn bespoke internal APIs to make progress.
A better approach is to express the test in a declarative DSL or data format. A special-purpose tool should execute the test and deliver results. The separation of the tool from the tests allows multiple projects to develop test suites independently (obviously this assumes the tool has good compatibility standards).
It is common for open-coded tests to just run actions and perform checks with no formal separation between stages. This results in an open-ended debugging process since it is not possible to stop the test at a desired point, nor to easily add instrumentation or additional checks.
Instead, if the test is expressed as a sequence of steps, the harness can pause or stop running the test at any step. Steps can be executed at an arbitrary rate, or reordered at runtime (subject to data dependencies).
Test steps can either be actions (applying an observable change to the cluster) or checks (verifying an expected state in the cluster). Steps can easily be reported to a variety of outputs (test log, CLI, web UI) so the operator can observe status.
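A minimal sketch of what that step structure could look like. This is a design sketch under the assumptions above, not an existing API.

```go
package harness

import "context"

type StepKind int

const (
	Action StepKind = iota // applies an observable change to the cluster
	Check                  // verifies an expected state in the cluster
)

type Step struct {
	Name string
	Kind StepKind
	Run  func(ctx context.Context) error
}

// RunSteps executes steps in order, reporting each one and stopping at the
// first failure so that the cluster is left in the failing state for triage.
func RunSteps(ctx context.Context, report func(name string, err error), steps []Step) error {
	for _, s := range steps {
		err := s.Run(ctx)
		report(s.Name, err)
		if err != nil {
			return err
		}
	}
	return nil
}
```

Because each step is a discrete value, the same list of steps can be rendered to a log, a CLI or a web UI, paused between steps, or replayed from a failure point.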
It is typical to debug Kubernetes end-to-end tests by hacking the test code and supporting APIs to log additional state and progress. The Go test runner has particularly weak support for logging (usually no logs are emitted at all until the test has completed with a failure).
A test framework should be able to inspect the state of tests enough that it can capture and emit information that can help developers triage test failures. This information might include the state of Kubernetes API objects, logs from important pods, HTTP requests and responses, the outcome of specific checks, and so on. Information capture is much easier when tests are structured as sequences of steps since the steps create natural capture boundaries.
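As a sketch, a capture routine invoked at a failing step boundary might dump pod state and logs for the test's namespace with plain client-go calls (the function name is invented, and error handling is abbreviated):

```go
package harness

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// CaptureNamespace prints the phase and logs of every pod in the namespace,
// which is the kind of information needed to triage a failed check.
func CaptureNamespace(ctx context.Context, client kubernetes.Interface, namespace string) {
	pods, err := client.CoreV1().Pods(namespace).List(ctx, metav1.ListOptions{})
	if err != nil {
		fmt.Printf("failed to list pods in %q: %v\n", namespace, err)
		return
	}

	for _, pod := range pods.Items {
		fmt.Printf("pod %s phase=%s\n", pod.Name, pod.Status.Phase)

		logs, err := client.CoreV1().Pods(namespace).
			GetLogs(pod.Name, &corev1.PodLogOptions{}).DoRaw(ctx)
		if err != nil {
			fmt.Printf("  no logs: %v\n", err)
			continue
		}
		fmt.Printf("  logs:\n%s\n", logs)
	}
}
```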
There are many kinds of observability. The questions that a test harness really needs to be able to answer are "what is happening now" and "what went wrong". Usually, the harness is executing either an action or a check, and that status can be reported to the user. If a step fails, the harness needs visibility into its checks and actions so that it can generate enough information to illuminate the failure.
Some kinds of test runners have little insight into what the test is doing, e.g. observing only the exit status of a child process. That is not sufficient for our purpose here. If the harness cannot observe what a test step is doing, then it is hobbled when it needs to collect debug information. So the requirement here is that the test harness should deeply understand the actions taken at each step.
Test actions are steps that are intended to alter the state of the Kubernetes cluster. The most direct alterations can be made by using the Kubernetes API, but we can imagine actions that operate on the underlying infrastructure (e.g. kill a machine) or operate on external state (e.g. create a target for an externalName service).
Focusing on the Kubernetes API, the harness should at least be able to create, update and delete objects.
The obvious way to declaratively express creating a Kubernetes object is with a chunk of YAML. There are various approaches (see kapp, kustomize, kubectl) to applying YAML and checking for status.
Test actions can be expected to either succeed or fail. Successfully applying YAML is the normal case, but it is reasonable to expect failure so that boundary conditions can be tested. Failures may be a direct API server response (e.g. validation failure) or a subsequent failure that is externally observable.
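A sketch of a create action using the dynamic client, including asserting an expected validation failure. The GroupVersionResource is hard-coded for brevity; a real harness would discover it with a RESTMapper.

```go
package harness

import (
	"context"
	"fmt"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"sigs.k8s.io/yaml"
)

// CreateFromYAML applies a single-document manifest as a test action. When
// expectInvalid is set, the action only succeeds if the API server rejects
// the object with a validation error.
func CreateFromYAML(ctx context.Context, client dynamic.Interface, namespace string, manifest []byte, expectInvalid bool) error {
	obj := &unstructured.Unstructured{}
	if err := yaml.Unmarshal(manifest, &obj.Object); err != nil {
		return fmt.Errorf("decoding manifest: %w", err)
	}

	// Hard-coded mapping for the example; assumes the manifest is a Service.
	gvr := schema.GroupVersionResource{Version: "v1", Resource: "services"}

	_, err := client.Resource(gvr).Namespace(namespace).Create(ctx, obj, metav1.CreateOptions{})

	if expectInvalid {
		if apierrors.IsInvalid(err) {
			return nil // the failure we were testing for
		}
		return fmt.Errorf("expected a validation failure, got: %v", err)
	}
	return err
}
```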
To test the result of a Kubernetes API action, tooling needs to understand enough about the type to know the status of a Kubernetes object. This means that status detection needs to be built as a library that knows (in principle) about all the types under test. The kustomize kstatus library may be a good start, and we may be able to develop common rules for known API groups (e.g. anything Knative).
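A sketch of what that looks like with kstatus. The import path below matches the kustomize tree linked later in this issue and may need adjusting as the library moves.

```go
package harness

import (
	"fmt"

	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"sigs.k8s.io/kustomize/kstatus/status"
)

// IsCurrent reports whether the object has reached its desired state
// according to kstatus' generic and type-specific rules.
func IsCurrent(obj *unstructured.Unstructured) (bool, error) {
	res, err := status.Compute(obj)
	if err != nil {
		return false, fmt.Errorf("computing status: %w", err)
	}

	fmt.Printf("%s/%s: %s (%s)\n", obj.GetKind(), obj.GetName(), res.Status, res.Message)
	return res.Status == status.CurrentStatus, nil
}
```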
Since YAML can be verbose, the test suite could support a library of predefined objects. This, unfortunately, implies that object names need to be uniquified and then propagated to subsequent object references. This risks wading into the swamp of Kubernetes YAML-wrangling tools.
There are a number of ways to update existing objects. In many cases, a Kubernetes strategic merge patch is enough to express the object update. However, as kustomize's support for RFC 6902 JSON patches shows, strategic merges don't support all useful kinds of updates.
There's no existing YAML syntax to delete objects.
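A sketch of update and delete actions with a typed client-go clientset; the object name ("echo") and the specific patches are assumed for the example.

```go
package harness

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
)

func UpdateAndDelete(ctx context.Context, client kubernetes.Interface, namespace string) error {
	// Strategic merge patch: scale a Deployment to 3 replicas.
	smp := []byte(`{"spec":{"replicas":3}}`)
	if _, err := client.AppsV1().Deployments(namespace).
		Patch(ctx, "echo", types.StrategicMergePatchType, smp, metav1.PatchOptions{}); err != nil {
		return err
	}

	// JSON patch (RFC 6902): remove the first container port, a removal that
	// a strategic merge cannot express.
	jp := []byte(`[{"op": "remove", "path": "/spec/template/spec/containers/0/ports/0"}]`)
	if _, err := client.AppsV1().Deployments(namespace).
		Patch(ctx, "echo", types.JSONPatchType, jp, metav1.PatchOptions{}); err != nil {
		return err
	}

	// Deletion is an imperative action, not something YAML can declare.
	return client.CoreV1().Services(namespace).Delete(ctx, "echo", metav1.DeleteOptions{})
}
```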
In the most general case, checks are arbitrary tests executed against the running cluster. Since checks are arbitrary, they could be just raw Go code, but we can make them more declarative by using the Rego language. This is a declarative syntax that allows the test harness to provide built-in functions and data. There are already a number of tools that apply Rego to Kubernetes objects.
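A sketch of evaluating a Rego check against a Kubernetes object with the OPA Go API. The policy, its package name and the function name are invented for the example; the object under test is passed as the Rego "input" document.

```go
package harness

import (
	"context"
	"fmt"

	"github.com/open-policy-agent/opa/rego"
)

// The check: a Deployment is "ready" when all requested replicas are ready.
const policy = `
package checks

default ready = false

ready {
	input.status.readyReplicas == input.spec.replicas
}
`

func CheckDeploymentReady(ctx context.Context, deployment map[string]interface{}) (bool, error) {
	query, err := rego.New(
		rego.Query("data.checks.ready"),
		rego.Module("checks.rego", policy),
	).PrepareForEval(ctx)
	if err != nil {
		return false, fmt.Errorf("compiling check: %w", err)
	}

	rs, err := query.Eval(ctx, rego.EvalInput(deployment))
	if err != nil {
		return false, fmt.Errorf("evaluating check: %w", err)
	}

	return len(rs) == 1 && rs[0].Expressions[0].Value == true, nil
}
```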
For Ingress controllers, it is essential that checks can be expressed as HTTP requests. This could be implicit (as part of the Rego execution environment) or explicit (i.e. a declarative HTTP request). HTTP requests also need to be expressible as sequences so that tests such as "service F receives 20% of requests" can be implemented.
All the systems involved in a Kubernetes cluster are eventually consistent, so the checks need to be resilient to changes in timing. For example, a check that probes for a certain HTTP response may initially fail because the underlying service is not yet ready. Testing the status of a Kubernetes object will fail immediately after an action, but the check should eventually converge to success. The test harness needs to be cognizant of this and implicitly retry the checks with a time bound.
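For example, an eventually-consistent HTTP check might retry the probe on an interval until it passes or the time bound expires. This sketch uses the apimachinery wait helpers; the interval and timeout values are up to the harness.

```go
package harness

import (
	"fmt"
	"net/http"
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
)

// EventuallyHTTP retries a GET against url until it returns wantStatus or
// the timeout expires.
func EventuallyHTTP(url string, wantStatus int, timeout time.Duration) error {
	return wait.PollImmediate(time.Second, timeout, func() (bool, error) {
		resp, err := http.Get(url)
		if err != nil {
			// Not ready yet (e.g. endpoints not programmed); keep retrying.
			return false, nil
		}
		defer resp.Body.Close()

		if resp.StatusCode != wantStatus {
			fmt.Printf("GET %s: got %d, want %d; retrying\n", url, resp.StatusCode, wantStatus)
			return false, nil
		}
		return true, nil
	})
}
```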
In some cases, there may be deterministic conditions that can be tested after applying an action. In these cases, we can synchronize on the condition before applying the checks. Synchronizing on a condition could be implicit, or it could be expressed as a check itself.
In other cases, we are testing some delayed or emergent effect of an action. We need to be able to write a check that will succeed but is tolerant of some initial failure; for example, an HTTP request to service A succeeds within some timeout. These checks need to be careful of false positives, where it is possible for the check to run before the action has been processed.
To be able to write tests in a generic way, the harness needs to be able to inject various kinds of test context. For example, a unique test ID that can be used to generate HTTP requests. This mixture of static and dynamic metadata could be used in Go templating, or directly injected into runtime evaluations of Rego expressions or HTTP requests.
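A sketch of that injection using Go templating; the context field names (TestID, Namespace) are invented for the example.

```go
package harness

import (
	"bytes"
	"text/template"
)

// TestContext carries the static and dynamic metadata injected into tests.
type TestContext struct {
	TestID    string // unique per test run, usable in Host headers, labels, etc.
	Namespace string
}

// ExpandManifest renders a manifest template with the test context.
func ExpandManifest(manifest string, ctx TestContext) (string, error) {
	tmpl, err := template.New("manifest").Parse(manifest)
	if err != nil {
		return "", err
	}

	var out bytes.Buffer
	if err := tmpl.Execute(&out, ctx); err != nil {
		return "", err
	}
	return out.String(), nil
}
```

A manifest template can then reference {{ .TestID }} in Host headers or label values so that objects and requests from different runs stay distinguishable.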
The test harness should annotate any Kubernetes objects that it creates with a standard set of metadata. At minimum, we need to know that the object was created by the harness. The specific test and test run may also be useful metadata.
Any standard objects that are created as side-effects of the test harness also need to be labeled. This means that the harness should recurse into pod spec templates and inject test annotations.
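A sketch of that labeling over an unstructured object, including the recursion into the pod template for workload objects; the label keys are invented for the example.

```go
package harness

import "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"

// LabelForTest stamps harness metadata onto an object and, if the object has
// a pod template (Deployments, DaemonSets, Jobs, and so on), onto the
// template as well so that side-effect pods carry the same labels.
func LabelForTest(obj *unstructured.Unstructured, testName, runID string) error {
	labels := map[string]string{
		"harness.example.com/managed": "true",
		"harness.example.com/test":    testName,
		"harness.example.com/run":     runID,
	}

	// Label the object itself.
	merged := obj.GetLabels()
	if merged == nil {
		merged = map[string]string{}
	}
	for k, v := range labels {
		merged[k] = v
	}
	obj.SetLabels(merged)

	// Recurse into the pod template, if the object has one.
	if _, found, err := unstructured.NestedMap(obj.Object, "spec", "template"); err == nil && found {
		for k, v := range labels {
			if err := unstructured.SetNestedField(obj.Object,
				v, "spec", "template", "metadata", "labels", k); err != nil {
				return err
			}
		}
	}
	return nil
}
```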
There are a number of possible uses for Kubernetes metadata:
- clean up state after test runs
- examine objects for test triage
- use as input for checks
This test ensures that an HTTP service is resilient to the termination of its underlying pods.
Action
- Deploy Service A with 2 pods (round robin load balancing)
- Deploy a HTTPProxy targeting Service A
Check
- HTTP response from pod A.1
- HTTP response from pod A.2
Action
- Kill a Service pod
Check
- HTTP response from pod A.1 only
Action
- Wait for 2nd pod to reschedule
Check
- HTTP response from pod A.1
- HTTP response from pod A.3
This test ensures that traffic weighting works as expected.
Action
- Deploy Service A
- Deploy Service B
- Deploy a HTTPProxy targeting A for 80% and B for 20%
Check
- Run 100 HTTP requests
- Verify weighting from responses
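A sketch of how that weighting check might be implemented: send N requests, bucket the responses by which Service answered, and allow some tolerance around the expected split. How a response identifies its backing Service (here a hypothetical X-Backend-Service header) depends on the echo backend in use.

```go
package harness

import (
	"fmt"
	"net/http"
)

// CheckWeights sends n requests to url and verifies that each service's
// share of responses is within tolerance of the expected weight.
func CheckWeights(url string, n int, want map[string]float64, tolerance float64) error {
	counts := map[string]int{}

	for i := 0; i < n; i++ {
		resp, err := http.Get(url)
		if err != nil {
			return err
		}
		counts[resp.Header.Get("X-Backend-Service")]++
		resp.Body.Close()
	}

	for svc, expected := range want {
		got := float64(counts[svc]) / float64(n)
		if got < expected-tolerance || got > expected+tolerance {
			return fmt.Errorf("service %s: got %.0f%% of requests, want %.0f%% ±%.0f%%",
				svc, got*100, expected*100, tolerance*100)
		}
	}
	return nil
}
```

Called as, say, CheckWeights(url, 100, map[string]float64{"service-a": 0.8, "service-b": 0.2}, 0.1); with only 100 requests the tolerance needs to be fairly generous.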
- kstatus (detect Kubernetes resource status): https://github.com/kubernetes-sigs/kustomize/tree/master/kstatus
- apply resources generically: https://github.com/kubernetes/kubectl/tree/master/pkg/cmd/apply (kapp likely has something similar)
- test result formats (JUnit, etc.)
- test context: https://github.com/nirmata/kyverno/blob/master/documentation/writing-policies-variables.md
Closing, since there's no associated action here.
This issue captures notes, requirements and proposals about an integration test harness.