confidential-containers / cloud-api-adaptor

Ability to create Kata pods using cloud provider APIs aka the peer-pods approach
Apache License 2.0
48 stars 88 forks source link

test/e2e: improve nginx deployment test #2044

Open wainersm opened 2 months ago

wainersm commented 2 months ago

The nginx deployment test isn't completely stable. Sometimes it fails and we don't know yet whether it's a legit bug on peer pods or not. With this PR I don't solve that problem either, as I still don't know the roots of the problem, however, it improves the test that it will run the Teardown() function even when WithSetup() failed. We have this suspect that the left nginx deployment might be messing with the next tests, so this change will at least solve that problem (if it's really a problem). Regardless, tests should always delete their resources at the execution end.

There is one thing that I'd like to implement, which is, do not run Teardown() if not running on CI. In other words, if I'm running the test locally and it fails I want to have the deployment running so I can inspect. I may be sending this on another PR or maybe even on this one....depending on the reviews.

mkulke commented 2 months ago

i don't have a lot of context, but in my (manual) tests I'm always using deployments. but, essentially, deployments are higher-order kubernetes constructs, as a primitive they should not be relevant for CAA. Thus, if we hava a Deployment test that will spawn Pods w/ a Pod spec that is 1:1 matching the Spec for a (reliable) Pod test, I'd conclude that there are problems in our testing suite (race conditions probably). Maybe it makes sense to remodel the deployment tests to mirror the pod test closely.

mkulke commented 2 months ago

I also noticed that the test cause side-effects that makes it impossible to run them in parallel. Essentially the asserts of a test find leftovers of services or other k8s objects, because they use the same name. So, it might make sense to create some logic that will make a test independent, because there is a random segment in the names

stevenhorsman commented 2 months ago

I also noticed that the test cause side-effects that makes it impossible to run them in parallel. Essentially the asserts of a test find leftovers of services or other k8s objects, because they use the same name. So, it might make sense to create some logic that will make a test independent, because there is a random segment in the names

We currently have a different namespace for each test run. We picked that to make it easier to clean everything up together if it went wrong, but if you think it would help to have a random namespace for every tests, then we could consider that?

mkulke commented 2 months ago

We currently have a different namespace for each test run. We picked that to make it easier to clean everything up together if it went wrong, but if you think it would help to have a random namespace for every tests, then we could consider that?

yeah, ideally, within a test-run the tests should also be independent, so maybe running them in their own namespace would indeed be a good measure, it'll also help with cleanup. we can use a label: test-run: 123 for the ns, so we can bulk delete in a tear-down section.

wainersm commented 1 month ago

Hi @mkulke ,

i don't have a lot of context, but in my (manual) tests I'm always using deployments. but, essentially, deployments are higher-order kubernetes constructs, as a primitive they should not be relevant for CAA. Thus, if we hava a Deployment test that will spawn Pods w/ a Pod spec that is 1:1 matching the Spec for a (reliable) Pod test, I'd conclude that there are problems in our testing suite (race conditions probably). Maybe it makes sense to remodel the deployment tests to mirror the pod test closely.

I don't remember the context that nginx deployment test was added, the original commit doesn't explain either, so I don't know for sure. Well, it's exercising ReadinessProbe and LivenessProbe, maybe Probes are relevant for peer pods implementation? Let's suppose "yes", do the test should be a Deployment rather than a regular Pod?

In the end, I'm trying to understand whether I should address the static checker errors aiming to get this merged OR drop the nginx deployment test OR take another action :)

wainersm commented 1 month ago

We currently have a different namespace for each test run. We picked that to make it easier to clean everything up together if it went wrong, but if you think it would help to have a random namespace for every tests, then we could consider that?

yeah, ideally, within a test-run the tests should also be independent, so maybe running them in their own namespace would indeed be a good measure, it'll also help with cleanup. we can use a label: test-run: 123 for the ns, so we can bulk delete in a tear-down section.

Makes sense to me. Created an issue to discuss that and other stuffs for running the tests in parallel: https://github.com/confidential-containers/cloud-api-adaptor/issues/2092