kubernetes-sigs / cluster-api-provider-openstack

Cluster API implementation for OpenStack
https://cluster-api-openstack.sigs.k8s.io/
Apache License 2.0
275 stars 253 forks source link

:seedling: Ensure E2E cleanup #2005

Closed mquhuy closed 2 months ago

mquhuy commented 3 months ago

What this PR does / why we need it:

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged): Fixes #1801

Special notes for your reviewer:

  1. Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.

TODOs:

k8s-ci-robot commented 3 months ago

Hi @mquhuy. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
netlify[bot] commented 3 months ago

Deploy Preview for kubernetes-sigs-cluster-api-openstack ready!

Name Link
Latest commit 48c952ff6c764a6e90158d33e4b2e3a4fb41cdc8
Latest deploy log https://app.netlify.com/sites/kubernetes-sigs-cluster-api-openstack/deploys/6619338f50e1850008b5779c
Deploy Preview https://deploy-preview-2005--kubernetes-sigs-cluster-api-openstack.netlify.app
Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

mquhuy commented 3 months ago

/test ?

k8s-ci-robot commented 3 months ago

@mquhuy: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to [this](https://github.com/kubernetes-sigs/cluster-api-provider-openstack/pull/2005#issuecomment-2044567543): >/test ? Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
lentzi90 commented 3 months ago

/ok-to-test

mdbooth commented 3 months ago

I like this! I'm slightly nervous of what this is going to find when it's working 😬

Looks like it's failing because e2eCtx hasn't been initialized trying to dump the initial set of resources:

> Enter [SynchronizedBeforeSuite] TOP-LEVEL - /home/prow/go/src/sigs.k8s.io/cluster-api-provider-openstack/test/e2e/suites/e2e/e2e_suite_test.go:66 @ 04/09/24 15:18:52.03
[PANICKED] Test Panicked
In [SynchronizedBeforeSuite] at: /usr/local/go/src/runtime/panic.go:261 @ 04/09/24 15:18:52.031

runtime error: invalid memory address or nil pointer dereference

Full Stack Trace
  sigs.k8s.io/cluster-api/test/framework/clusterctl.(*E2EConfig).GetVariable(0x0, {0x267ba00, 0xf})
    /root/go/pkg/mod/sigs.k8s.io/cluster-api/test@v1.6.0/framework/clusterctl/e2e_config.go:580 +0x3b
  sigs.k8s.io/cluster-api-provider-openstack/test/e2e/shared.GetTenantProviderClient(0xc00015e160)
    /home/prow/go/src/sigs.k8s.io/cluster-api-provider-openstack/test/e2e/shared/openstack.go:626 +0x2b
  sigs.k8s.io/cluster-api-provider-openstack/test/e2e/shared.DumpOpenStackServers(_, {{0x0, 0x0}, {0x0, 0x0}, {0x0, 0x0}, {0x0, 0x0}, {0x0, ...}, ...})
    /home/prow/go/src/sigs.k8s.io/cluster-api-provider-openstack/test/e2e/shared/openstack.go:252 +0x25
  sigs.k8s.io/cluster-api-provider-openstack/test/e2e/suites/e2e.init.func5({0xc000839000, 0x2ebb, 0x2ebb})
    /home/prow/go/src/sigs.k8s.io/cluster-api-provider-openstack/test/e2e/suites/e2e/e2e_suite_test.go:70 +0x5e
  reflect.Value.call({0x2136120?, 0x27b51e8?, 0x13?}, {0x266c731, 0x4}, {0xc0007f00f0, 0x1, 0x1?})
    /usr/local/go/src/reflect/value.go:596 +0xca6
  reflect.Value.Call({0x2136120?, 0x27b51e8?, 0x2ebb?}, {0xc0007f00f0?, 0x2666c80?, 0x0?})
    /usr/local/go/src/reflect/value.go:380 +0xb9
< Exit [SynchronizedBeforeSuite] TOP-LEVEL - /home/prow/go/src/sigs.k8s.io/cluster-api-provider-openstack/test/e2e/suites/e2e/e2e_suite_test.go:66 @ 04/09/24 15:18:52.031 (0s)
mdbooth commented 3 months ago

I tested my suggested fix in #2007, and it nearly passes! In fact, it looks like we probably just need a custom matcher. It looks like the reported difference in networks isn't relevant. I can see at a glance that the subnets are in the opposite order. There may be other differences.

I wonder if we just want to compare the list of UUIDs?

mdbooth commented 3 months ago

I wrote a custom matcher over in #2007. You can see an example test failure in https://prow.k8s.io/view/gs/kubernetes-jenkins/pr-logs/pull/kubernetes-sigs_cluster-api-provider-openstack/2007/pull-cluster-api-provider-openstack-e2e-test/1777791653365420032

I think it looks... ok. This would probably fly, tbh, although ideally we'd have some way to dump all the match failures after a run rather than stopping on the first one.

Feel free to take that code if you want. I am not planning to merge #2007, it was just for playing with your PR.

mquhuy commented 3 months ago

Hi @mdbooth . Thank you a lot! I'll have a look and come back asap xD

mdbooth commented 3 months ago

I added the 'test everything and report after' too. You can see the results here: https://prow.k8s.io/view/gs/kubernetes-jenkins/pr-logs/pull/kubernetes-sigs_cluster-api-provider-openstack/2007/pull-cluster-api-provider-openstack-e2e-test/1777826551497232384

With this change the reported failure is simply:

{Not all resources were cleaned up failed [FAILED] Not all resources were cleaned up
In [SynchronizedAfterSuite] at: /home/prow/go/src/sigs.k8s.io/cluster-api-provider-openstack/test/e2e/suites/e2e/e2e_suite_test.go:124 @ 04/09/24 23:29:56.707
}

And the actual leaked resources are in the logs. Honestly this wasn't my intention, but I think I like it?

mdbooth commented 3 months ago

Incidentally, the advantage of using a custom matcher on whole objects rather than just matching on UUIDs is that you get the whole object in the failure message, e.g.:

the extra elements were
    <[]groups.SecGroup | len:1, cap:1>: [
        {
            ID: "7f1d446f-3762-4f6a-9a18-2e6704e60d03",
            Name: "testSecGroup",
            Description: "Test security group",
            Rules: [
                {
                    ID: "6e0a5a67-4249-4f40-be46-9df343761043",
                    Direction: "egress",
                    Description: "",
                    EtherType: "IPv4",
                    SecGroupID: "7f1d446f-3762-4f6a-9a18-2e6704e60d03",
                    PortRangeMin: 0,
                    PortRangeMax: 0,
                    Protocol: "",
                    RemoteGroupID: "",
                    RemoteIPPrefix: "",
                    TenantID: "5e83f484071a448dabc9570ec37c44de",
                    ProjectID: "5e83f484071a448dabc9570ec37c44de",
                },
                {
                    ID: "71158c32-b1ca-45b5-a9f9-8fbdffc5b31d",
                    Direction: "egress",
                    Description: "",
                    EtherType: "IPv6",
                    SecGroupID: "7f1d446f-3762-4f6a-9a18-2e6704e60d03",
                    PortRangeMin: 0,
                    PortRangeMax: 0,
                    Protocol: "",
                    RemoteGroupID: "",
                    RemoteIPPrefix: "",
                    TenantID: "5e83f484071a448dabc9570ec37c44de",
                    ProjectID: "5e83f484071a448dabc9570ec37c44de",
                },
            ],
            TenantID: "5e83f484071a448dabc9570ec37c44de",
            UpdatedAt: 2024-04-09T23:11:17Z,
            CreatedAt: 2024-04-09T23:11:17Z,
            ProjectID: "5e83f484071a448dabc9570ec37c44de",
            Tags: [],
        },
    ]
mquhuy commented 2 months ago

Hi @mdbooth . Thank you for all the work! I've stolen all of the implementation and updated the PR here.

k8s-ci-robot commented 2 months ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: lentzi90

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files: - ~~[OWNERS](https://github.com/kubernetes-sigs/cluster-api-provider-openstack/blob/main/OWNERS)~~ [lentzi90] Approvers can indicate their approval by writing `/approve` in a comment Approvers can cancel approval by writing `/approve cancel` in a comment
mdbooth commented 2 months ago

/lgtm

mquhuy commented 2 months ago

/unhold