kubernetes-sigs / e2e-framework

A Go framework for end-to-end testing of components running in Kubernetes clusters.
Apache License 2.0
529 stars 103 forks source link

feat: make Config's Client() goroutine safe #367

Closed pmalek closed 7 months ago

pmalek commented 10 months ago

What type of PR is this?

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #352

Special notes for your reviewer:

Would we like to extend the like over all the fields of Config or leave it as proposed and only cover Client()?

Does this PR introduce a user-facing change?

Config's Client() method is now goroutine safe

Additional documentation e.g., Usage docs, etc.:

k8s-ci-robot commented 10 months ago

Hi @pmalek. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
harshanarayana commented 10 months ago

/retest

k8s-ci-robot commented 9 months ago

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: pmalek Once this PR has been reviewed and has the lgtm label, please ask for approval from harshanarayana. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files: - **[OWNERS](https://github.com/kubernetes-sigs/e2e-framework/blob/main/OWNERS)** Approvers can indicate their approval by writing `/approve` in a comment Approvers can cancel approval by writing `/approve cancel` in a comment
ratschance commented 9 months ago

I think this may still have issues if two tests are running List operations on separate namespaces at the same time since the namespace is currently being stored in the underlying Resources object of the client

pmalek commented 9 months ago

I think this may still have issues if two tests are running List operations on separate namespaces at the same time since the namespace is currently being stored in the underlying Resources object of the client

* https://github.com/kubernetes-sigs/e2e-framework/blob/main/klient/client.go#L82

* https://github.com/kubernetes-sigs/e2e-framework/blob/main/klient/k8s/resources/resources.go#L53

You're right. It's easy to repro with something like this:

func TestClientIsGorountineSafe(t *testing.T) {
    cfg := New()
    wg := sync.WaitGroup{}
    wgCount := runtime.NumCPU()
    wg.Add(wgCount)

    const count = 10000
    for i := 0; i < wgCount; i++ {
        go func() {
            defer wg.Done()

            for i := 0; i < count; i++ {
                ns := fmt.Sprintf("ns-%4d", i)
                client := cfg.Client()
                err := client.Resources(ns).List(context.Background(), &corev1.PodList{})
                if err != nil {
                    t.Errorf("failed to list pods: %v", err)
                }
            }
        }()
    }
    wg.Wait()
}

Not sure what would be the best course of action for this one though 🤔. Sprinkling locks all around doesn't seem to be a good idea.

One thought that I have is to move

    res, err := resources.New(cfg)
    if err != nil {
        return nil, err
    }

from the resources from klient.New() to klient's client.Resources().

vladimirvivien commented 9 months ago

/retest

k8s-ci-robot commented 9 months ago

@pmalek: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-e2e-framework-test a021ee735d1d8095e4fec809e6f3482b023ca2c8 link true /test pull-e2e-framework-test

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository. I understand the commands that are listed [here](https://go.k8s.io/bot-commands).
Fricounet commented 9 months ago

@pmalek I might be wrong but I think the underlying issue here the same as what I outlined in https://github.com/kubernetes-sigs/e2e-framework/issues/384. The issue is that the cfg object (and the underlying client object) is shared across all tests because it is always the one stored in the testenv that is passed around everywhere as a reference. For me, we can solve this by using copies of the base cfg instead of using references (see issue). This way each test can have its own separate cfg with a separate namespace and a separate client, etc. What do you think?

pmalek commented 8 months ago

@pmalek I might be wrong but I think the underlying issue here the same as what I outlined in #384. The issue is that the cfg object (and the underlying client object) is shared across all tests because it is always the one stored in the testenv that is passed around everywhere as a reference. For me, we can solve this by using copies of the base cfg instead of using references (see issue). This way each test can have its own separate cfg with a separate namespace and a separate client, etc. What do you think?

@Fricounet This makes a lot of sense. I'm not 100% sure since I'm not well versed with this codebase (yet?) but the proposal to copy the config per test/feature/assess sounds nice.

Would you be willing to submit a PR for that?

Fricounet commented 8 months ago

@pmalek I won't be able to within the next 2 week but I can find the time to take a look after that if you want

pmalek commented 7 months ago

Will probably be superseded by #396 as that's a better approach (share nothing approach).