operator-framework / operator-lifecycle-manager

A management framework for extending Kubernetes with Operators
https://olm.operatorframework.io
Apache License 2.0
1.72k stars, 545 forks

Contributor Experience Improvements #2571

Open perdasilva opened 2 years ago

perdasilva commented 2 years ago

Epic Description

It's been a bit hard to get some PRs through lately. The main culprit seems to be the flaky tests, although some other areas for improvement have also been identified. This epic collects the improvements we'd like to target, with the goal of delivering a better experience to our contributors: accurate and actionable information from the PR checks, shorter PR lead time, less waste (e.g. retesting the whole suite when one test fails), and better communication back to the contributor.

Here are the current suggestions for sub-stories. Once we agree on them, we can turn these into issues.

timflannagan commented 2 years ago

For reference, https://github.com/operator-framework/operator-lifecycle-manager/pull/2520/ is another iteration on speeding up the e2e suite by splitting the test cases into chunks and running each chunk in parallel in its own cluster. Earlier work on this topic that we didn't pursue further is https://github.com/operator-framework/operator-lifecycle-manager/pull/1476.

timflannagan commented 2 years ago

~@perdasilva Should we track utilizing https://github.com/operator-framework/operator-lifecycle-manager/pull/2527 more throughout the testing suite as well? Or try migrating any grpc-based CatalogSources that reference remote container images, to housing that index image as FBC in the repository?~ Whoops, I was commenting on the wrong issue.

timflannagan commented 2 years ago

Another way to improve the contributor experience is to make failed e2e runs easier to debug, so that an individual test case failure triggers the collection of OLM-related resources in the test namespace (or the logs of the catalog/olm operators). #2519 is one example that can be used to collect test fixtures, container logs, etc. We can also continue to use the SetupGeneratedTestNamespace and TeardownNamespace helpers, which help with this workflow.

fgiloux commented 2 years ago

Additional suggestions:

Some background here: https://github.com/operator-framework/operator-lifecycle-manager/pull/2504#issuecomment-984722191

akihikokuroda commented 2 years ago

My 2 cents,

Improve the developer documentation

Provide a container image for building and running the unit tests, so that developers don't need to install any tools on their system other than a container runtime.

perdasilva commented 2 years ago

Hey @akihikokuroda @fgiloux @timflannagan thank you for your suggestions. I've tried to capture them in the stories/tasks above. Could I ask you guys to help me fill out the tickets:

@akihikokuroda #2596 #2587
@timflannagan #2593
@fgiloux #2594 #2598

Could I also ask you to follow the format used in e.g. #2588: description + acceptance criteria (try to keep it tightly scoped) + suggested implementation (optional).

If you have time to review the ones I've created, I'd also appreciate some feedback.