operator-framework / operator-controller

Apache License 2.0
28 stars 47 forks source link

🌱 Add test for fetching bundles #951

Closed m1kola closed 1 week ago

m1kola commented 2 weeks ago

Description

Cover what happens when we fail to fetch bundles from catalog.

This should solve flaky test coverage. It looks like in E2E we sometimes are hitting this condition and sometimes we don't which results in coverage fluctuation:

Screenshot 2024-06-17 at 16 37 28

Reviewer Checklist

netlify[bot] commented 2 weeks ago

Deploy Preview for olmv1 ready!

Name Link
Latest commit cd4084bed9d8be27b1a7ad4c2ff0aaa4d91d737c
Latest deploy log https://app.netlify.com/sites/olmv1/deploys/6673f1e00f968500084ec927
Deploy Preview https://deploy-preview-951--olmv1.netlify.app
Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

codecov[bot] commented 2 weeks ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 80.14%. Comparing base (bfd4142) to head (cd4084b). Report is 4 commits behind head on main.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #951 +/- ## ======================================= Coverage 80.14% 80.14% ======================================= Files 16 16 Lines 1103 1103 ======================================= Hits 884 884 Misses 152 152 Partials 67 67 ``` | [Flag](https://app.codecov.io/gh/operator-framework/operator-controller/pull/951/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=operator-framework) | Coverage Ξ” | | |---|---|---| | [e2e](https://app.codecov.io/gh/operator-framework/operator-controller/pull/951/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=operator-framework) | `58.20% <ΓΈ> (ΓΈ)` | | | [unit](https://app.codecov.io/gh/operator-framework/operator-controller/pull/951/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=operator-framework) | `54.03% <ΓΈ> (+0.18%)` | :arrow_up: | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=operator-framework#carryforward-flags-in-the-pull-request-comment) to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

joelanford commented 1 week ago

It looks like in E2E we sometimes are hitting this condition and sometimes we don't which results in coverage fluctuation

Adding more test coverage is good, but it seems pretty bad that other e2e tests are non-deterministic in this way. Did you dive into this enough to understand why the flakiness is happening?

m1kola commented 1 week ago

@joelanford Here is what I think happens:

  1. We create a catalog.
  2. We run a test.
  3. We sometimes hit "error fetching bundles" and sometimes not. When we do hit it - it gets coverage.
  4. E2Es wait for a certain condition and which gives the controller a chance to requeue the reconcile request. Eventually reconcile succeeds.

I think error happens because in tests we create catalogs and we do not wait for them to become ready: we create ClusterExtension immediately after creating a ClusterCatalog. So most likely ClusterExtension controller attempts to reconcile while catalog is still unpacking/not ready.

We can look into adding waits after creating ClusterCatalog, but IMO it is good that we do not have these artificial waits: we should be able to reconcile once catalog becomes ready.

joelanford commented 1 week ago

+100000 to not artificially wait. Expecting eventual consistency in our e2e's is the correct approach.

This makes sense. Thanks for the explanation.