cyberark / secrets-provider-for-k8s

Cyberark secrets provider for k8s
Apache License 2.0

Pipeline flaky test failure on Error: ErrImagePull #319

Closed sigalsax closed 1 year ago

sigalsax commented 3 years ago

Summary

The helm tests in our secrets provider pipeline are flaky. The failure occurs while running './TEST_ID_17_helm_job_deploys_successfully.sh', and the error resembles the following:

[Screenshot: Screen Shot 2021-05-24 at 10 10 30 (image not available)]

The test failures occur only on OCP current (4.6).

Steps to Reproduce

Steps to reproduce the behavior:

  1. Run master
  2. Wait for 4.6 build to finish
  3. See the failure. If you don't see a failure, run the pipeline again

Reproducible

Version/Tag number

OCP 4.6

Environment setup

Pipeline build

Additional Information

-

sigalsax commented 3 years ago

Flakiness has improved: previously 1 in every 2 runs failed, and now it is 1 in every ~4.

Tovli commented 3 years ago

Feedback from @ismarc: I dug into this some yesterday, and there are still docker pull secrets in use. They are still in use so that the tests can pull images across projects. I believe what needs to happen is that the test project is assigned the image-puller role on the project the images were uploaded to, rather than using the docker pull secret for access.

The reason is that the docker pull secret has a max age after which it stops working, so how long the preceding steps take can determine whether the secret expires before the tests are finished using it.

https://docs.openshift.com/container-platform/4.5/openshift_images/managing_images/using-image-pull-secrets.html#images-allow-pods-to-reference-images-across-projects_using-image-pull-secrets covers how to add a project as an image-puller for another project, and I believe it will fully resolve this issue.
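
For reference, a minimal sketch of what the linked documentation describes, not the change that was actually made in this repo. The project names `test-project` and `image-project` are hypothetical placeholders, and the helper function `image_puller_cmd` is made up for illustration. It only prints the `oc policy add-role-to-group` command so it can be reviewed before being run against a cluster.

```shell
#!/usr/bin/env bash
# Sketch only: project names below are hypothetical placeholders,
# not values taken from the pipeline.
TEST_PROJECT="${TEST_PROJECT:-test-project}"     # project where the test pods run
IMAGE_PROJECT="${IMAGE_PROJECT:-image-project}"  # project the images were pushed to

# Print the command that grants every service account in the test project
# the system:image-puller role on the image project. Nothing is executed
# against the cluster; the command is only written to stdout.
image_puller_cmd() {
  local test_project="$1"
  local image_project="$2"
  printf 'oc policy add-role-to-group system:image-puller system:serviceaccounts:%s --namespace=%s\n' \
    "$test_project" "$image_project"
}

image_puller_cmd "$TEST_PROJECT" "$IMAGE_PROJECT"
```

Once the role binding exists, pods in the test project should be able to pull from the image project's image streams without the docker pull secret, so the secret's max age would no longer matter for these tests. That is an expectation based on the documentation above, not something verified in this pipeline.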