kubernetes-sigs / cluster-api-operator

Home for Cluster API Operator, a subproject of sig-cluster-lifecycle
https://cluster-api-operator.sigs.k8s.io
Apache License 2.0

Unable to run e2e tests locally #141

Open furkatgofurov7 opened 1 year ago

furkatgofurov7 commented 1 year ago

What steps did you take and what happened: Running the e2e tests locally through the Makefile targets (make test-e2e or make test-e2e-run) fails if you have never run make docker-build-e2e, the target that builds an operator image with the dev tag. If you already have an operator image tagged dev locally you will not see this problem, but that is not the case for everyone.

That is because make test-e2e-run sets E2E_OPERATOR_IMAGE to gcr.io/k8s-staging-capi-operator/cluster-api-operator:dev, spins up a kind cluster, and tries to load that image into the cluster:

Creating cluster "capi-operator-e2e" ...
 • Ensuring node image (kindest/node:v1.27.0) 🖼  ...
 ✓ Ensuring node image (kindest/node:v1.27.0) 🖼
 • Preparing nodes 📦  ...
 ✓ Preparing nodes 📦
 • Writing configuration 📜  ...
 ✓ Writing configuration 📜
 • Starting control-plane 🕹️  ...
 ✓ Starting control-plane 🕹️
 • Installing CNI 🔌  ...
 ✓ Installing CNI 🔌
 • Installing StorageClass 💾  ...
 ✓ Installing StorageClass 💾
  INFO: The kubeconfig file for the kind cluster is /var/folders/cz/q854zvyj34nccdhvq_4cxhd80000gp/T/e2e-kind2691374665
  INFO: Loading image: "gcr.io/k8s-staging-capi-operator/cluster-api-operator:dev"
  INFO: Image gcr.io/k8s-staging-capi-operator/cluster-api-operator:dev not present in local container image cache, will pull
  INFO: [WARNING] Unable to load image "gcr.io/k8s-staging-capi-operator/cluster-api-operator:dev" into the kind cluster "capi-operator-e2e": error pulling image "gcr.io/k8s-staging-capi-operator/cluster-api-operator:dev": failure pulling container image: Error response from daemon: manifest for gcr.io/k8s-staging-capi-operator/cluster-api-operator:dev not found: manifest unknown: Failed to fetch "dev" from request "/v2/k8s-staging-capi-operator/cluster-api-operator/manifests/dev".

Later in the tests, the operator deployment does not come up and the run fails:

state:
      waiting:
        message: Back-off pulling image "gcr.io/k8s-staging-capi-operator/cluster-api-operator:dev"
        reason: ImagePullBackOff

What did you expect to happen: make test-e2e and make test-e2e-run run successfully.

To reproduce:

# in case you have run `make docker-build-e2e` before
$ docker rmi gcr.io/k8s-staging-capi-operator/cluster-api-operator:dev
$ make test-e2e-run

The tests time out waiting for the capi-operator-system/capi-operator-controller-manager deployment to become available.
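
To confirm that the timeout is caused by the missing image rather than something else, the waiting reason of the operator pods can be read with kubectl. The following is a minimal sketch, not part of the repository: the function names and the OPERATOR_NAMESPACE variable are hypothetical, the namespace is the one named in this report, and kubectl is assumed to point at the kind cluster (for example through the kubeconfig path printed in the log above).

```shell
#!/bin/sh
# Sketch: report why the operator pods are not becoming ready.
# OPERATOR_NAMESPACE and both function names are hypothetical helpers;
# the default namespace is the one mentioned in this issue.

OPERATOR_NAMESPACE="${OPERATOR_NAMESPACE:-capi-operator-system}"

operator_waiting_reasons() {
    # Prints one line per pod: "<pod name><TAB><waiting reason(s)>".
    # The reason is empty when no container is in the waiting state.
    kubectl get pods -n "$OPERATOR_NAMESPACE" \
        -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[*].state.waiting.reason}{"\n"}{end}'
}

has_image_pull_failure() {
    # Succeeds when any pod reports an image pull problem.
    operator_waiting_reasons | grep -Eq 'ImagePullBackOff|ErrImagePull'
}
```

After sourcing the snippet, `has_image_pull_failure && echo "operator image is missing"` would indicate that the deployment is stuck on the image pull described here.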

Additional information: I see we have 2 options in this case:

  1. make the docker-build-e2e target a prerequisite of make test-e2e-run, so that the image is always built before the e2e tests run locally
  2. leave it to the user and document clearly that running make docker-build-e2e is a prerequisite for running the e2e tests locally
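
Either option could also be approximated by a small shell guard that builds the image only when it is missing from the local cache. The following is a minimal sketch under stated assumptions, not the project's implementation: ensure_e2e_image is a hypothetical helper name, and it assumes that the docker-build-e2e target tags the image with the same name that E2E_OPERATOR_IMAGE holds.

```shell
#!/bin/sh
# Sketch: build the e2e operator image only when it is not available locally.
# E2E_OPERATOR_IMAGE and the docker-build-e2e target come from this issue;
# the function name ensure_e2e_image is hypothetical.

E2E_OPERATOR_IMAGE="${E2E_OPERATOR_IMAGE:-gcr.io/k8s-staging-capi-operator/cluster-api-operator:dev}"

ensure_e2e_image() {
    # `docker image inspect` exits non-zero when the image is not in the local cache.
    if docker image inspect "$E2E_OPERATOR_IMAGE" >/dev/null 2>&1; then
        echo "found $E2E_OPERATOR_IMAGE locally, skipping build" >&2
    else
        echo "$E2E_OPERATOR_IMAGE not found locally, building it" >&2
        make docker-build-e2e
    fi
}
```

Sourced from the repository root, `ensure_e2e_image && make test-e2e-run` would then run the tests only after the image is known to exist locally. The trade-off is that an existing but outdated dev image is reused as-is rather than rebuilt.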

Any other suggestions?

Environment:

/kind bug

Fedosin commented 1 year ago

Cluster API does nothing: https://github.com/kubernetes-sigs/cluster-api/blob/main/Makefile#L836

So, I think we just need to document this behavior and allow users to run "make docker-build-e2e && make test-e2e" if they don't have the image built.

furkatgofurov7 commented 1 year ago

> Cluster API does nothing: https://github.com/kubernetes-sigs/cluster-api/blob/main/Makefile#L836
>
> So, I think we just need to document this behavior and allow users to run "make docker-build-e2e && make test-e2e" if they don't have the image built.

Yes, I was checking that as well. Something like https://github.com/kubernetes-sigs/cluster-api/blob/f3e3bda15c62f6cecb235c943f4ff337ce4ab5d1/docs/book/src/developer/testing.md?plain=1#L178C1-L180 should suffice in that case, but I am not sure where we should put it.

furkatgofurov7 commented 1 year ago

/triage accepted /kind documentation /help /good-first-issue

k8s-ci-robot commented 1 year ago

@furkatgofurov7: This request has been marked as suitable for new contributors.

furkatgofurov7 commented 1 year ago

/remove-kind bug

Sajiyah-Salat commented 1 year ago

Hello, I would like to take this up. Could you please let me know where we should put this text? I think we could add it here as a # comment, so that new users don't get confused again.

k8s-triage-robot commented 1 month ago

This issue has not been updated in over 1 year, and should be re-triaged.

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted