openshift / openshift-azure

Azure Red Hat Openshift
https://azure.microsoft.com/en-us/services/openshift/
Apache License 2.0

Prep v20 development #2318

Closed ehashman closed 4 years ago

ehashman commented 4 years ago
NONE
codecov[bot] commented 4 years ago

Codecov Report

Merging #2318 into master will not change coverage. The diff coverage is n/a.

@@           Coverage Diff           @@
##           master    #2318   +/-   ##
=======================================
  Coverage   41.36%   41.36%           
=======================================
  Files         305      305           
  Lines       23962    23962           
=======================================
  Hits         9912     9912           
  Misses      13470    13470           
  Partials      580      580

nilsanderselde commented 4 years ago

/lgtm cancel

nilsanderselde commented 4 years ago

/lgtm

openshift-bot commented 4 years ago

/retest

Please review the full test history for this PR and help us cut down flakes.

ehashman commented 4 years ago

/test e2e-create-20191027-private

openshift-bot commented 4 years ago

/retest

Please review the full test history for this PR and help us cut down flakes.

ehashman commented 4 years ago

/test upgrade-v16.1-private

openshift-ci-robot commented 4 years ago

@ehashman: The specified target(s) for /test were not found. The following commands are available to trigger jobs:

Use /test all to run the following jobs:

In response to [this](https://github.com/openshift/openshift-azure/pull/2318#issuecomment-656916881):

> /test upgrade-v16.1-private

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

ehashman commented 4 years ago

/test upgrade-private-v16.1

ehashman commented 4 years ago

Based on the PR test history here, I have a suspicion that we've somehow hit a private cluster regression.

Successful e2e-create-private: https://deck-ci.apps.ci.l2s4.p1.openshiftapps.com/view/gs/origin-ci-test/pr-logs/pull/openshift_openshift-azure/2316/pull-ci-azure-master-e2e-create-20191027-private/1280653780215402496#1:build-log.txt%3A50

time="2020-07-08T00:28:44Z" level=info msg="check PE existence" func="pkg/fakerp.GetDeployer.func1()" file="pkg/fakerp/fakerp.go:111"
time="2020-07-08T00:28:44Z" level=info msg="applying PLS deployment" func="pkg/fakerp.GetDeployer.func1()" file="pkg/fakerp/fakerp.go:118"
time="2020-07-08T00:28:45Z" level=info msg="waiting for arm template deployment to complete" func="pkg/fakerp.GetDeployer.func1()" file="pkg/fakerp/fakerp.go:134"
time="2020-07-08T00:29:26Z" level=info msg="applying PE deployment" func="pkg/fakerp.GetDeployer.func1()" file="pkg/fakerp/fakerp.go:142"
time="2020-07-08T00:29:27Z" level=info msg="waiting for arm template deployment to complete" func="pkg/fakerp.GetDeployer.func1()" file="pkg/fakerp/fakerp.go:158"
time="2020-07-08T00:30:07Z" level=info msg="get PE IP address" func="pkg/fakerp.GetDeployer.func1()" file="pkg/fakerp/fakerp.go:167"
time="2020-07-08T00:30:08Z" level=debug msg="PE IP Address 172.30.17.4 " func="pkg/fakerp.GetDeployer.func1()" file="pkg/fakerp/fakerp.go:172"
time="2020-07-08T00:30:09Z" level=info msg="waiting for API server healthz" func="pkg/cluster.(*Upgrade).WaitForHealthzStatusOk()" file="pkg/cluster/healthcheck.go:41"
time="2020-07-08T00:30:19Z" level=debug msg="ForHTTPStatusOk: will retry on the following error Get https://10.0.0.254/healthz: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)" func="pkg/util/wait.ForHTTPStatusOk.func1()" file="pkg/util/wait/wait.go:71"
time="2020-07-08T00:30:30Z" level=debug msg="ForHTTPStatusOk: will retry on the following error Get https://10.0.0.254/healthz: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)" func="pkg/util/wait.ForHTTPStatusOk.func1()" file="pkg/util/wait/wait.go:71"
time="2020-07-08T00:30:41Z" level=debug msg="ForHTTPStatusOk: will retry on the following error Get https://10.0.0.254/healthz: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)" func="pkg/util/wait.ForHTTPStatusOk.func1()" file="pkg/util/wait/wait.go:71"
time="2020-07-08T00:30:51Z" level=info msg="updating sync pod" func="pkg/cluster.(*Upgrade).CreateOrUpdateSyncPod()" file="pkg/cluster/update_syncpod.go:9"
time="2020-07-08T00:30:54Z" level=info msg="waiting for master-000000 to be ready" func="pkg/cluster.(*Upgrade).WaitForNodesInAgentPoolProfile()" file="pkg/cluster/ready.go:19" 

Example of the failures seen here: https://deck-ci.apps.ci.l2s4.p1.openshiftapps.com/view/gs/origin-ci-test/pr-logs/pull/openshift_openshift-azure/2318/pull-ci-azure-master-e2e-create-20191027-private/1281072160361680896#1:build-log.txt%3A50

time="2020-07-09T04:04:21Z" level=info msg="check PE existence" func="pkg/fakerp.GetDeployer.func1()" file="pkg/fakerp/fakerp.go:111"
time="2020-07-09T04:04:24Z" level=info msg="waiting for API server healthz" func="pkg/cluster.(*Upgrade).WaitForHealthzStatusOk()" file="pkg/cluster/healthcheck.go:41"
time="2020-07-09T04:04:34Z" level=debug msg="ForHTTPStatusOk: will retry on the following error Get https://10.0.0.254/healthz: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)" func="pkg/util/wait.ForHTTPStatusOk.func1()" file="pkg/util/wait/wait.go:71" 
... [times out]

I'm not going to spend too much time trying to unblock this right now, but I get the feeling this regression may exist on the release-v19 branch since the branch CI currently does not gate on private cluster creation.
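
For anyone who wants to compare the two excerpts above mechanically rather than by eye, here is a minimal Python sketch. It assumes the logrus-style `level=... msg="..."` text format seen in these logs. The helper names `steps` and `missing_steps` are hypothetical and not part of openshift-azure, and the sample lines are abbreviated from the logs quoted above (the `func` and `file` fields are dropped).

```python
import re

# Assumption: logrus-style text lines containing level=<name> and msg="<text>".
LEVEL_RE = re.compile(r'level=(\w+)')
MSG_RE = re.compile(r'msg="((?:[^"\\]|\\.)*)"')

# Abbreviated from the successful run quoted above.
GOOD = '''
time="2020-07-08T00:28:44Z" level=info msg="check PE existence"
time="2020-07-08T00:28:44Z" level=info msg="applying PLS deployment"
time="2020-07-08T00:28:45Z" level=info msg="waiting for arm template deployment to complete"
time="2020-07-08T00:29:26Z" level=info msg="applying PE deployment"
time="2020-07-08T00:29:27Z" level=info msg="waiting for arm template deployment to complete"
time="2020-07-08T00:30:07Z" level=info msg="get PE IP address"
time="2020-07-08T00:30:08Z" level=debug msg="PE IP Address 172.30.17.4 "
time="2020-07-08T00:30:09Z" level=info msg="waiting for API server healthz"
time="2020-07-08T00:30:51Z" level=info msg="updating sync pod"
'''

# Abbreviated from the failing run quoted above.
BAD = '''
time="2020-07-09T04:04:21Z" level=info msg="check PE existence"
time="2020-07-09T04:04:24Z" level=info msg="waiting for API server healthz"
'''


def steps(log_text, level="info"):
    """Return the msg values, in order, for lines logged at the given level."""
    found = []
    for line in log_text.splitlines():
        lvl = LEVEL_RE.search(line)
        msg = MSG_RE.search(line)
        if lvl and msg and lvl.group(1) == level:
            found.append(msg.group(1).strip())
    return found


def missing_steps(good_log, bad_log, level="info"):
    """List steps the good run logged before reaching the bad run's last
    step, but which never appear in the bad run (duplicates collapsed)."""
    good, bad = steps(good_log, level), steps(bad_log, level)
    if not bad:
        return list(dict.fromkeys(good))
    last = bad[-1]
    # Only look at the part of the good run up to where the bad run stopped.
    cutoff = good.index(last) if last in good else len(good)
    seen = set(bad)
    return list(dict.fromkeys(s for s in good[:cutoff] if s not in seen))


if __name__ == "__main__":
    for step in missing_steps(GOOD, BAD):
        print("missing in failing run:", step)
```

On these excerpts it reports that the PLS deployment, the ARM template wait, the PE deployment, and the PE IP lookup never appear in the failing run, which matches what the logs show: the failing run goes straight from "check PE existence" to waiting on healthz.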

ehashman commented 4 years ago

(also, looking at the code in question, it may very well just be a fakerp bug as opposed to an RP/plugin issue)

ehashman commented 4 years ago

OH NO IT'S MUCH DUMBER THAN THAT

https://github.com/openshift/openshift-azure/blob/4205d734414be919df805c5599743a17715cfc65/pkg/fakerp/fakerp.go#L111-L116

I will try to fix this.

openshift-ci-robot commented 4 years ago

@ehashman: The following test failed, say /retest to rerun all failed tests:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| upgrade-private-v16.1 | 802de555dd24eaf1005a0c97967d0ec0489b0088 | link | `/test upgrade-private-v16.1` |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository. I understand the commands that are listed [here](https://go.k8s.io/bot-commands).

ehashman commented 4 years ago

/retest

openshift-ci-robot commented 4 years ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ehashman, nilsanderselde

The full list of commands accepted by this bot can be found here.

The pull request process is described here.

Needs approval from an approver in each of these files:

- ~~[OWNERS](https://github.com/openshift/openshift-azure/blob/master/OWNERS)~~ [ehashman]

Approvers can indicate their approval by writing `/approve` in a comment.
Approvers can cancel approval by writing `/approve cancel` in a comment.