openshift / origin

Flake: github.com/openshift/origin/test/end-to-end/core.test/end-to-end/core.sh:389 #14897

Closed tnozicka closed 7 years ago

tnozicka commented 7 years ago

github.com/openshift/origin/test/end-to-end/core.test/end-to-end/core.sh:389: executing 'cat '/data/src/github.com/openshift/origin/_output/scripts/test-end-to-end/logs/cli-with-token2.log'' expecting success and text 'system:serviceaccount:test:default' (from github.com_openshift_origin_test_end-to-end_core)

https://ci.openshift.redhat.com/jenkins/job/merge_pull_request_origin/1143/consoleFull

Stacktrace

=== BEGIN TEST CASE ===
test/end-to-end/core.sh:389: executing 'cat '/data/src/github.com/openshift/origin/_output/scripts/test-end-to-end/logs/cli-with-token2.log'' expecting success and text 'system:serviceaccount:test:default'
FAILURE after 0.016s: test/end-to-end/core.sh:389: executing 'cat '/data/src/github.com/openshift/origin/_output/scripts/test-end-to-end/logs/cli-with-token2.log'' expecting success and text 'system:serviceaccount:test:default': the output content test failed
Standard output from the command:
If you don't see a command prompt, try pressing enter.
I0627 04:47:26.039053       1 merged_client_builder.go:123] Using in-cluster configuration
I0627 04:47:26.041609       1 merged_client_builder.go:123] Using in-cluster configuration
I0627 04:47:26.076348       1 cached_discovery.go:134] failed to write cache to /root/.kube/172.30.0.1_443/servergroups.json due to mkdir /root/.kube: permission denied
I0627 04:47:26.077108       1 merged_client_builder.go:123] Using in-cluster configuration

There was no error output from the command.
=== END TEST CASE ===
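The assertion that failed above boils down to "the log file is readable and contains the expected text". A minimal sketch of that check follows, assuming a hypothetical helper named `expect_text_in_file`; it is not the function origin's test library actually uses, and the sample log is just two lines copied from the output above.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of origin's test library: succeeds only if the
# file can be read and contains the expected text as a fixed string.
expect_text_in_file() {
  local file="$1" expected="$2"
  # -F treats the pattern as a literal string, -q suppresses output.
  grep -qF -- "${expected}" "${file}"
}

# Recreate the failing case from two of the lines captured above.
log="$(mktemp)"
cat > "${log}" <<'EOF'
If you don't see a command prompt, try pressing enter.
I0627 04:47:26.039053       1 merged_client_builder.go:123] Using in-cluster configuration
EOF

if expect_text_in_file "${log}" 'system:serviceaccount:test:default'; then
  echo "output content test passed"
else
  echo "output content test failed"
fi
rm -f "${log}"
```

With the captured output, the expected service account name never appears in the log, so the check reports a failure even though `cat` itself succeeds.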

pweil- commented 7 years ago

@fabianofranz actually, this looks like it may be another instance of the --attach issues we've seen in the past, like https://github.com/openshift/origin/issues/12558. That was supposedly fixed in https://github.com/openshift/origin/pull/13669, but maybe only for test-end-to-end-docker.sh.

@soltysh PTAL.

soltysh commented 7 years ago

@pweil- correct, this is enabled only for test-end-to-end-docker.sh, but that's the default we fall back to when Docker is installed, which is what happens in our CI (look for "++ Docker is installed, running hack/test-end-to-end-docker.sh instead"). I'm worried this is back, but right now I have no idea why, since journald rate limiting is off (look for "Turning off journald limits"). It's not a blocker for 3.6, but I'll be paying attention, since it looks like we're hitting it more and more often recently 😞

soltysh commented 7 years ago

After an IRC discussion with @pweil- I'll lower the priority. I need logs to verify what's going on, and none of the current instances has them. It won't block the release in any way, and I'll keep an eye out for it.

simo5 commented 7 years ago

flaked on https://github.com/openshift/origin/pull/14745 too

0xmichalis commented 7 years ago

@soltysh do you have everything you need in https://ci.openshift.redhat.com/jenkins/job/test_pull_request_origin_integration/4259/s3/ to debug this further?

bparees commented 7 years ago

also saw this here: https://github.com/openshift/origin/issues/15331 with a slightly different output.

bparees commented 7 years ago

and @mfojtik saw it here https://ci.openshift.redhat.com/jenkins/job/test_pull_request_origin/3264/ on a slightly different file (cli-with-token.log instead of cli-with-token2.log)

mfojtik commented 7 years ago

@bparees I can see:

I0719 15:31:37.212757       1 cached_discovery.go:88] failed to write cache to /root/.kube/172.30.0.1_443/authorization.openshift.io/v1/serverresources.json due to mkdir /root/.kube: permission denied
I0719 15:31:37.219134       1 cached_discovery.go:88] failed to write cache to /root/.kube/172.30.0.1_443/v1/serverresources.json due to mkdir /root/.kube: permission denied

@jupierce @kargakis @jhadvig can anyone check what the permissions are there?
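One way to answer the permissions question from inside the container is sketched below. The helper name `can_create_subdir` is hypothetical, and the `/root` fallback is an assumption taken from the path in the log lines; this only shows whether the running user could create `.kube` and says nothing about whether that is what causes the flake.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reports whether the current user could create a
# subdirectory (such as .kube) under the given parent directory.
can_create_subdir() {
  local parent="$1"
  # Creating a directory needs write and search (execute) permission on the parent.
  [[ -d "${parent}" && -w "${parent}" && -x "${parent}" ]]
}

# Assumption: the client resolves its cache directory under $HOME (/root in the log).
home_dir="${HOME:-/root}"

id                          # which user and groups the process runs as
ls -ld "${home_dir}" || true  # owner and mode of the home directory, if it exists

if can_create_subdir "${home_dir}"; then
  echo "${home_dir}/.kube can be created by this user"
else
  echo "${home_dir}/.kube cannot be created by this user"
fi
```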

mfojtik commented 7 years ago

@enj can you help investigate this? This seems to be the number one flake at the moment.

mfojtik commented 7 years ago

I propose we disable this test for now to unblock the merge/test queue: https://github.com/openshift/origin/pull/15363. Any objections?

bparees commented 7 years ago

No objections here, I just hit it again too.

soltysh commented 7 years ago

There's a discussion about journald limits happening in https://github.com/openshift/origin/issues/14785

stevekuznetsov commented 7 years ago

Is this just a dupe of the journald issue?

soltysh commented 7 years ago

Yeah, I'll close this one in favor of the other one.