openshift / jenkins-plugin

Apache License 2.0

add diagnostic logging for test connection #159

Closed gabemontero closed 6 years ago

gabemontero commented 6 years ago

debug for https://github.com/openshift/origin/issues/17488

@openshift/sig-developer-experience fyi

Also note that the PR testing will invoke the same test case on the CI system that failed in https://github.com/openshift/origin/issues/17488, now with this change. If the failure is anything more than completely random, we should be able to diagnose it here.

I still think it is useful to add these logs (they show up in the Jenkins log, not the job log), but diagnosing https://github.com/openshift/origin/issues/17488 without merging this change is also an option.
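
For readers unfamiliar with the Jenkins log vs. job log distinction mentioned above, here is a minimal sketch, not the code from this PR. It assumes the server-side log is fed through java.util.logging and uses a plain PrintStream as a stand-in for the per-job console stream. The class name, the testConnection helper, and its URL check are all hypothetical and exist only for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

public class TestConnectionLoggingSketch {
    // Records sent here go to the server-wide log, not to any job's console output.
    static final Logger LOGGER =
            Logger.getLogger(TestConnectionLoggingSketch.class.getName());

    /**
     * Hypothetical connection check (assumption: a real check would call the API server).
     * Diagnostics go to LOGGER; only the one-line outcome goes to the job log stream.
     */
    static boolean testConnection(String apiUrl, PrintStream jobLog) {
        LOGGER.log(Level.INFO, "test connection: apiUrl={0}", apiUrl);
        boolean ok = apiUrl != null && apiUrl.startsWith("https://");
        if (!ok) {
            // Attaching the exception makes the stack trace appear in the server log.
            LOGGER.log(Level.WARNING, "test connection failed",
                    new IllegalArgumentException("bad URL: " + apiUrl));
        }
        jobLog.println(ok ? "Connection successful" : "Connection unsuccessful");
        return ok;
    }

    /** Collects log records in memory so the sketch can show where messages land. */
    static final class CapturingHandler extends Handler {
        final List<LogRecord> records = new ArrayList<>();
        @Override public void publish(LogRecord record) { records.add(record); }
        @Override public void flush() { }
        @Override public void close() { }
    }

    public static void main(String[] args) {
        CapturingHandler serverLog = new CapturingHandler();
        LOGGER.addHandler(serverLog);

        ByteArrayOutputStream jobLogBytes = new ByteArrayOutputStream();
        PrintStream jobLog = new PrintStream(jobLogBytes, true);

        // Deliberately bad URL, mirroring the "bad connection attempt" case.
        testConnection("http://not-secure.example", jobLog);

        System.out.println("server log records: " + serverLog.records.size());
        System.out.println("job log text: " + jobLogBytes.toString().trim());
    }
}
```

In this sketch the stack trace and the diagnostic lines never reach the job's output stream, which is why someone reading only the job log would not see them.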

gabemontero commented 6 years ago

A couple of the tests failed due to env constraints reported by the kubelet:

Nov 28 18:19:27.780: INFO: At 2017-11-28 18:18:31 +0000 UTC - event for frontend-1: {build-controller } BuildCancelled: Build extended-test-jenkins-plugin-8hwvn-5zvq6/frontend-1 has been cancelled
Nov 28 18:19:27.780: INFO: At 2017-11-28 18:18:41 +0000 UTC - event for frontend-1-build: {kubelet ip-172-18-3-198.ec2.internal} Killing: Killing container with id docker://sti-build:Need to kill Pod
Nov 28 18:13:52.012: INFO: centos-1-deploy                       ip-172-18-3-198.ec2.internal  Failed            [{Initialized True 0001-01-01 00:00:00 +0000 UTC 2017-11-28 18:03:52 +0000 UTC  } {Ready False 0001-01-01 00:00:00 +0000 UTC 2017-11-28 18:04:30 +0000 UTC ContainersNotReady containers with unready status: [deployment]} {PodScheduled True 0001-01-01 00:00:00 +0000 UTC 2017-11-28 18:03:52 +0000 UTC  }]

Our test connection test failed as expected, with the expected Java stack trace in the Jenkins logs, along with our new prints.

I'll try a few more test runs in this PR to see if we catch the failure before any merge.

gabemontero commented 6 years ago

The last failure seems to be more random env pain: a build ran successfully, but we started getting 404s accessing the job logs from Jenkins, around the same time the kubelet started killing some pods, again presumably for resource pain.

the test connection test passed again, the expected stack trace was there on the bad connection attempt, and the debug from this PR looked good

will continue with a few runs today, see if I can repro the test connection bug (it ran clean in last night's official run) and then merge

gabemontero commented 6 years ago

[test]

openshift-bot commented 6 years ago

Evaluated for jenkins plugin test up to 2000018cf1fcc92267bb609f6177c9d2534f67fe

openshift-bot commented 6 years ago

continuous-integration/openshift-jenkins-plugin/test SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pull_request_jenkins_plugin/14/) (Base Commit: e98ce3d5d5c993f63b4bcf66707489bf99f9f8a9) (PR Branch Commit: 2000018cf1fcc92267bb609f6177c9d2534f67fe)

gabemontero commented 6 years ago

[merge]

openshift-bot commented 6 years ago

continuous-integration/openshift-jenkins-plugin/merge Waiting: You are in the build queue at position: 1

openshift-bot commented 6 years ago

Evaluated for jenkins plugin merge up to 2000018cf1fcc92267bb609f6177c9d2534f67fe