openshift / jenkins-plugin

Build trigger fails waiting for logs #47

Closed rhamilto closed 8 years ago

rhamilto commented 8 years ago

this is the jenkins console output. @bparees to provide color.

OpenShift Build myproject/sample-pipeline-1
[Pipeline] node
Still waiting to schedule task
584400d9638 is offline
Running on 6a00e94b7c7 in /tmp/workspace/sample-pipeline
[Pipeline] {
[Pipeline] stage (build)
Entering stage build
Proceeding
[Pipeline] openshiftBuild

Starting the "Trigger OpenShift Build" step with build config "ruby-sample-build" from the project "myproject".
  Started build "ruby-sample-build-1" and waiting for build completion ...

Exiting "Trigger OpenShift Build" unsuccessfully; build "ruby-sample-build-1" has completed with status:  [null].
[Pipeline] }
[Pipeline] // node
[Pipeline] End of Pipeline
ERROR: "Trigger OpenShift Build" failed
Finished: FAILURE

bparees commented 8 years ago

@gabemontero ptal, this build took a long time to start, possibly because it was pulling the image. In the meantime something in the plugin appears to have timed out, and the plugin decided the build had failed (with a null status). I think it may have timed out waiting for the logs to start, because the build step is configured to stream the logs back.

gabemontero commented 8 years ago

Can we re-run this with verbose logging turned on (see the readme for instructions)?

There are configurable timeout settings (also explained in the readme). We could explore changing the default if the new pipeline stuff needs more elbow room.

On Thursday, July 7, 2016, Ben Parees <notifications@github.com> wrote:

Assigned #47 (https://github.com/openshift/jenkins-plugin/issues/47) to @gabemontero.

bparees commented 8 years ago

it's probably going to be hard to recreate since it only happened when the image was being pulled (@rhamilto you could try docker rmi'ing the image locally so it has to be re-pulled next time).

at a minimum it seems like if we hit a timeout we need a better status report.

what is the default timeout?

bparees commented 8 years ago

(I don't think the pipeline stuff is causing the timeout to need to be longer, since this was still just a straightforward "trigger a build and wait for it" build step)

rhamilto commented 8 years ago

@bparees, i docker rmi-ed the image (and dependent images) and was able to reproduce the build failure on the initial build. The subsequent second build was successful (as it was earlier).

gabemontero commented 8 years ago

@bparees - the default build timeout is 5 minutes.

Pending an update to that amount, I can certainly work on having the error message indicate that we quit waiting on the build to complete. I'll start on that in a bit.

bparees commented 8 years ago

@gabemontero that's just a timeout waiting for the build to start, not a timeout waiting for the build to complete, right?

gabemontero commented 8 years ago

@bparees no, it currently is the timeout to complete.

bparees commented 8 years ago

@gabemontero I think we should raise that default to at least 15 mins then. (Along w/ making the error clear about why we gave up).

gabemontero commented 8 years ago

@bparees will do ... I may have found a scenario where we quit waiting too soon. Addressing that will be part of the change.
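For readers following along, here is a minimal sketch of the kind of wait loop under discussion: poll the build status until it reaches a terminal phase or a deadline passes, and report a timeout separately from a real build failure. It is not the plugin's code. The class name `BuildWaiter`, the `Supplier<String>` status source, and the poll interval are all hypothetical, and the phase names are assumed to follow OpenShift's build phases.

```java
import java.util.function.Supplier;

// Hypothetical sketch; none of these names come from the plugin's source.
final class BuildWaiter {

    /** Outcome of waiting: the last status seen, and whether the deadline passed first. */
    static final class Result {
        final String lastStatus;
        final boolean timedOut;

        Result(String lastStatus, boolean timedOut) {
            this.lastStatus = lastStatus;
            this.timedOut = timedOut;
        }
    }

    // Assumption: these are the terminal build phases reported by the server.
    static boolean isTerminal(String status) {
        return "Complete".equals(status) || "Failed".equals(status)
                || "Error".equals(status) || "Cancelled".equals(status);
    }

    /**
     * Polls statusSource until it reports a terminal phase or timeoutMillis elapses.
     * A timeout is returned as its own outcome instead of being folded into "failed".
     */
    static Result waitForBuild(Supplier<String> statusSource, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        String last = statusSource.get();
        while (!isTerminal(last)) {
            if (System.nanoTime() - deadline >= 0) {
                return new Result(last, true); // gave up; the build itself may still be fine
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for build", e);
            }
            last = statusSource.get();
        }
        return new Result(last, false);
    }
}
```

With a 15 minute default, a caller would pass `15 * 60 * 1000L` as `timeoutMillis`. Keeping `timedOut` apart from the last status lets the caller say why it gave up, which is the gap the `[null]` status in the original console output exposed.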

gabemontero commented 8 years ago

My proposal on the new message:

Exiting "Trigger OpenShift Build" unsuccessfully; build "ruby-sample-build-1" : did not complete successfully within the configured timeout; last reported status: [NotStarted].

bparees commented 8 years ago

can we print the timeout value in the message?

otherwise lgtm
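As an illustration of the request above, this is roughly how the proposed message could carry the timeout value. It is only a sketch: the `TimeoutMessage` class and its parameters are hypothetical, the millisecond unit is an assumption, and the exact wording that shipped in the fix is not shown in this thread.

```java
// Hypothetical helper; the wording is adapted from the message proposed in this thread.
final class TimeoutMessage {

    /** Builds the failure message, including the timeout (assumed to be in milliseconds). */
    static String format(String step, String buildName, long timeoutMillis, String lastStatus) {
        return String.format(
                "Exiting \"%s\" unsuccessfully; build \"%s\" did not complete successfully "
                        + "within the configured timeout of %d ms; last reported status: [%s].",
                step, buildName, timeoutMillis, lastStatus);
    }
}
```

For example, `TimeoutMessage.format("Trigger OpenShift Build", "ruby-sample-build-1", 900000L, "NotStarted")` names the step, the build, the 15 minute timeout, and the last status seen, so a reader of the console output can tell a timeout apart from a failed build.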

gabemontero commented 8 years ago

Sure, I'll add that.

On Monday, July 11, 2016, Ben Parees notifications@github.com wrote:

can we print the timeout value in the message?

otherwise lgtm

gabemontero commented 8 years ago

Fixes introduced with https://github.com/openshift/jenkins-plugin/commit/9e4c19fded368e5b931342f156f2e93b5350f8a9

A pre-release version of the plugin with the fix is located here

V1.0.21 will be the released version with this fix.