coreos / coreos-ci

CoreOS CI powers upstream testing for CoreOS projects.
https://jenkins-coreos-ci.apps.ocp.ci.centos.org/

jenkins: increase HEARTBEAT_CHECK_INTERVAL to 10 mins and turn on LAUNCH_DIAGNOSTICS #30

Closed jlebon closed 3 years ago

jlebon commented 3 years ago
commit f09d9d8710152044738d3db2aecb44ac28331caa
Author: Jonathan Lebon <jonathan@jlebon.com>
Date:   Thu Apr 8 12:04:31 2021 -0400

    jenkins: increase HEARTBEAT_CHECK_INTERVAL to 10 mins

    The durable-task-plugin sometimes errors with "process apparently never
    started" if the cluster is under heavy load because the script hasn't
    had a chance to start yet. The default timeout is 5 mins, but let's try
    to bump that to 10 mins to see if it helps. It may be that the problem
    is elsewhere, in which case we can revert this.

    Closes: #28 (hopefully)

commit d368b0da122955d18c21de9109ba1ade62914f5f
Author: Jonathan Lebon <jonathan@jlebon.com>
Date:   Thu Apr 8 12:08:24 2021 -0400

    jenkins: turn on LAUNCH_DIAGNOSTICS for durable-task-plugin

    Let's follow what the helper message says and turn this on so that we
    get more logging about what goes wrong when it errors out with "process
    apparently never started".
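
For illustration, both commits amount to two JVM system properties on the Jenkins controller. The sketch below renders them as `-D` flags. It assumes that both settings live under the `org.jenkinsci.plugins.durabletask.BourneShellScript` prefix and that the heartbeat interval is given in seconds (so 10 mins is 600). The helper name `durable_task_flags` is hypothetical, and how the flags actually reach the JVM in this deployment is not shown in this PR.

```python
# Hypothetical helper: render the two durable-task tunables as JVM -D flags.
# Assumption: both properties share the BourneShellScript class-name prefix.
PREFIX = "org.jenkinsci.plugins.durabletask.BourneShellScript"


def durable_task_flags(heartbeat_secs: int = 600,
                       launch_diagnostics: bool = True) -> str:
    """Return a space-separated string of -D flags for the Jenkins JVM."""
    if heartbeat_secs <= 0:
        raise ValueError("heartbeat_secs must be positive")
    props = {
        # Assumed to be in seconds: 600 s = 10 mins (default is 5 mins).
        f"{PREFIX}.HEARTBEAT_CHECK_INTERVAL": str(heartbeat_secs),
        # Extra logging when "process apparently never started" occurs.
        f"{PREFIX}.LAUNCH_DIAGNOSTICS": str(launch_diagnostics).lower(),
    }
    return " ".join(f"-D{key}={value}" for key, value in props.items())


if __name__ == "__main__":
    print(durable_task_flags())
```

The resulting string would typically be appended to whatever Java options variable the Jenkins image reads at startup; that variable is deployment-specific and is left out here on purpose.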

jlebon commented 3 years ago

Will aim to roll this out at the end of the day to avoid disrupting CI.