scala / scala-jenkins-infra

A Chef cookbook that manages Scala's CI infrastructure.
https://scala-ci.typesafe.com
Apache License 2.0

sbt generates enormous log files if compiler is broken #182

Closed adriaanm closed 8 years ago

adriaanm commented 8 years ago

example: https://scala-ci.typesafe.com/view/scala-2.12.x/job/scala-2.12.x-validate-test/2831/console

install https://wiki.jenkins-ci.org/display/JENKINS/Logfilesizechecker+Plugin as a backstop?

adriaanm commented 8 years ago

note that this affects both the Jenkins master (the log file size checker plugin could serve as a backstop) and the worker (use something Linux-level? mount /tmp on ephemeral storage? ulimit?)
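
A minimal sketch of the `ulimit` idea for the worker side, assuming a bash shell, where `ulimit -f` counts 1024-byte blocks (strictly POSIX shells count 512-byte blocks). `run_with_file_cap` is a hypothetical helper name, not something from this repo. It caps the size of any single file written by the wrapped command and its children, so it could help with a runaway file in /tmp on the worker, but it would not by itself limit the console log that Jenkins stores on the master.

```shell
# Hypothetical helper: run a command with a cap on the size of any
# single file it (or its children) may write.
# $1: limit in `ulimit -f` blocks (1024 bytes each in bash)
# remaining arguments: the command to run
run_with_file_cap() {
  limit_blocks=$1
  shift
  (
    # The limit is set inside a subshell so the calling shell is unaffected.
    ulimit -f "$limit_blocks" || exit 1
    "$@"
  )
}
```

For example, `run_with_file_cap 30720 sbt test` would allow files up to roughly 30M under bash; `sbt test` is only a placeholder for whatever the job actually runs. A process that writes past the limit receives SIGXFSZ and is normally terminated.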

adriaanm commented 8 years ago

/cc @smarter, @DarkDimius -- I've installed a plugin that aborts jobs when log files get over 15M in size. I checked and 99% of jobs are under 10M.

adriaanm commented 8 years ago

I'm feeling generous -- will up the cap to 30M 😎

(24 out of the current 35K jobs are over that threshold)

adriaanm commented 8 years ago

At the current average of 2M/log, this also means we're at about 2/3 of the capacity of the volume on which these logs are stored... The joys of system admin
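
To re-check figures like the average log size, something along these lines could be run on the master. This is a sketch: `log_stats` is a made-up helper name, and the directory layout assumed here is `<jobs dir>/<job>/builds/<number>/log`.

```shell
# Hypothetical helper: print "<count> <total_bytes> <average_bytes>"
# for all build logs under a jobs directory.
# $1: jobs directory
log_stats() {
  find "$1" -path '*/builds/*/log' -type f -exec wc -c {} + |
    awk '
      # wc prints a "total" line whenever it is given several files; skip it.
      $2 != "total" { n++; sum += $1 }
      END {
        if (n) printf "%d %.0f %.0f\n", n, sum, sum / n
        else   print "0 0 0"
      }'
}
```

Usage would be `log_stats /var/lib/jenkins/jobs`. Multiplying the count by the average gives the total space used by logs, which can then be compared against the volume size reported by `df`.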

SethTisue commented 8 years ago

we got a 5G log file today in /var/lib/jenkins/jobs/scala-2.12.x-validate-test/builds/2870 . perhaps Chef undid the installation of the plugin

SethTisue commented 8 years ago

ah, and I see that (as Adriaan alluded to above) this doesn't only cause problems on dedicated Jenkins partitions as old logs get archived — it wedges the worker even during the run, because /tmp fills up

SethTisue commented 8 years ago

https://scala-ci.typesafe.com/pluginManager/installed lists "build log file size checker plugin", so the plugin remains installed.

but e.g. https://scala-ci.typesafe.com/view/scala-2.12.x/job/scala-2.12.x-validate-test/configure had its "Abort the build if its log file size is too big" checkbox unchecked; perhaps that was Chef's doing

SethTisue commented 8 years ago

checking that checkbox, and the "fail the build" checkbox, adds this to config.xml:

    <hudson.plugins.logfilesizechecker.LogfilesizecheckerWrapper plugin="logfilesizechecker@1.2">
      <setOwn>false</setOwn>
      <maxLogSize>0</maxLogSize>
      <failBuild>true</failBuild>
    </hudson.plugins.logfilesizechecker.LogfilesizecheckerWrapper>
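
To see whether the wrapper is missing from other jobs (for example if something reset the checkbox), a quick scan of the job configs might look like this. A sketch only: `jobs_without_checker` is a hypothetical helper, and it just looks for the element name shown above rather than parsing the XML.

```shell
# Hypothetical helper: list job config.xml files that do not mention
# the log file size checker wrapper.
# $1: jobs directory (one subdirectory per job, each with a config.xml)
jobs_without_checker() {
  for cfg in "$1"/*/config.xml; do
    # Skip the unexpanded glob when no job directories exist.
    [ -f "$cfg" ] || continue
    grep -q 'LogfilesizecheckerWrapper' "$cfg" || printf '%s\n' "$cfg"
  done
}
```

Running `jobs_without_checker /var/lib/jenkins/jobs` would print one path per job whose config lacks the wrapper.
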

SethTisue commented 8 years ago

might other jobs be affected by the default 30M limit? Adriaan already checked, but I want to have a look too:

    % cd /var/lib/jenkins/jobs
    % du -s */builds/*/log | sort -nr | less
    9622612 scala-2.12.x-validate-test/builds/2868/log
    9525100 scala-2.12.x-validate-test/builds/2860/log
    8170156 scala-2.12.x-validate-test/builds/2870/log
    491460  dotty-master-validate-partest/builds/599/log
    409016  dotty-master-validate-junit/builds/1758/log
    299040  dotty-master-validate-partest/builds/1760/log
    ...

so there have been a few abnormally large Dotty logs (up to about 0.5G; the du numbers are in 1K blocks), but only a few, and the sizes drop off very fast after that, which matches Adriaan's previous finding
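
To list only the logs that would trip the 30M cap, a variant using `find` could look like this. It is a sketch: `logs_over` is a hypothetical helper name, and the threshold is passed in bytes (30M taken as 30 × 1024 × 1024 = 31457280, which is an assumption about how the plugin counts megabytes).

```shell
# Hypothetical helper: list build logs strictly larger than a threshold.
# $1: jobs directory
# $2: threshold in bytes
logs_over() {
  # "-size +Nc" matches files whose size is more than N bytes.
  find "$1" -path '*/builds/*/log' -type f -size +"$2"c
}
```

For example, `logs_over /var/lib/jenkins/jobs 31457280 | wc -l` would count the builds whose log exceeds the cap.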

note that the build log is separate from any generated "artifacts", so e.g. 2.11.x-integrate-ide generates 68M of test results but it goes in a separate file, not the main log. I assume the plugin doesn't affect that

SethTisue commented 8 years ago

it might still be good to address this at the sbt level, rather than relying on the log file size limit to abort the job. but a Jenkins-wide log file size limit seems like a good thing to have regardless

SethTisue commented 8 years ago

the log Adriaan linked to (build 2831) shows what one of these log files looks like, but isn't full-sized (just 508K). if it matters (probably not?), an example of a complete, gargantuan 7.8G log file is https://scala-ci.typesafe.com/job/scala-2.12.x-validate-test/2870/console . the full log is of course much too large to view in-browser, but you can find it on jenkins-master at /var/lib/jenkins/jobs/scala-2.12.x-validate-test/builds/2870/log
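
For a file that size, looking at just the two ends is usually enough to see what is being repeated. A small sketch, where `peek_log` is a hypothetical helper name:

```shell
# Hypothetical helper: show the first and last N lines of a large log
# without paging through the whole file.
# $1: log file
# $2: number of lines from each end (defaults to 20)
peek_log() {
  n=${2:-20}
  head -n "$n" "$1"
  printf '%s\n' '[...]'
  tail -n "$n" "$1"
}
```

On jenkins-master this could be called as `peek_log /var/lib/jenkins/jobs/scala-2.12.x-validate-test/builds/2870/log 50`.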