kernelci / kernelci-build-staging

(DEPRECATED: check kernelci-core-staging) KernelCI build scripts for the staging Jenkins instance
GNU Lesser General Public License v2.1

T9827 kernel build trigger #39

Closed gctucker closed 6 years ago

gctucker commented 6 years ago

Convert kernel-defconfig-creator into a Jenkins Pipeline job, build-trigger.jpl, and improve build.jpl:
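
For readers not familiar with the setup, a trigger pipeline of this kind typically lists the configurations for a tree and fires one downstream build job per entry. The sketch below only illustrates that pattern; the downstream job name ("kernel-build"), its parameters and the configuration list are assumptions, not the contents of build-trigger.jpl:

```groovy
// Illustrative sketch only: job and parameter names are assumptions,
// not the actual build-trigger.jpl code.
node("trigger") {
    stage("Trigger builds") {
        // Hypothetical per-tree configuration list; the real job derives this.
        def configs = ["arm64:defconfig", "arm:multi_v7_defconfig", "x86:allmodconfig"]
        def builds = [:]

        for (String item : configs) {
            def parts = item.tokenize(":")
            def arch = parts[0]
            def defconfig = parts[1]
            def name = arch + "-" + defconfig
            builds[name] = {
                build(job: "kernel-build",  // placeholder downstream build job
                      parameters: [string(name: "ARCH", value: arch),
                                   string(name: "DEFCONFIG", value: defconfig)])
            }
        }
        parallel(builds)  // one downstream build per defconfig
    }
}
```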

gctucker commented 6 years ago

@ana The way I'm using the shared Jenkins Pipeline library feature is more explicit here than what you've done in the Debian rootfs jobs. Do you think you can move your code into org.kernelci.build.debian?
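
For context, the "explicit" style referred to here means importing classes from the shared library's src/ tree rather than relying on implicitly loaded vars. A minimal sketch of what the Debian rootfs code could look like in that style, with a hypothetical library name, class and method (only the package org.kernelci.build.debian comes from the discussion above):

```groovy
// Sketch only: the library name, class and method are hypothetical examples
// of the explicit import style, not real kernelci code.
@Library('kernelci-pipeline') _
import org.kernelci.build.debian.RootfsBuild

node("debian") {
    stage("Debian rootfs") {
        def rootfs = new RootfsBuild(this)  // pass the script so the class can call steps
        rootfs.create("stretch", "armhf")   // hypothetical method
    }
}
```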

mattface commented 6 years ago

Before we get too far into the pipeline world, can we improve the logging output? Currently I find it almost impossible to know what section is being built where. For example, can we prefix log lines with the build node and executor?

Does this mean build-trigger will be executing for the entire time it takes to build every defconfig for every arch? If it has to wait for all the builds per tree to finish, won't we be blocking a lot of executors?
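
On the log-prefix request above: Jenkins already exposes NODE_NAME and EXECUTOR_NUMBER as environment variables, so a pipeline could tag its own output along these lines (a sketch of the idea, not existing code):

```groovy
// Sketch of prefixing output with the node and executor it runs on;
// NODE_NAME and EXECUTOR_NUMBER are standard Jenkins environment variables.
node {
    def tag = "[${env.NODE_NAME} #${env.EXECUTOR_NUMBER}]"
    stage("Build") {
        echo("${tag} starting defconfig build")
        // Shell output can be prefixed the same way, e.g. by piping through sed:
        sh("make ARCH=arm64 defconfig 2>&1 | sed \"s|^|${tag} |\"")
    }
}
```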

mattface commented 6 years ago

Also, sorry this will need rebasing as I rebased kernelci-build-staging on kernelci-build...

khilman commented 6 years ago

+1 on the logging improvements. Current logs are very difficult to decipher.

gctucker commented 6 years ago

This does solve the logging issue by running one build per job. I'm running this on staging now to verify it builds the same things as the current defconfig-creator.

Yes, the idea is to have one top-level pipeline job per kernel tree. It won't take many executors, and it has many advantages. Each tree creates tens or hundreds of build jobs, so I doubt in practice we'll need more than 2 trees being built in parallel to utilise all the builders. But we can easily have 10 executors for the top-level nodes, and that could keep about 1000 builders busy continuously.

One advantage is that it simplifies the logic of kernel-arch-complete: there's no need to hard-code 4 archs and check for temporary files, as the top-level job will just move on to the next stage when all the builds are complete. Going further, it could even trigger boot jobs as soon as builds are ready, and get the boot report as soon as all the boots have completed (or timed out).

I can send an email to explain what I have in mind in more detail. It's nothing too special, really just what Pipeline jobs were designed for.
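
A rough sketch of that flow, one top-level job per tree that fans out the builds, starts boots as individual builds finish, and only moves to the reporting stage once everything is done. The job names ("kernel-build", "boot-test"), parameters and defconfig list are assumptions for illustration, not the actual kernelci jobs:

```groovy
// Sketch of the per-tree flow described above; names are assumptions.
node("trigger") {
    def branches = [:]

    stage("Build and boot") {
        for (String defconfig : ["defconfig", "allmodconfig", "allnoconfig"]) {
            def config = defconfig              // per-iteration copy for the closure
            branches[config] = {
                def buildRun = build(job: "kernel-build",
                                     parameters: [string(name: "DEFCONFIG", value: config)],
                                     propagate: false)   // carry on even if one build fails
                if (buildRun.result == "SUCCESS") {
                    // Boot jobs can start as soon as this particular build is ready,
                    // without waiting for the other defconfigs.
                    build(job: "boot-test",
                          parameters: [string(name: "DEFCONFIG", value: config)],
                          propagate: false)
                }
            }
        }
        parallel(branches)
    }

    stage("Report") {
        // Reached only once every branch above is done, so there is no need for
        // the hard-coded arch count and temporary files used by kernel-arch-complete.
        echo("all builds and boots for this tree are complete; send the reports")
    }
}
```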

gctucker commented 6 years ago

@mattface No worries about the rebase: staging and production had the same code anyway, so that's a no-op. I'll rebase this PR when I'm done testing it.

ana commented 6 years ago

@gctucker Fully agree about migrating the Debian rootfs code into org.kernelci.build.debian. I'll work on a PR once this has been merged.

About the logging output, regardless of other improvements it would be very helpful to install the Blue Ocean plugin. It makes visualising the pipelines easier, and you can easily see the logs of each job on its own.

gctucker commented 6 years ago

Tested on staging with lsk-4.4-rt, as it's a special corner case with defconfigs. I got 187 builds including 7 failures: https://staging.kernelci.org/build/lsk/branch/kernelci-test/kernel/kernelci-test-072-1-g542a5687363a/ which is 6 short of what was run on production: https://kernelci.org/build/lsk/branch/linux-linaro-lsk-v4.4-rt/kernel/lsk-v4.4-18.06-rt-108-g811191443dc5/

So I'll now try to see whether the defconfigs were not created as they should have been, or whether the jobs failed to run. It seems to be working well overall; let me know if you have further comments.

gctucker commented 6 years ago

Tested again with linux-linaro-lsk-v4.4-android, as it goes through special corner cases (LSK Android configs): https://staging.kernelci.org/build/lsk/branch/kernelci-test/kernel/kernelci-test-074-4-g084fff87b0ee/ Same build on production, showing the same results: https://kernelci.org/build/lsk/branch/linux-linaro-lsk-v4.4-android/kernel/lsk-v4.4-18.06-android-108-g7a3f6e8d2096/

The previous attempt showed a couple of builds failing; this has now been resolved by passing this parameter to Jenkins: -Dorg.jenkinsci.plugins.durabletask.BourneShellScript.HEARTBEAT_CHECK_INTERVAL=7200

This is to work around a known issue in Jenkins: https://issues.jenkins-ci.org/browse/JENKINS-50379

Thanks @mattface for fixing this; please do the same on production before we get this deployed.