Open AdamBrousseau opened 2 weeks ago
OpenJ9 builds require no libraries (.so files), only a subset of CUDA header files (see the COPY --from=nvidia/cuda:9.0-devel-ubuntu16.04 ... line above).
Test machines only need the CUDA driver (libcuda.so and any requisite kernel module) and the runtime library (libcudartNN.so).
Right, sorry. Header files not libs.
The line in the pipeline where the docker image gets built: https://github.com/ibmruntimes/ci-jenkins-pipelines/blob/d439f31275a0da6510ca946e91ea0738742df368/pipelines/build/common/openjdk_build_pipeline.groovy#L2428C44-L2428C188
Details:
OpenJ9 compiles require an Nvidia CUDA lib (or a few) on Linux (ppc64le and x64) in order to compile with CUDA support. There is a mechanism in the Adopt pipelines to add on to the Adopt build image by rebuilding it with the libs copied from the nvidia container [1][2][3]. I believe this is a different requirement from the tests needing the CUDA toolkit install [4] (related: #3581). I suspect that when the PBs were set up and the build scripts were originally written, the two requirements were thought to be one and the same, or there was enough confusion that we ended up adding a skip-tag to the cuda role when we build the docker build images [5] in order to minimize the image size.
My proposal is that another PB is created that adds just the few lib(s) we need for compile machines/containers, so that we don't need to skip it for the docker builds. This would allow us to drop the workaround in the build pipelines. It would also allow us to use the build images in our other set of OpenJ9 pipeline builds without having to build in this extra mechanism to add the lib(s) on the fly. At the moment we maintain one of our own containers with it built in, but we'd like to switch over to Adopt's container.
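For illustration, the on-the-fly mechanism described above could look roughly like the following multi-stage Dockerfile. This is a minimal sketch, not the contents of the cuda.dockerfile linked in [2]: the BASE_IMAGE build argument is a hypothetical name, and the header path inside the NVIDIA image is an assumption that should be checked against the image before relying on it.

```dockerfile
# Hypothetical build argument: the existing build image to extend.
ARG BASE_IMAGE

# Stage 1: the NVIDIA devel image named in this thread, used only as a source of files.
FROM nvidia/cuda:9.0-devel-ubuntu16.04 AS cuda

# Stage 2: start from the existing build image and copy in only what the compile needs.
FROM ${BASE_IMAGE}

# Assumed location of the CUDA files inside the NVIDIA image; nothing else is copied.
COPY --from=cuda /usr/local/cuda-9.0/include /usr/local/cuda-9.0/include
```

The base image would be supplied at build time with --build-arg. Because only one directory is copied out of the first stage, the resulting image grows by the size of those files rather than by the size of the full toolkit.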
cc @keithc-ca
Slack discussion with @sxa on this topic: https://adoptium.slack.com/archives/C09NW3L2J/p1717421779597569
[1] https://github.com/ibmruntimes/ci-jenkins-pipelines/blob/191f1ffe1fdc96b94a15035e1fd5361ce7659ce7/pipelines/jobs/configurations/jdk11u_pipeline_config.groovy#L25
[2] https://github.com/ibmruntimes/ci-jenkins-pipelines/blob/ibm/pipelines/build/dockerFiles/cuda.dockerfile
[3] https://ci.adoptium.net/job/build-scripts/job/jobs/job/jdk11u/job/jdk11u-linux-x64-openj9/1410/consoleFull
[4] https://github.com/adoptium/infrastructure/blob/master/ansible/playbooks/AdoptOpenJDK_Unix_Playbook/roles/NVidia_Cuda_Toolkit/tasks/main.yml
[5] https://github.com/adoptium/infrastructure/blob/c96f2d57b511e888cd465e01a7433199b776ab73/ansible/docker/Dockerfile.CentOS7#L15