jenkins-infra / helpdesk

Open your Infrastructure related issues here for the Jenkins project
https://github.com/jenkins-infra/helpdesk/issues/new/choose

Add JDK21 agents (build) #4124

Open dduportal opened 4 weeks ago

dduportal commented 4 weeks ago

Goal: provide an agent template on each controller (where needed) allowing users to have a default JDK21 when running their pipelines, regardless of the JDK used for running the agent (see https://github.com/jenkins-infra/helpdesk/issues/4121 about agent JDK runtime)

Task list:

dduportal commented 3 weeks ago

Update: #4122 indicates that we need to find a solution for trusted.ci.jenkins.io:

* We use an Azure VM template, which allows running an initialization script when starting the agent process but does not allow setting custom environment variables for the agent process

  * Let's try setting JAVA_HOME to JDK21 on a test template => if it works, we can define `maven-xx` labels on trusted.ci to ensure it stays coherent with ci.jenkins.io.
  * Otherwise, we'll have to update the pipelines so they define their own JAVA_HOME env. var.

* Note: we'll have to do the same with EC2 in the future though (once we exhaust Azure credits and start using AWS credits)
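
The fallback from the second sub-bullet (pipelines defining their own JAVA_HOME) could look roughly like the shell sketch below, meant to be sourced or pasted into a pipeline `sh` step. It assumes JDKs are installed under `/opt/jdk-<major>` as in the templates discussed in this issue; the `use_jdk` function name and the `JDK_ROOT` override are hypothetical and only there for illustration.

```shell
#!/bin/sh
# Hypothetical helper: select a JDK for the current shell session only.
# Assumes JDKs live in ${JDK_ROOT}/jdk-<major> (JDK_ROOT defaults to /opt).
use_jdk() {
  jdk_major="$1"
  jdk_home="${JDK_ROOT:-/opt}/jdk-${jdk_major}"

  # Fail early instead of exporting a JAVA_HOME that does not exist.
  if [ ! -d "${jdk_home}" ]; then
    echo "JDK ${jdk_major} not found at ${jdk_home}" >&2
    return 1
  fi

  # Put the selected JDK first so `java` resolves to it.
  JAVA_HOME="${jdk_home}"
  PATH="${JAVA_HOME}/bin:${PATH}"
  export JAVA_HOME PATH
}
```

A build step would then call something like `use_jdk 21 && java -version` before invoking the build tool. The change only lasts for that shell session, which is why the template-level solution is preferable when it works.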

Tested with success using Linux Azure VM agents:

agentWorkspace: "/home/jenkins/agent"
builtInImage: "Ubuntu 22.04 LTS"
credentialsId: "jenkinsvmagents-userpass"
diskType: "managed"
doNotUseMachineIfInitFails: true
encryptionAtHost: true
ephemeralOSDisk: true
executeInitScriptAsRoot: true
existingStorageAccountName: "cijenkinsioagentssub"
imageReference:
  galleryImageDefinition: "jenkins-agent-ubuntu-22.04-amd64"
  galleryImageVersion: "1.70.1"
  galleryName: "prod_packer_images"
  galleryResourceGroup: "prod-packer-images"
  gallerySubscriptionId: "<redacted>"
imageTopLevelType: "advanced"
initScript: |
  #!/bin/sh
  set -eux

  # Setup Datadog service
  (
    systemctl stop datadog-agent.service
    mkdir -p /var/log/datadog /etc/datadog-agent
    sed 's/api_key:.*/api_key: <redacted>/' /etc/datadog-agent/datadog.yaml.example > /etc/datadog-agent/datadog.yaml
    sed -i 's/# site:.*/site: datadoghq.com/' /etc/datadog-agent/datadog.yaml
    chown dd-agent:dd-agent /etc/datadog-agent/datadog.yaml
    chmod 640 /etc/datadog-agent/datadog.yaml
    chown dd-agent:dd-agent /var/log/datadog
    chmod 770 /var/log/datadog
    systemctl daemon-reload
    systemctl enable datadog-agent.service
    systemctl start datadog-agent.service
  ) 2>&1 | tee /var/log/agent-init-datadog.log
  # Setup Jenkins Agent Service
  (
    # Argument provided by the Azure-VM plugin
    export JENKINS_URL="^${1}" # Always ends with a '/'
    export AGENT_NAME="^${2}"
    export AGENT_SECRET="^${3}"

    export USER=jenkins
    export AGENT_WORKDIR='/home/jenkins/agent'
    export AGENT_JAR="^${AGENT_WORKDIR}/agent.jar"
    export AGENT_SECRETFILE="^${AGENT_WORKDIR}/agent-secret"
    export AGENT_URL="^${JENKINS_URL}computer/^${AGENT_NAME}/jenkins-agent.jnlp"
    export JENKINS_JAVA_OPTS='-XX:+PrintCommandLineFlags'
    export ARTIFACT_CACHING_PROXY_PROVIDER='azure'
    export JAVA_HOME='/opt/jdk-21'
    export JENKINS_JAVA_BIN='/opt/jdk-17/bin/java'

    mkdir -p "^${AGENT_WORKDIR}"
    chown "^${USER}:^${USER}" "^${AGENT_WORKDIR}"
    curl --silent --show-error --location --output "^${AGENT_JAR}" "^${JENKINS_URL}jnlpJars/agent.jar"
    touch "^${AGENT_SECRETFILE}"
    echo "^${AGENT_SECRET}" > "^${AGENT_SECRETFILE}"
    cat <<- EOF >/etc/systemd/system/jenkins-agent.service
    [Unit]
    Description=Jenkins Inbound Agent
    Wants=network.target
    After=network.target

    [Service]
    ExecStart=^${JENKINS_JAVA_BIN} ^${JENKINS_JAVA_OPTS} -jar ^${AGENT_JAR} -jnlpUrl ^${AGENT_URL} -secret @^${AGENT_SECRETFILE} -workDir ^${AGENT_WORKDIR}
    User=^${USER}
    WorkingDirectory=^${AGENT_WORKDIR}
    Restart=on-failure
    RestartSec=10
    Environment="JAVA_HOME=^${JAVA_HOME}"
    Environment="ARTIFACT_CACHING_PROXY_PROVIDER=^${ARTIFACT_CACHING_PROXY_PROVIDER}"
    Environment="PATH=^${JAVA_HOME}/bin:/home/jenkins/.asdf/shims:/home/jenkins/.asdf/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin"

    [Install]
    WantedBy=multi-user.target
  EOF

    systemctl daemon-reload
    systemctl enable jenkins-agent
    systemctl start jenkins-agent || systemctl status jenkins-agent
  ) 2>&1 | tee /var/log/agent-init-jenkins.log

  # Remove jenkins user from sudoers
  rm -f /etc/sudoers.d/90-cloud-init-users
labels: "vm-maven-21"
launcher: "inbound"
licenseType: "Classic"
location: "East US 2"
maxVirtualMachinesLimit: 5
maximumDeploymentSize: 10
noOfParallelJobs: 1
osDiskSize: 150
osType: "Linux"
retentionStrategy: "azureVMCloudOnce"
storageAccountNameReferenceType: "existing"
storageAccountType: "Standard_LRS"
subnetName: "public-jenkins-sponsorship-vnet-ci_jenkins_io_agents"
templateDesc: "Test by DDU to have dynamically provisioned Ubuntu 22.04 LTS\
  \ machine using JDK21 for build tools"
templateName: "ubuntu-22-jdk21-ddu"
usageMode: NORMAL
usePrivateIP: true
virtualMachineSize: "Standard_D4ads_v5"
virtualNetworkName: "public-jenkins-sponsorship-vnet"
virtualNetworkResourceGroupName: "public-jenkins-sponsorship"

Also a success with the SSH launcher:

# Major version only: the /opt/jdk-<version> prefix is added below
export DEFAULT_JDK=21
# 2000 is higher than the 1000 provided in template
update-alternatives --install /usr/bin/java java "/opt/jdk-${DEFAULT_JDK}/bin/java" 2000

# Last line: takes precedence over the already defined `JAVA_HOME`
echo "JAVA_HOME=/opt/jdk-${DEFAULT_JDK}" >> /etc/environment
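
To check which value ends up effective when `/etc/environment` holds several `JAVA_HOME` entries, a small helper can print the last one, matching the "last line takes precedence" behaviour the script above relies on. This is only a sketch: the function name `effective_java_home` is hypothetical, and it handles only plain `JAVA_HOME=value` lines, optionally wrapped in double quotes.

```shell
#!/bin/sh
# Hypothetical helper: print the last JAVA_HOME value found in an
# environment file (defaults to /etc/environment).
# Prints nothing if the file has no JAVA_HOME entry.
effective_java_home() {
  env_file="${1:-/etc/environment}"
  # Keep only JAVA_HOME assignments, take the last one,
  # then strip the key and any surrounding double quotes.
  grep '^JAVA_HOME=' "${env_file}" \
    | tail -n 1 \
    | sed -e 's/^JAVA_HOME=//' -e 's/^"\(.*\)"$/\1/'
}
```

Running `effective_java_home` on an agent after the init script has completed is a quick way to confirm that the appended line is the one in effect before starting a build.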

=> We can start the work to define maven-17 and maven-21 labels for Linux Ubuntu VM templates on trusted.ci and cert.ci immediately (cc @smerle33).

=> With Windows VMs, I tried the inbound launcher without success: I can't figure out how to update env. vars in the current working session (e.g. in the init script), and the service launcher does not seem to pick up my value (ping @MarkEWaite @timja: have you already done this kind of thing?). Ref. the init PowerShell code in https://github.com/jenkins-infra/jenkins-infra/blob/dc673c579eddc81c604638b53b2e6d1f506a0dd7/dist/profile/templates/jenkinscontroller/casc/clouds.yaml.erb#L24-L50 which starts the inbound service.

timja commented 3 weeks ago

> Also a success with the SSH launcher

You can also simply set the javaPath variable in JCasC for the SSH launcher.

> inbound launcher without success for Windows

Your code looks fine. I'd log in to the agent and cat the file to see that it's been replaced appropriately, then check the logs for the service (along with trying to start it manually if needed).

dduportal commented 3 weeks ago

> Also a success with the SSH launcher
>
> You can also simply set the javaPath variable in JCasC for SSH launcher

We already do this (ref. https://github.com/jenkins-infra/jenkins-infra/blob/dc673c579eddc81c604638b53b2e6d1f506a0dd7/dist/profile/templates/jenkinscontroller/casc/clouds.yaml.erb#L153). But the JCasC javaPath attribute is only used to specify a JDK binary for the agent runtime (will be tracked in https://github.com/jenkins-infra/helpdesk/issues/4121).

Here, we want to set the default Java available for builds, which is set by 2 variables:

With the Kubernetes plugin or the EC2 plugin, we define the environment variables in the Jenkins Cloud config UI and the variables are applied to the agent.jar process when it starts. But with AzureVM, this feature does not exist as far as I can tell: we rely on the custom init script instead.

On Linux, we have solutions because the init script uses systemd (inbound launcher) or update-alternatives (Ubuntu) to solve the problem => no complicated env. loading path to manage.

But on Windows I'm not sure which solution to take: we usually set the env. vars in the registry, which requires a reboot/restart/reload/re-login for the change to be taken into account. As we tend to use packer and Docker, this problem does not exist there (we build the image, and when it is instantiated, the new env. is loaded). => In the case here, the init script of the AzureVM plugin does not re-log/reload, so the new env. var values are not picked up by the agent.jar process.

timja commented 2 weeks ago

Have a look at this for Windows: https://github.com/winsw/winsw/blob/v3/docs/xml-config-file.md

dduportal commented 2 weeks ago

> Have a look at this for Windows: https://github.com/winsw/winsw/blob/v3/docs/xml-config-file.md

Coool, that will do the trick!