adoptium / infrastructure

This repo contains all information about machine maintenance.
Apache License 2.0

Feature: use a node pool (labels) for job "Cleanup-Nodes" #2823

Open zdtsw opened 2 years ago

zdtsw commented 2 years ago

https://ci.adoptopenjdk.net/job/Cleanup-Nodes/ is set to use test-packet-ubuntu2004-x64-2

This node is currently offline. Looking back through the build history, it has been offline since at least 3rd Nov: https://ci.adoptopenjdk.net/job/Cleanup-Nodes/56/console So we have had no cleanup on these nodes for at least 2 weeks.

Agent test-packet-ubuntu2004-x64-2 (4 CPUS | 8 GB RAM | 80 Gb SSD | hosted by packet.net)
This agent is offline because Jenkins failed to launch the agent process on it. [See log for more details](https://ci.adoptopenjdk.net/computer/test-packet-ubuntu2004-x64-2/log)

P.S. The source code of this job comes from https://github.com/eclipse-openj9/openj9/blob/master/buildenv/jenkins/jobs/infrastructure/Cleanup-Nodes.groovy Do we have a forked version in our repo, or is it safe to keep using the "master" branch?

smlambert commented 2 years ago

We can use a tagged version to buffer us from new changes if we want, but I believe the OpenJ9 project uses the one from the main branch daily. Further updates to this script are being discussed to help clear out the /tmp directories, as many openjdk tests now leave 'detritus' in that directory (which is outside of the workspace and therefore does not get cleaned up). Related: https://github.com/adoptium/infrastructure/issues/2369 and https://github.com/eclipse-openj9/openj9/pull/16333/

We can set SETUP_LABEL=ci.role.test&&sw.os.linux and use a tagged version of this script (the recent tag openj9-0.35.0 is likely suitable, if we are to pin it).
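
For illustration, here is a minimal Python sketch of why a label expression selects a pool of nodes rather than one fixed machine. It is not how Jenkins implements label matching: it handles only `&&` conjunctions (real Jenkins label expressions also support `||`, `!`, and parentheses), and the inventory, including every node name other than test-packet-ubuntu2004-x64-2 and all of the label assignments, is hypothetical.

```python
def matches(label_expr, node_labels):
    """Return True if the node carries every label in an &&-only expression."""
    required = [part.strip() for part in label_expr.split("&&")]
    return all(label in node_labels for label in required)


def eligible_nodes(label_expr, nodes):
    """Return the names of online nodes whose labels satisfy the expression."""
    return [
        name
        for name, info in nodes.items()
        if info["online"] and matches(label_expr, info["labels"])
    ]


# Hypothetical inventory; label sets and the two "example" nodes are assumptions.
NODES = {
    "test-packet-ubuntu2004-x64-2": {
        "labels": {"ci.role.test", "sw.os.linux"},
        "online": False,  # the offline node described in this issue
    },
    "test-example-linux-1": {
        "labels": {"ci.role.test", "sw.os.linux"},
        "online": True,
    },
    "build-example-linux-1": {
        "labels": {"ci.role.build", "sw.os.linux"},
        "online": True,
    },
}

print(eligible_nodes("ci.role.test&&sw.os.linux", NODES))
# prints ['test-example-linux-1']
```

With a label expression, the job can still be scheduled while one matching node is offline, which is the failure mode that pinning to a single node name runs into.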