jenkinsci / google-compute-engine-plugin

https://plugins.jenkins.io/google-compute-engine/
Apache License 2.0

Option to limit run time of VM #408

Open jglick opened 1 year ago

jglick commented 1 year ago

What feature do you want to see added?

CleanLostNodesWork tries to terminate VMs that no longer seem to be associated with any known Jenkins agent, but this still relies on the Jenkins controller to perform cleanup. The comment at https://github.com/jenkinsci/google-compute-engine-plugin/blob/0157d4ff4dba0e19e7f4c04a1440356ff4adc0ab/src/main/java/com/google/jenkins/plugins/computeengine/ComputeEngineCloud.java#L176-L177 notes that this may not be foolproof. In particular, if you use the configuration-as-code plugin to define a list of clouds, you may well neglect to set instanceId, in which case the ID would be randomly reset every time JCasC is reapplied.

It would be useful to have an option to set .scheduling.maxRunDuration to ensure that an instance is automatically terminated if it has been accidentally left running for an unreasonably long time.
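To make the idea concrete, here is a minimal sketch of the request fragment such an option might produce. It assumes the Compute Engine REST representation, in which `scheduling.maxRunDuration` is an object with a `seconds` field and `scheduling.instanceTerminationAction` selects `STOP` or `DELETE`. The class and method names are hypothetical and not part of the plugin, and choosing `DELETE` is an assumption about what a Jenkins agent VM would want.

```java
import java.time.Duration;

public class MaxRunDurationSketch {
    // Hypothetical helper, not part of the plugin: renders the scheduling
    // fragment of an instance-insert request body as JSON.
    static String schedulingJson(Duration maxRun) {
        if (maxRun.isZero() || maxRun.isNegative()) {
            throw new IllegalArgumentException("maxRun must be positive");
        }
        // int64 values are sent as JSON strings in the REST representation.
        // DELETE is an assumed choice; STOP would leave a stopped VM behind.
        return "{\"scheduling\":{"
                + "\"maxRunDuration\":{\"seconds\":\"" + maxRun.getSeconds() + "\"},"
                + "\"instanceTerminationAction\":\"DELETE\"}}";
    }

    public static void main(String[] args) {
        // Example: cap every agent VM at 12 hours of run time.
        System.out.println(schedulingJson(Duration.ofHours(12)));
    }
}
```

In the plugin itself this would presumably be set through the client library's Scheduling model rather than raw JSON, if the library version in use exposes these fields.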

Even better, set .scheduling.terminationTime to a day in the future when the instance is created, and then replace CleanLostNodesWork with an hourly task to look for all connected GCE agents and update their termination time to be a day from then. This would guarantee that any abandoned VM will be terminated in a timely fashion no matter what happens to Jenkins, without imposing a particular time limit on how long a VM can legitimately be used as an agent if you happen to have an especially long build.
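The lease-renewal scheme described above could be sketched roughly as follows. This only computes the new RFC 3339 deadline for each connected agent. The class name, method names, and the idea of passing VM names as plain strings are all hypothetical. The actual call to update `.scheduling.terminationTime` is left out, and whether the API permits changing that field on a running instance would need to be checked.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TerminationLeaseSketch {
    // Formats "now + lease" as an RFC 3339 UTC timestamp, truncated to seconds.
    static String terminationTime(Instant now, Duration lease) {
        return DateTimeFormatter.ISO_INSTANT.format(
                now.plus(lease).truncatedTo(ChronoUnit.SECONDS));
    }

    // Hypothetical periodic task body: for every VM that currently backs a
    // connected agent, compute the renewed deadline. VMs that Jenkins has
    // lost track of are simply not renewed and expire on their own.
    static Map<String, String> renewals(
            Collection<String> connectedVmNames, Instant now, Duration lease) {
        String deadline = terminationTime(now, lease);
        Map<String, String> result = new TreeMap<>();
        for (String name : connectedVmNames) {
            result.put(name, deadline);
        }
        return result;
    }

    public static void main(String[] args) {
        // Example: an hourly run that pushes each deadline a day out.
        Map<String, String> plan = renewals(
                List.of("agent-b", "agent-a"), Instant.now(), Duration.ofDays(1));
        plan.forEach((vm, deadline) -> System.out.println(vm + " -> " + deadline));
    }
}
```

The point of the design is that the controller only ever extends leases. If Jenkins crashes or forgets an instance, nothing needs to run for the VM to be cleaned up.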

Upstream changes

I checked https://github.com/jenkinsci/google-compute-engine-plugin/blob/0157d4ff4dba0e19e7f4c04a1440356ff4adc0ab/src/main/java/com/google/jenkins/plugins/computeengine/InstanceConfiguration.java#L497-L501, but the current client library seems to lack support for these scheduling fields. Perhaps it is too old? Google seems to have published multiple Java libraries, and it is not clear which one is preferred and supported. I recommend updating the client library version here and in dependencies such as google-oauth-plugin.

jglick commented 5 months ago

See https://github.com/jenkinsci/kubernetes-plugin/pull/1543 for an example of a similar feature.