Open timja opened 5 years ago
This issue seems to be caused by https://github.com/jenkinsci/azure-vm-agents-plugin/blob/d83f4153627fc6f64c166217bac59e27876ee984/src/main/java/com/microsoft/azure/vmagent/remote/AzureVMAgentSSHLauncher.java#L250 . This method appears to hang forever, and the log statements below it are never reached. This needs more investigation because the method calls into Jenkins core code.
by jieshe
Hi Jie, do you have an update?
by bel_sander
My assumption here seems to be wrong. The issue does not seem to hang in the setChannel method. Any help or suggestions would be welcome.
by jieshe
Jie Shen I have the issue you describe.
We have agents configured to shut down after some idle time. When an agent needs to be reused, it comes back online but never leaves the suspended state. From a thread dump, it appears to be blocked in https://github.com/jenkinsci/azure-vm-agents-plugin/blob/d83f4153627fc6f64c166217bac59e27876ee984/src/main/java/com/microsoft/azure/vmagent/remote/AzureVMAgentSSHLauncher.java#L250 calling https://github.com/jenkinsci/branch-api-plugin/blob/branch-api-2.5.4/src/main/java/jenkins/branch/WorkspaceLocatorImpl.java#L534:
Thread 31073: (state = BLOCKED)
 - jenkins.branch.WorkspaceLocatorImpl$Collector.onOnline(hudson.model.Computer, hudson.model.TaskListener) @bci=27, line=534 (Interpreted frame)
 - hudson.slaves.SlaveComputer.setChannel(hudson.remoting.Channel, java.io.OutputStream, hudson.remoting.Channel$Listener) @bci=662, line=698 (Interpreted frame)
 - hudson.slaves.SlaveComputer.setChannel(java.io.InputStream, java.io.OutputStream, java.io.OutputStream, hudson.remoting.Channel$Listener) @bci=82, line=432 (Interpreted frame)
 - com.microsoft.azure.vmagent.remote.AzureVMAgentSSHLauncher.launch(hudson.slaves.SlaveComputer, hudson.model.TaskListener) @bci=977, line=250 (Interpreted frame)
 - hudson.slaves.SlaveComputer$1.call() @bci=88, line=294 (Interpreted frame)
 - jenkins.util.ContextResettingExecutorService$2.call() @bci=18, line=46 (Compiled frame)
 - jenkins.security.ImpersonatingExecutorService$2.call() @bci=17, line=71 (Interpreted frame)
 - java.util.concurrent.FutureTask.run() @bci=42, line=266 (Compiled frame)
 - java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=95, line=1149 (Interpreted frame)
 - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=624 (Interpreted frame)
 - java.lang.Thread.run() @bci=11, line=748 (Interpreted frame)
by michaelburtin
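One way to narrow down a dump like the one above is to check whether the stuck thread is waiting on a monitor and, if so, which thread owns it. The following is a minimal sketch using only the standard java.lang.management API. The class name BlockedThreadReport and its method are hypothetical and are not part of the plugin or of Jenkins core, and the code would have to run inside the same JVM as the threads being inspected.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of any Jenkins plugin.
public class BlockedThreadReport {

    /**
     * Returns one line per thread whose Java state is BLOCKED, naming the
     * monitor it is waiting for and the thread that currently owns it.
     */
    public static List<String> describeBlockedThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Only request lock details when the running JVM supports them.
        ThreadInfo[] infos = mx.dumpAllThreads(
                mx.isObjectMonitorUsageSupported(),
                mx.isSynchronizerUsageSupported());

        List<String> report = new ArrayList<>();
        for (ThreadInfo info : infos) {
            if (info == null || info.getThreadState() != Thread.State.BLOCKED) {
                continue;
            }
            report.add(String.format("%s is BLOCKED on %s held by %s",
                    info.getThreadName(),
                    info.getLockName(),
                    info.getLockOwnerName()));
        }
        return report;
    }

    public static void main(String[] args) {
        // Prints nothing when no thread in this JVM is currently blocked.
        for (String line : describeBlockedThreads()) {
            System.out.println(line);
        }
    }
}
```

If the launcher thread shows up in such a report, the owner thread's own stack trace is the next thing to look at, since that thread is what keeps the monitor from being released.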
When using the idle retention strategy with the shutdown-only option set, agents sometimes fail to connect successfully and then stay suspended indefinitely.
Some useful logs from @Sander Bel:
[JENKINS-57447] created by jieshe