Closed by timja 7 years ago
Please provide Jenkins system log for the timestamp of the issue +/- 5 minutes
Failing code is here: https://github.com/jenkinsci/jenkins/blob/71cbe0cc7c601c04509faa618b23194335288fee/core/src/main/java/jenkins/slaves/DefaultJnlpSlaveReceiver.java#L50-L53
From what I see the code does not process ComputerLauncherFilter or DelegatingComputerLauncher correctly, and it causes failures of plugins like Slave Setup Plugin (https://github.com/jenkinsci/slave-setup-plugin/blob/ba1a93e0d1a4a150c1cd1cda87a29930a2d60773/src/main/java/org/jenkinsci/plugins/slave_setup/SetupSlaveLauncher.java#L23)
odklizec, could you please provide the configuration of your slave/agent? Just to confirm the theory
Hi,
Thanks for the feedback. Here is the system log...
Jenkins2.27_SystemLog.txt
The slave is started (on VM start) via batch command like this:
call "C:\Program Files (x86)\Java\jre1.8.0_111\bin\java.exe" -Xrs -jar "slave.jar" -jnlpUrl http://cpjen01:8090/computer/CPRAN03/slave-agent.jnlp -secret secretkey
If you need Slave configuration from Jenkins GUI, here it is...
Thanks!
BTW, I'm not using Slave Setup Plugin in my Jenkins configuration. Just Windows Slaves Plugin.
Well, it's even more funny.
vSphereLauncher inherits ComputerLauncher directly: https://github.com/jenkinsci/vsphere-cloud-plugin/blob/dd3f70f41b06ac2ab3e02996f924a73f15850911/src/main/java/org/jenkinsci/plugins/vSphereCloudLauncher.java#L31 , hence it is not a JnlpLauncher itself. But it includes JNLPLauncher as a secondary connection, about which the core code does not care.
My proposal is to weaken the JnlpLauncher instance check and just log at the FINE level instead of returning an error. stephenconnolly, WDYT?
Core should walk the DelegatingComputerLauncher instances to see if one of them is a JNLP one. The vSphereCloudLauncher should extend DelegatingComputerLauncher.
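For readers following along, the "walk the delegates" idea can be sketched roughly as below. This is a minimal illustration only: Launcher, JnlpLauncher, DelegatingLauncher and isJnlp are hypothetical stand-ins invented for this sketch, not the real Jenkins classes, and the actual change in core may be structured differently.

```java
// Stand-in types for illustration only; these are NOT the real Jenkins classes.
class LauncherWalkSketch {
    /** Hypothetical base type, standing in for ComputerLauncher. */
    static class Launcher {}

    /** Hypothetical JNLP launcher, standing in for JNLPLauncher. */
    static class JnlpLauncher extends Launcher {}

    /** Hypothetical wrapper, standing in for DelegatingComputerLauncher. */
    static class DelegatingLauncher extends Launcher {
        private final Launcher delegate;

        DelegatingLauncher(Launcher delegate) {
            this.delegate = delegate;
        }

        Launcher getDelegate() {
            return delegate;
        }
    }

    /** Unwraps delegating launchers until a non-delegating launcher is reached, then checks its type. */
    static boolean isJnlp(Launcher launcher) {
        Launcher current = launcher;
        while (current instanceof DelegatingLauncher) {
            current = ((DelegatingLauncher) current).getDelegate();
        }
        // A null delegate falls through here and is reported as "not JNLP".
        return current instanceof JnlpLauncher;
    }

    public static void main(String[] args) {
        Launcher direct = new JnlpLauncher();
        Launcher wrapped = new DelegatingLauncher(new DelegatingLauncher(new JnlpLauncher()));
        Launcher other = new DelegatingLauncher(new Launcher());
        System.out.println("direct JNLP launcher:  " + isJnlp(direct));
        System.out.println("wrapped JNLP launcher: " + isJnlp(wrapped));
        System.out.println("wrapped non-JNLP:      " + isJnlp(other));
    }
}
```

A plain instance check on the outermost launcher would reject the wrapped case, which matches the "is not a JNLP agent" failure described in this issue.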
Code changed in jenkins
User: Stephen Connolly
Path:
src/main/java/org/jenkinsci/plugins/vSphereCloudLauncher.java
src/main/java/org/jenkinsci/plugins/vSphereCloudSlave.java
http://jenkins-ci.org/commit/vsphere-cloud-plugin/5b4ff26a787e3d2a68963dfa7f5f254733488061
Log:
JENKINS-39232 Make vSphereCloudLauncher inherit from DelegatingComputerLauncher
Code changed in jenkins
User: Jason Swager
Path:
src/main/java/org/jenkinsci/plugins/vSphereCloudLauncher.java
src/main/java/org/jenkinsci/plugins/vSphereCloudSlave.java
http://jenkins-ci.org/commit/vsphere-cloud-plugin/5ed468acb831679ab0e07f16eef77972c7fc6c65
Log:
Merge pull request #59 from stephenc/jenkins-39232
JENKINS-39232 Make vSphereCloudLauncher inherit from DelegatingComputerLauncher
Compare: https://github.com/jenkinsci/vsphere-cloud-plugin/compare/dd3f70f41b06...5ed468acb831
Regarding the fix to the core, I would expect it to be released by Sunday. We commonly have weekly releases on Sundays. Kohsuke is pretty busy this week, so I'm not sure we can get an out-of-order release
Thanks for your effort guys! Do I understand correctly that the fix should make master/slave communication work, even without an immediate update of the affected plugins? Thanks!
For most plugins, yes... and for those plugins that are still not working, you can turn on a system property to disable the strict checking until those plugins are fixed... that is if #2602 gets chosen for merge
Code changed in jenkins
User: Stephen Connolly
Path:
core/src/main/java/hudson/slaves/DelegatingComputerLauncher.java
core/src/main/java/jenkins/slaves/DefaultJnlpSlaveReceiver.java
http://jenkins-ci.org/commit/jenkins/1d16ce024c9049c1f05b45b48d5f8438e9303f4e
Log:
JENKINS-39232 Walk the DelegatingComputerLauncher instances when checking if JNLPComputerLauncher (#2602)
The fix has been merged towards 2.28
With this fix you may still have to set the jenkins.slaves.DefaultJnlpSlaveReceiver.disableStrictVerification system property to true to get your instance working, depending on which plugins you use. A fix with less aggressive behavior is currently under consideration.
Hi guys,
It seems someone forgot to upload 2.28 war file? I just tried to download 2.28 from the main Jenkins page (where it says 2.28 is available) but the downloaded file seems to be 2.27 instead! And if I try to download/install 2.28 from Jenkins Configuration page, it throws an error "_Failed to download from http://updates.jenkins-ci.org/download/war/2.28/jenkins.war (redirected to: http://mirrors.jenkins-ci.org/war/2.28/jenkins.war)_".
Thank you in advance for fixing this!
Confirmed the issue. Likely it's a broken mirror or something weird
Created INFRA-962
I've tested this with Jenkins 2.28 but I still see the same error:
JNLP agent connected from /192.168.12.3
<===[JENKINS REMOTING CAPACITY]===>Slave.jar version: 3.0
This is a Windows agent
java.net.MalformedURLException: no protocol: jnlpJars/slave.jar
I am running the master with -Djenkins.slaves.DefaultJnlpSlaveReceiver.disableStrictVerification=true
Commenting here because JENKINS-39246 was marked as a duplicate.
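As a side note on the exception text: "no protocol" is the message java.net.URL produces when a relative path is parsed without a base URL. The sketch below only reproduces that message in isolation; it does not claim to show where Jenkins builds this URL. The class name, helper names and the jenkins.example host are made up for illustration.

```java
import java.net.MalformedURLException;
import java.net.URL;

class RelativeUrlSketch {
    /** Parses spec on its own; returns the exception message, or null if parsing succeeded. */
    static String parseError(String spec) {
        try {
            new URL(spec); // a relative path has no scheme, so this is expected to fail
            return null;
        } catch (MalformedURLException e) {
            return e.getMessage();
        }
    }

    /** Resolves a relative spec against a base URL and returns the absolute form. */
    static String resolve(String base, String spec) {
        try {
            return new URL(new URL(base), spec).toString();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical host name, used only for this example.
        System.out.println(parseError("jnlpJars/slave.jar"));
        System.out.println(resolve("http://jenkins.example:8080/", "jnlpJars/slave.jar"));
    }
}
```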
sonneveldsmartward Please provide info about your agent configuration and system logs from the master
I am still seeing the exact same behavior after upgrading to 2.28 on the master.
the launch button kicks off the command, and then terminates.
java.io.IOException: Connection aborted: org.jenkinsci.remoting.nio.NioChannelHub$MonoNioTransport@6698473d[name=Channel to /
at org.jenkinsci.remoting.nio.NioChannelHub$NioTransport.abort(NioChannelHub.java:208)
at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:629)
at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:197)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
at org.jenkinsci.remoting.nio.FifoBuffer$Pointer.receive(FifoBuffer.java:137)
at org.jenkinsci.remoting.nio.FifoBuffer.receive(FifoBuffer.java:310)
at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:561)
... 6 more
in java console window:
Java Web Start 11.111.2.14 x86
Using JRE version 1.8.0_111-b14 Java HotSpot(TM) Client VM
----------------------------------------------------
Insecure property: (hudson.showWindowsServiceInstallLink, true) specified in unsigned jnlp file will not be set.
Nov 01, 2016 2:10:20 PM hudson.remoting.jnlp.Main createEngine
INFO: Setting up slave: WIN-16I6UR90SEE
Nov 01, 2016 2:10:20 PM com.youdevise.hudson.slavestatus.SlaveListener call
INFO: Slave-status listener starting
Nov 01, 2016 2:10:20 PM com.youdevise.hudson.slavestatus.SocketHTTPListener waitForConnection
INFO: Slave-status listener ready on port 3141
Nov 01, 2016 2:10:31 PM com.youdevise.hudson.slavestatus.SlaveListener call
INFO: Slave-status listener starting
Nov 01, 2016 2:10:31 PM com.youdevise.hudson.slavestatus.SlaveListener$1 run
SEVERE: Could not listen on port
java.net.BindException: Address already in use: JVM_Bind
at java.net.DualStackPlainSocketImpl.bind0(Native Method)
at java.net.DualStackPlainSocketImpl.socketBind(Unknown Source)
at java.net.AbstractPlainSocketImpl.bind(Unknown Source)
at java.net.PlainSocketImpl.bind(Unknown Source)
at java.net.ServerSocket.bind(Unknown Source)
at java.net.ServerSocket.
at java.net.ServerSocket.
at com.youdevise.hudson.slavestatus.SocketHTTPListener.waitForConnection(SlaveListener.java:129)
at com.youdevise.hudson.slavestatus.SlaveListener$1.run(SlaveListener.java:63)
at com.youdevise.hudson.slavestatus.Daemon.go(Daemon.java:16)
at com.youdevise.hudson.slavestatus.SlaveListener.call(SlaveListener.java:83)
at hudson.remoting.UserRequest.perform(UserRequest.java:153)
at hudson.remoting.UserRequest.perform(UserRequest.java:50)
at hudson.remoting.Request$2.run(Request.java:332)
at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:68)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at hudson.remoting.Engine$1$1.run(Engine.java:94)
at java.lang.Thread.run(Unknown Source)
in order to set that system property, I would be starting the slave via command line, right? But that works now without this property setting.
java -jar slave.jar -jnlpUrl http://SERVER/computer/WIN-16I6UR90SEE/slave-agent.jnlp
works fine. I would prefer to use the web launch agent from browser, though.
> in order to set that system property, I would be starting the slave via command line, right?
The property must be set on Master, not on Jenkins agents
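To illustrate why the flag has to go on the master: a -D system property is only visible inside the JVM that was started with it, so passing it to the agent's java command has no effect on the check performed by the master. A minimal sketch, assuming the flag is read as a boolean system property via Boolean.getBoolean (the thread does not show the actual lookup in core, and the class and method names here are made up):

```java
class StrictVerificationFlagSketch {
    // Property name as quoted in this issue.
    static final String PROPERTY =
            "jenkins.slaves.DefaultJnlpSlaveReceiver.disableStrictVerification";

    /** Returns true only if the property is set to "true" (case-insensitive) in THIS JVM. */
    static boolean strictVerificationDisabled() {
        return Boolean.getBoolean(PROPERTY);
    }

    public static void main(String[] args) {
        // Reflects whatever -D flags this particular JVM was launched with.
        System.out.println("as launched: " + strictVerificationDisabled());

        // Setting the property here affects this process only, never another JVM.
        System.setProperty(PROPERTY, "true");
        System.out.println("after setProperty: " + strictVerificationDisabled());
    }
}
```

Running this with java -Djenkins.slaves.DefaultJnlpSlaveReceiver.disableStrictVerification=true StrictVerificationFlagSketch makes the first line report true as well.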
Okay, so I set this on the master, but the slave still won't start.
here's the process: jenkins 5044 1 16 11:02 ? 00:00:47 /etc/alternatives/java -Dcom.sun.akuma.Daemon=daemonized -Djenkins.slaves.DefaultJnlpSlaveReceiver.disableStrictVerification=true -DJENKINS_HOME=/efs/Prod-Jenkins1-Test/jenkins -jar /usr/lib/jenkins/jenkins.war --logfile=/var/log/jenkins/jenkins.log --webroot=/var/cache/jenkins/war --daemon --httpPort=8080 --debug=5 --handlerCountMax=100 --handlerCountMaxIdle=20 --accessLoggerClassName=winstone.accesslog.SimpleAccessLogger --simpleAccessLogger.format=combined --simpleAccessLogger.file=/var/log/jenkins/access_log
Insecure property: (hudson.showWindowsServiceInstallLink, true) specified in unsigned jnlp file will not be set.
Nov 01, 2016 3:05:54 PM hudson.remoting.jnlp.Main createEngine
INFO: Setting up slave: WIN-16I6UR90SEE
Nov 01, 2016 3:05:55 PM com.youdevise.hudson.slavestatus.SlaveListener call
INFO: Slave-status listener starting
Nov 01, 2016 3:05:55 PM com.youdevise.hudson.slavestatus.SocketHTTPListener waitForConnection
INFO: Slave-status listener ready on port 3141
Nov 01, 2016 3:06:05 PM com.youdevise.hudson.slavestatus.SlaveListener call
INFO: Slave-status listener starting
Nov 01, 2016 3:06:05 PM com.youdevise.hudson.slavestatus.SlaveListener$1 run
SEVERE: Could not listen on port
java.net.BindException: Address already in use: JVM_Bind
at java.net.DualStackPlainSocketImpl.bind0(Native Method)
at java.net.DualStackPlainSocketImpl.socketBind(Unknown Source)
at java.net.AbstractPlainSocketImpl.bind(Unknown Source)
at java.net.PlainSocketImpl.bind(Unknown Source)
at java.net.ServerSocket.bind(Unknown Source)
at java.net.ServerSocket.
at java.net.ServerSocket.
at com.youdevise.hudson.slavestatus.SocketHTTPListener.waitForConnection(SlaveListener.java:129)
at com.youdevise.hudson.slavestatus.SlaveListener$1.run(SlaveListener.java:63)
at com.youdevise.hudson.slavestatus.Daemon.go(Daemon.java:16)
at com.youdevise.hudson.slavestatus.SlaveListener.call(SlaveListener.java:83)
at hudson.remoting.UserRequest.perform(UserRequest.java:153)
at hudson.remoting.UserRequest.perform(UserRequest.java:50)
at hudson.remoting.Request$2.run(Request.java:332)
at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:68)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at hudson.remoting.Engine$1$1.run(Engine.java:94)
at java.lang.Thread.run(Unknown Source)
"Address already in use" likely means that a conflicting process is already listening on that port
That happens because the launcher is already running, but the real error happens first: "Insecure property: (hudson.showWindowsServiceInstallLink, true) specified in unsigned jnlp file will not be set."
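For reference, the BindException in the log is what Java reports when a second listener tries to bind a port that already has one. A small self-contained sketch of that behaviour, using an arbitrary free port rather than 3141 and a made-up class name (it says nothing about which process actually holds the port on the affected machine):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.BindException;
import java.net.ServerSocket;

class PortConflictSketch {
    /** Returns true if a second listener on an already-bound port fails with BindException. */
    static boolean secondBindFails() {
        try (ServerSocket first = new ServerSocket(0)) { // port 0 asks the OS for any free port
            int port = first.getLocalPort();
            try (ServerSocket second = new ServerSocket(port)) {
                return false; // unexpected: both listeners were allowed to bind
            } catch (BindException e) {
                return true; // the port is still held by the first listener
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("second bind failed: " + secondBindFails());
    }
}
```

In the log above, "Slave-status listener starting" appears twice about ten seconds apart, which would be consistent with the second start hitting the port opened by the first.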
Not sure if this was clear from my other comments:
this command line works: java -jar slave.jar -jnlpUrl http://10.240.2.76:8080/computer/WIN-16I6UR90SEE/slave-agent.jnlp
I have a copy of slave-agent.jnlp in my c:\Jenkins folder.
this command line does NOT work: javaws http://10.240.2.76:8080/computer/WIN-16I6UR90SEE/slave-agent.jnlp
I can confirm that changing our slave startup script from
javaws http://jenkins.internal:8080/computer/win-app01.bld/slave-agent.jnlp
to
java -jar slave.jar -jnlpUrl http://jenkins.internal:8080/computer/win-app01.bld/slave-agent.jnlp
seems to allow our slaves to connect now.
I'll check the certificate settings ASAP. Most likely the signing certificate is expired. If yes, I will create another issue for it
any updates on this? we are not rolling out 2.X until we get this working. thanks
The out-of-order release discussion is here: https://groups.google.com/forum/#!topic/jenkinsci-dev/53zsNTktxYg
oleg_nenashev: Will the bugfix be in the next regular version, 2.29 (maybe next week)? Neither the changelog for release 2.28 nor the list of upcoming changes says anything about this bug or its fix.
afischer There is a changelog in 2.28: "implementing other proxying and filtering Launcher implementations. Particular plugins may require setting up the jenkins.slaves.DefaultJnlpSlaveReceiver.disableStrictVerification system property in the master JVM to allow connecting agents."
Are you looking for this fix? Or are you asking about the certificate issue?
You are right, I overlooked the detailed comments about the bug. So I will try the new version 2.28.
I still have the same issue that I reported in https://issues.jenkins-ci.org/browse/JENKINS-39252.
I click on Launch for the Windows slave, and it starts, then terminates.
the java console shows
You mentioned that this is a problem with the certificate, but I don't see any fix for this. I have upgraded my master to the latest Jenkins, 2.30
So there is an "Insecure property: (hudson.showWindowsServiceInstallLink, true) specified in unsigned jnlp file will not be set". The JNLP file is expected to be signed, so there is likely another issue. Please report it in a separate ticket and assign it to me
The issue for JNLP certificates has been created: JENKINS-39596. Closing this issue
I have my own class that I extend for my SSH and JNLP launcher implementations. With this change, core isn't backward compatible within the 2.x line. If you want to make a breaking change, please schedule it for Jenkins 3.0.
https://github.com/jenkinsci/jenkins/pull/2601 was stuck for months without feedback. Closed it
Code changed in jenkins
User: Nicolas De Loof
Path:
docker-plugin/src/main/java/io/jenkins/docker/connector/DockerComputerJNLPConnector.java
http://jenkins-ci.org/commit/docker-plugin/b574ba2a30dfcdd0f3cecbba4065d456ef4e3fa2
Log:
with JENKINS-39232 we need to referene a JNLPLauncher (#543)
Hi,
I'm unable to start/connect slave machine, using JNLP, after installing version 2.27 on master + new 3.0 slave.jar on slave computer. Installed Java 1.8.111.
I've enabled Java Web Start agent protocols 3 and 4 in Configure Global Security, but to no avail. All I'm getting is an error "Local headers refused by remote: CPRAN03 is not a JNLP agent" followed by a number of exceptions, and even versions 3 and 2 of the JNLP agent fail to start. Is there something else I need to configure in the latest Jenkins 2.27?
Also, when I try to run the slave-agent.jnlp (via http://cpjen01:8090/computer/CPRAN03/slave-agent.jnlp) it fails with error as shown on attached image:
Originally reported by odklizec, imported from: unable to start slave after installing version 2.27 on master + new 3.0 slave.jar on slave