priestjim / chef-rundeck

A Chef cookbook for the remote administration tool Rundeck

Set global JVM Socket Timeout for long-running tasks that rundeck can't seem to interrupt. #33

Open noahlz opened 5 years ago

noahlz commented 5 years ago

Our Rundeck is hanging on jobs for days because threads block in socketRead0 (see the thread dump below).

I'm going to add the following JVM property to the Rundeck profile ERB template to try to kill such hanging processes after, say, 12 hours (the timeout will be a cookbook attribute).

sun.net.client.defaultReadTimeout

https://docs.oracle.com/javase/8/docs/technotes/guides/net/properties.html
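
A minimal sketch of how the timeout could be wired into the profile template. The attribute name `socket_read_timeout_hours` and the plain hash standing in for Chef's `node` object are hypothetical, and appending to `RDECK_JVM` is an assumption about how the cookbook's profile template passes JVM options. Note that the Oracle page above describes this property as a default for the protocol handlers used by `java.net.URLConnection`, so whether it reaches the Apache HttpClient calls in the trace below is untested.

```ruby
require 'erb'

# Stand-in for Chef's node object; the attribute name is hypothetical.
node = { 'rundeck' => { 'jvm' => { 'socket_read_timeout_hours' => 12 } } }

# The JVM property takes milliseconds, so convert from hours.
hours = node['rundeck']['jvm']['socket_read_timeout_hours']
timeout_ms = hours * 60 * 60 * 1000

# One line of the profile template (assumed to extend RDECK_JVM).
template = ERB.new(
  'RDECK_JVM="$RDECK_JVM -Dsun.net.client.defaultReadTimeout=<%= timeout_ms %>"'
)
profile_line = template.result(binding)
puts profile_line
```

Keeping the attribute in hours and converting in one place avoids hand-typing a long millisecond value in every environment.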


```
"WinRM output reader for command [2B856BD0-D831-4BB9-A154-184286605B6A]" daemon prio=10 tid=0x00007f5d4477c800 nid=0x1267 runnable [0x00007f5d7996d000]
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:152)
        at java.net.SocketInputStream.read(SocketInputStream.java:122)
        at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:166)
        at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:90)
        at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:281)
        at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:92)
        at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:62)
        at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:254)
        at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:289)
        at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:252)
        at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:191)
        at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:300)
        at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:127)
        at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:712)
        at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:517)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
        at com.xebialabs.overthere.cifs.winrm.WinRmClient.doSendRequest(WinRmClient.java:421)
        at com.xebialabs.overthere.cifs.winrm.WinRmClient.access$100(WinRmClient.java:102)
        at com.xebialabs.overthere.cifs.winrm.WinRmClient$PrivilegedSendMessage.run(WinRmClient.java:393)
        at com.xebialabs.overthere.cifs.winrm.WinRmClient$PrivilegedSendMessage.run(WinRmClient.java:382)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at com.xebialabs.overthere.cifs.winrm.WinRmClient.runPrivileged(WinRmClient.java:368)
        at com.xebialabs.overthere.cifs.winrm.WinRmClient.sendRequest(WinRmClient.java:352)
        at com.xebialabs.overthere.cifs.winrm.WinRmClient.receiveOutput(WinRmClient.java:191)
        at com.xebialabs.overthere.cifs.winrm.CifsWinRmConnection$2.run(CifsWinRmConnection.java:156)
```

noahlz commented 5 years ago

Incidentally, the better approach would be for WinRmClient to support a configuration option for socket timeouts that we could set in Rundeck. I tried adding

        winrm-socket-timeout="43200000"

to the project's etc/resource.xml as a shot in the dark, but it didn't work (obviously).

https://github.com/xebialabs/overthere#cifs_troubleshooting