Open manishbansal8843 opened 3 years ago
Pasting the problem here also for clarity.
We have a java 7 code base where we are using Apache commons vfs2 v2.2 which is using JSch-0.1.54 as sftp provider.
Now, the use case is to transfer files to a remote host via sftp. But, every now and then, the file upload process gets stuck. After taking the thread dump of the application, we found both the threads (t1, which sends the data to remote sftp & t2, which receives data from sftp) are in wait state forever. Below is the thread dump snapshot.
JSch session thread:
"Connect thread remote.ftp.com session" daemon prio=10 tid=0x00007f99cc243000 nid=0x144 in Object.wait() [0x00007f9985606000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.io.PipedInputStream.awaitSpace(PipedInputStream.java:273)
at java.io.PipedInputStream.receive(PipedInputStream.java:231)
- locked <0x000000043eda02d8> (a com.jcraft.jsch.Channel$MyPipedInputStream)
at java.io.PipedOutputStream.write(PipedOutputStream.java:149)
at com.jcraft.jsch.IO.put(IO.java:64)
at com.jcraft.jsch.Channel.write(Channel.java:438)
at com.jcraft.jsch.Session.run(Session.java:1459)
at java.lang.Thread.run(Thread.java:748)
Application thread used to upload file data.
"akka.actor.default-dispatcher-19" prio=10 tid=0x00007f99d4012000 nid=0xea in Object.wait() [0x00007f9988785000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at com.jcraft.jsch.Session.write(Session.java:1269)
- locked <0x0000000440a61b48> (a com.jcraft.jsch.ChannelSftp)
at com.jcraft.jsch.ChannelSftp.sendWRITE(ChannelSftp.java:2646)
at com.jcraft.jsch.ChannelSftp.access$100(ChannelSftp.java:36)
at com.jcraft.jsch.ChannelSftp$1.write(ChannelSftp.java:806)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122)
- locked <0x0000000440aab240> (a org.apache.commons.vfs2.provider.sftp.SftpFileObject$SftpOutputStream)
at org.apache.commons.vfs2.util.MonitorOutputStream.write(MonitorOutputStream.java:104)
- locked <0x0000000440aab240> (a org.apache.commons.vfs2.provider.sftp.SftpFileObject$SftpOutputStream)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
- locked <0x0000000440aac278> (a org.apache.commons.vfs2.provider.DefaultFileContent$FileContentOutputStream)
at org.apache.commons.vfs2.util.MonitorOutputStream.write(MonitorOutputStream.java:104)
- locked <0x0000000440aac278> (a org.apache.commons.vfs2.provider.DefaultFileContent$FileContentOutputStream)
at org.apache.commons.vfs2.provider.DefaultFileContent.write(DefaultFileContent.java:741)
at org.apache.commons.vfs2.provider.DefaultFileContent.write(DefaultFileContent.java:720)
at org.apache.commons.vfs2.provider.DefaultFileContent.write(DefaultFileContent.java:691)
at org.apache.commons.vfs2.provider.DefaultFileContent.write(DefaultFileContent.java:707)
at org.apache.commons.vfs2.FileUtil.copyContent(FileUtil.java:78)
at org.apache.commons.vfs2.provider.AbstractFileObject.copyFrom(AbstractFileObject.java:289)
After looking at the codebase of Jsch library, Here is what i feel is happening.
Further, i have checked the linux machine on which this app is running, RecQ of the socket is kept on building up. Which means, socket is still alive and remote server is keep on sending 32KB packets every now and then.
sudo netstat -anpt | grep 19321
tcp6 0 0 10.14.233.97:59594 64.233.167.130:19321 TIME_WAIT -
tcp6 58256 0 10.14.233.97:58214 64.233.167.130:19321 ESTABLISHED 460144/java
tcp6 499888 0 10.14.233.97:58422 64.233.167.130:19321 ESTABLISHED 460144/java
tcp6 0 0 10.14.233.97:59622 64.233.167.130:19321 ESTABLISHED 460144/java
tcp6 0 0 10.14.233.97:59608 64.233.167.130:19321 TIME_WAIT -
tcp6 74672 0 10.14.233.97:56656 64.233.167.130:19321 ESTABLISHED 460144/java
tcp6 92688 0 10.14.233.97:56842 64.233.167.130:19321 ESTABLISHED 460144/java
Problem is described on here on stackoverflow.
We have deployed some fix in our production which seems to be working fine. Will raise PR here so that it can be verified by experts and help others in need. Thanks.