tobato / FastDFS_Client

Java Client for FastDFS
GNU Lesser General Public License v3.0
1.48k stars 516 forks

java.net.SocketException: Connection reset, followed by java.net.SocketException: Broken pipe (Write failed) #166

Open cosoc opened 4 years ago

cosoc commented 4 years ago

At regular intervals, roughly once an hour, this error is reported first:

java.net.SocketException: Connection reset
    at java.net.SocketInputStream.read(SocketInputStream.java:210)
    at java.net.SocketInputStream.read(SocketInputStream.java:141)
    at java.net.SocketInputStream.read(SocketInputStream.java:127)
    at com.github.tobato.fastdfs.domain.conn.DefaultConnection.isValid(DefaultConnection.java:106)
    at com.github.tobato.fastdfs.domain.conn.PooledConnectionFactory.validateObject(PooledConnectionFactory.java:94)
    at com.github.tobato.fastdfs.domain.conn.PooledConnectionFactory.validateObject(PooledConnectionFactory.java:22)
    at org.apache.commons.pool2.impl.GenericKeyedObjectPool.evict(GenericKeyedObjectPool.java:943)
    at org.apache.commons.pool2.impl.BaseGenericObjectPool$Evictor.run(BaseGenericObjectPool.java:1138)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

Then this error follows (断开的管道 is the localized JVM message for "Broken pipe"):

java.net.SocketException: 断开的管道 (Write failed)
    at java.net.SocketOutputStream.socketWrite0(Native Method)
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:111)
    at java.net.SocketOutputStream.write(SocketOutputStream.java:143)
    at com.github.tobato.fastdfs.domain.conn.DefaultConnection.close(DefaultConnection.java:73)
    at com.github.tobato.fastdfs.domain.conn.PooledConnectionFactory.destroyObject(PooledConnectionFactory.java:89)
    at com.github.tobato.fastdfs.domain.conn.PooledConnectionFactory.destroyObject(PooledConnectionFactory.java:22)
    at org.apache.commons.pool2.impl.GenericKeyedObjectPool.destroy(GenericKeyedObjectPool.java:1081)
    at org.apache.commons.pool2.impl.GenericKeyedObjectPool.evict(GenericKeyedObjectPool.java:944)
    at org.apache.commons.pool2.impl.BaseGenericObjectPool$Evictor.run(BaseGenericObjectPool.java:1138)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
tobato commented 4 years ago

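After the error, can the application recover on its own? Does it affect subsequent use of the application?

Networks can fail, so a java.net.SocketException: Connection reset is acceptable as long as the application recovers on its own.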

cosoc commented 4 years ago

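> After the error, can the application recover on its own? Does it affect subsequent use of the application?
>
> Networks can fail, so a java.net.SocketException: Connection reset is acceptable as long as the application recovers on its own.

No, it cannot. Once the client has been used and then sits idle for a long time, this error is reported over and over; it is being reported at regular intervals right now. If I start the application and do nothing, it can sit for a whole day without an error, but as soon as I upload a file or perform any other operation, the errors begin. They are not frequent, roughly once an hour. File upload and the other functions still work normally; I just do not know what is causing this. I am using Spring Boot 2.1.7.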

tobato commented 4 years ago

Which version are you using? Have you turned on logging and watched what happens?

tobato commented 4 years ago

Copy the source of com.github.tobato.fastdfs.domain.conn.ConnectionManager into your own project, keeping the same package path:

    protected Connection getConnection(InetSocketAddress address) {
        Connection conn = null;
        try {
            // Borrow a connection from the pool
            conn = pool.borrowObject(address);
            //dumpPoolInfo(address);   ----> uncomment this line
        } catch (FdfsException e) {
            throw e;
        } catch (Exception e) {
            LOGGER.error("Unable to borrow buffer from pool", e);
            throw new RuntimeException("Unable to borrow buffer from pool", e);
        }
        return conn;
    }

Uncomment that line and observe the state of the connection pool when the failure occurs.

BBCNKing commented 4 years ago

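Hello, I have found the same problem. It is caused by the client actively resetting the connection when the commons-pool evict mechanism calls the isValid method in the jar to check the connection.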

tobato commented 4 years ago

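@BBCNKing I suggest using the code above to observe the state of the connection pool.

The client actively releases invalid connections. When

at com.github.tobato.fastdfs.domain.conn.DefaultConnection.isValid(DefaultConnection.java:106)

returns false, the connection should, in theory, be released.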
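Both stack traces in this issue come from the pool's evictor thread: the first while validating an idle connection (validateObject calling isValid, which reads from the socket), the second while destroying it (destroyObject calling close, which writes to the socket). The sketch below models that validate-then-destroy sequence with plain Java only. Every name in it (EvictionSketch, Conn, DeadConn, LiveConn, closeQuietly) is hypothetical and is not the library's code; it only illustrates the idea that a failed check should mean "invalid" and that closing an already dead connection should be best-effort.

```java
import java.io.IOException;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Simplified model of an idle-connection evictor; all names are hypothetical. */
class EvictionSketch {

    /** Stand-in for a pooled connection (not the library's DefaultConnection). */
    interface Conn {
        /** Probes the server; throws if the peer has already dropped the socket. */
        void ping() throws IOException;

        /** Sends a farewell command to the server, then releases the socket. */
        void close() throws IOException;
    }

    /** A connection whose peer has gone away: every I/O attempt fails. */
    static final class DeadConn implements Conn {
        @Override public void ping() throws IOException {
            throw new SocketException("Connection reset");
        }
        @Override public void close() throws IOException {
            throw new SocketException("Broken pipe (Write failed)");
        }
    }

    /** A healthy connection: probing and closing both succeed. */
    static final class LiveConn implements Conn {
        @Override public void ping() { }
        @Override public void close() { }
    }

    /** Validation: an I/O failure means "invalid" and is not rethrown to the caller. */
    static boolean isValid(Conn conn) {
        try {
            conn.ping();
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    /** Best-effort close: a failed farewell on a dead socket is expected and ignored. */
    static void closeQuietly(Conn conn) {
        try {
            conn.close();
        } catch (IOException ignored) {
            // The peer is already gone; there is nothing left to clean up remotely.
        }
    }

    /** One evictor pass: removes invalid idle connections and returns how many were dropped. */
    static int evict(List<Conn> idle) {
        int removed = 0;
        for (Iterator<Conn> it = idle.iterator(); it.hasNext();) {
            Conn conn = it.next();
            if (!isValid(conn)) {
                it.remove();
                closeQuietly(conn);
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        List<Conn> idle = new ArrayList<>();
        idle.add(new LiveConn());
        idle.add(new DeadConn());
        idle.add(new LiveConn());
        int removed = evict(idle);
        System.out.println("removed=" + removed + ", remaining=" + idle.size());
        // prints "removed=1, remaining=2"
    }
}
```

In this model the dead connection leaves the pool without any exception reaching the caller, which matches the expectation stated above that an invalid connection is simply released.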

wangwangxf commented 4 years ago

I have run into this problem too. Is there a solution?

wangwangxf commented 4 years ago

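> Hello, I have found the same problem. It is caused by the client actively resetting the connection when the commons-pool evict mechanism calls the isValid method in the jar to check the connection.

Did you manage to solve it? How did you solve it?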

cosoc commented 4 years ago

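Could the author please fix this problem? The error keeps flooding the logs, which is really annoying. Rebuilding from the code on your master branch produces the same error. Could you resolve it? I uncommented the code as you suggested, but no additional log information appeared.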

tobato commented 4 years ago

Did dumpPoolInfo not print any logs?

daniel1519 commented 4 years ago

I have run into the same problem. Has the author fixed it? How was it fixed?

cosoc commented 4 years ago

@daniel1519 Switch to the 1.26.7 package. The newer versions are unstable and fail for no apparent reason; 1.26.7 is more stable.

xlxomg commented 4 years ago

I have run into this problem too. I plan to switch to 1.26.7 and try it.

xlxomg commented 4 years ago

It still happens with 1.26.7, but the frequency seems to be somewhat lower.

happyBluebirds commented 4 years ago

I hit the same problem on 1.27.2.

xlxomg commented 4 years ago

[Image] More than 1,000 entries a day...

cosoc commented 4 years ago

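The error messages themselves are a minor issue. The real problem is that once the errors start, memory can no longer be reclaimed, and after a while the server runs out of memory and goes down. I have already started migrating away from it.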

xlxomg commented 4 years ago

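    # ===================== fastdfs proxy settings ======================
    fdfs:
      # fastdfs connection pool configuration
      pool:
        # Maximum number of connections per key
        max-total-per-key: 100
        # Maximum number of objects borrowed from the pool (-1 means unlimited)
        max-total: -1
        # Maximum time to wait for a connection, in milliseconds (default is 5 seconds)
        max-wait-millis: 5000
        # Maximum number of idle connections per key
        max-idle-per-key: 15
        # Minimum number of idle connections per key
        min-idle-per-key: 10
        # Validate connections while they are idle (default false)
        test-while-idle: true
        # Number of idle connections the eviction thread checks per run, default 3 (-1 means check all of them on every run)
        num_tests_pereviction_run: -1
        # Minimum idle time after which an idle connection may be removed; a negative value (-1) means never remove. 60 * 5 * 1000 = 300000
        min_evictable_idle_time_millis: 300000
        # Interval between eviction scans, in milliseconds: run the background cleanup every 30 seconds
        time_between_eviction_runs_millis: 30000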

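After changing the connection pool configuration, things got much better. Over more than 10 days of observation, only one day showed a few errors.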
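As a rough check on how the 30-second scan interval and the 300-second minimum idle time in this configuration interact, the sketch below estimates when an idle connection would be evicted. It is a back-of-the-envelope model, not the pool's real scheduler: it assumes scans fire at exact multiples of the interval and that a connection is evicted only once it has been idle for strictly longer than the minimum. The class and method names are hypothetical.

```java
/** Back-of-the-envelope model of periodic idle eviction; not the pool's real scheduler. */
class EvictionWindowSketch {

    /**
     * Returns the time (ms) of the first scan at which a connection that went idle at
     * idleSince has been idle for longer than minEvictableIdleMillis.
     * Assumption: scans run at scanIntervalMillis, 2 * scanIntervalMillis, and so on.
     */
    static long firstEvictingScan(long idleSince, long minEvictableIdleMillis, long scanIntervalMillis) {
        long eligibleAt = idleSince + minEvictableIdleMillis;
        // Smallest multiple of the interval that is strictly greater than eligibleAt.
        return (eligibleAt / scanIntervalMillis + 1) * scanIntervalMillis;
    }

    public static void main(String[] args) {
        long minIdle = 300_000L;  // min_evictable_idle_time_millis from the configuration
        long interval = 30_000L;  // time_between_eviction_runs_millis from the configuration
        for (long idleSince : new long[] {0L, 1L, 29_999L}) {
            long scan = firstEvictingScan(idleSince, minIdle, interval);
            System.out.println("idle since " + idleSince + " ms: evicted at " + scan
                    + " ms, after " + (scan - idleSince) + " ms idle");
        }
    }
}
```

Under these assumptions a connection is dropped after somewhere between 300 and 330 seconds of idleness, so with this configuration idle connections are recycled within minutes rather than left untouched for long periods.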

zippo-zu commented 4 years ago

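Adding one configuration setting solves this problem: fdfs.pool.test-on-borrow=true. I studied the source code because of this issue, and this finally fixed it completely.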
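For readers unfamiliar with what a test-on-borrow setting does in an object pool, here is a minimal sketch of the idea in plain Java: each idle object is validated at the moment it is handed out, stale ones are discarded, and a fresh object is created if none passes. The class TestOnBorrowSketch and its methods are hypothetical and only illustrate the concept; they are not the library's or Apache Commons Pool's implementation.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Minimal validate-on-borrow pool; a conceptual sketch with hypothetical names. */
class TestOnBorrowSketch<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Supplier<T> factory;
    private final Predicate<T> validator;

    TestOnBorrowSketch(Supplier<T> factory, Predicate<T> validator) {
        this.factory = factory;
        this.validator = validator;
    }

    /** Returns an object to the pool; the most recently returned object is reused first. */
    void giveBack(T obj) {
        idle.push(obj);
    }

    /** Hands out the first idle object that passes validation, or a new one if none does. */
    T borrow() {
        T candidate;
        while ((candidate = idle.poll()) != null) {
            if (validator.test(candidate)) {
                return candidate;
            }
            // Stale object: drop it silently and try the next idle one.
        }
        return factory.get();
    }

    int idleCount() {
        return idle.size();
    }

    public static void main(String[] args) {
        TestOnBorrowSketch<String> pool =
                new TestOnBorrowSketch<>(() -> "new", s -> !s.startsWith("stale"));
        pool.giveBack("stale-1");
        pool.giveBack("fresh");
        pool.giveBack("stale-2");
        System.out.println(pool.borrow()); // prints "fresh" ("stale-2" was discarded first)
        System.out.println(pool.borrow()); // prints "new" ("stale-1" was discarded)
    }
}
```

The trade-off is one extra validation round trip per borrow, in exchange for callers never receiving a connection that the server has already dropped.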

ighack commented 1 year ago

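In my case the problem was caused by the disk on one node being full (the actual space was not full, but the system reported it as full, so files could no longer be written). File uploads did not report an error either, but the returned path was null. Stopping the faulty node fixed it.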