alibaba / canal

Alibaba MySQL binlog incremental subscription & consumption component
Apache License 2.0
28.4k stars · 7.59k forks

Channel frequently closes #640

Closed fenggaopan closed 6 years ago

fenggaopan commented 6 years ago

Hello. After running for a while (a few hours), with 9 instances configured under conf, some instances start logging the error below and binlog file changes are no longer picked up. This happens frequently. In my code the client stays in a continuous data-fetching state; there is no operation that closes the channel.

```
2018-05-16 16:07:37.638 [New I/O server worker #1-7] ERROR c.a.otter.canal.server.netty.handler.SessionHandler - something goes wrong with channel:[id: 0x58f255d8, /192.168.47.232:59420 :> /192.168.47.243:11111], exception=java.nio.channels.ClosedChannelException
    at org.jboss.netty.channel.socket.nio.NioWorker.cleanUpWriteBuffer(NioWorker.java:643)
    at org.jboss.netty.channel.socket.nio.NioWorker.writeFromUserCode(NioWorker.java:370)
    at org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.handleAcceptedSocket(NioServerSocketPipelineSink.java:137)
    at org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.eventSunk(NioServerSocketPipelineSink.java:76)
    at org.jboss.netty.channel.Channels.write(Channels.java:611)
    at org.jboss.netty.channel.Channels.write(Channels.java:578)
    at com.alibaba.otter.canal.server.netty.NettyUtils.write(NettyUtils.java:28)
    at com.alibaba.otter.canal.server.netty.handler.SessionHandler.messageReceived(SessionHandler.java:144)
    at org.jboss.netty.handler.timeout.IdleStateAwareChannelHandler.handleUpstream(IdleStateAwareChannelHandler.java:48)
    at org.jboss.netty.handler.timeout.IdleStateHandler.messageReceived(IdleStateHandler.java:275)
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:302)
    at org.jboss.netty.handler.codec.replay.ReplayingDecoder.unfoldAndFireMessageReceived(ReplayingDecoder.java:525)
    at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:506)
    at org.jboss.netty.handler.codec.replay.ReplayingDecoder.cleanup(ReplayingDecoder.java:541)
    at org.jboss.netty.handler.codec.replay.ReplayingDecoder.channelDisconnected(ReplayingDecoder.java:449)
    at org.jboss.netty.channel.Channels.fireChannelDisconnected(Channels.java:360)
    at org.jboss.netty.channel.socket.nio.NioWorker.close(NioWorker.java:593)
    at org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.handleAcceptedSocket(NioServerSocketPipelineSink.java:119)
    at org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.eventSunk(NioServerSocketPipelineSink.java:76)
    at org.jboss.netty.channel.Channels.close(Channels.java:720)
    at org.jboss.netty.channel.AbstractChannel.close(AbstractChannel.java:200)
    at org.jboss.netty.channel.ChannelFutureListener$1.operationComplete(ChannelFutureListener.java:46)
    at org.jboss.netty.channel.DefaultChannelFuture.notifyListener(DefaultChannelFuture.java:381)
    at org.jboss.netty.channel.DefaultChannelFuture.addListener(DefaultChannelFuture.java:148)
    at com.alibaba.otter.canal.server.netty.NettyUtils.write(NettyUtils.java:30)
    at com.alibaba.otter.canal.server.netty.NettyUtils.error(NettyUtils.java:51)
    at com.alibaba.otter.canal.server.netty.handler.SessionHandler.messageReceived(SessionHandler.java:200)
    at org.jboss.netty.handler.timeout.IdleStateAwareChannelHandler.handleUpstream(IdleStateAwareChannelHandler.java:48)
    at org.jboss.netty.handler.timeout.IdleStateHandler.messageReceived(IdleStateHandler.java:275)
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:302)
    at org.jboss.netty.handler.codec.replay.ReplayingDecoder.unfoldAndFireMessageReceived(ReplayingDecoder.java:525)
    at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:506)
    at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:443)
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:274)
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:261)
    at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:349)
    at org.jboss.netty.channel.socket.nio.NioWorker.processSelectedKeys(NioWorker.java:280)
    at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:200)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)
```

jingshenbusi6530 commented 6 years ago

I'm also seeing this kind of "something goes wrong with channel" error. As long as the client stays connected to the server everything is fine: I have tested the client running for a full day and night with no disconnects. But once the client disconnects from the canal server for longer than some period (I'm not sure exactly how long), the next connection attempt produces the error below.

```
2018-05-15 15:13:52.219 [New I/O server worker #1-2] ERROR c.a.otter.canal.server.netty.handler.SessionHandler - something goes wrong with channel:[id: 0x24e1e7e8, /192.168.2.128:55307 => /192.168.1.38:11111], exception=java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
    at sun.nio.ch.IOUtil.read(IOUtil.java:192)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
    at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:322)
    at org.jboss.netty.channel.socket.nio.NioWorker.processSelectedKeys(NioWorker.java:281)
    at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:201)
    at org.jboss.netty.util.internal.IoWorkerRunnable.run(IoWorkerRunnable.java:46)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
```
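One common client-side workaround for this idle disconnect is to keep the connection busy by polling on a fixed schedule. Below is a minimal sketch using only the JDK; `pollOnce` is a hypothetical stand-in for the client's actual fetch call (in the canal client this would be something like `connector.getWithoutAck(batchSize)`), and the 100 ms period is for demonstration only — in practice it just needs to stay well below the server's idle timeout.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class KeepAlivePoller {
    // Counts completed polls so the example is observable.
    static final AtomicInteger polls = new AtomicInteger();

    // Hypothetical stand-in for the client's fetch call; against a real
    // server this would issue a get/ack round trip on the channel.
    static void pollOnce() {
        polls.incrementAndGet();
    }

    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        // Fire pollOnce every 100 ms, keeping the (simulated) channel active.
        scheduler.scheduleAtFixedRate(KeepAlivePoller::pollOnce, 0, 100, TimeUnit.MILLISECONDS);
        Thread.sleep(550);
        scheduler.shutdown();
        scheduler.awaitTermination(1, TimeUnit.SECONDS);
        System.out.println("polled " + polls.get() + " times");
    }
}
```

Even with such a heartbeat, the client should still be prepared to reconnect, since network failures can close the channel regardless of any idle timer.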

withlin commented 6 years ago

Please state the canal version you are using.

jingshenbusi6530 commented 6 years ago

Thanks for the reply! I'm using 1.0.25. @DevWithLin

withlin commented 6 years ago

Try switching to 1.0.26.

jingshenbusi6530 commented 6 years ago

Could you explain what causes this? @DevWithLin

agapple commented 6 years ago

Version 1.0.25 used NIO for binlog reading, which is not very stable; it is no longer recommended.

nbqyqx commented 6 years ago

Netty has an idle-timeout close mechanism. The client can configure the timeouts when creating the connector; if they are not set, the server by default actively closes the channel after 5 minutes of idleness. The next attempt to fetch data then fails, and the connector has to reconnect. From `com.alibaba.otter.canal.server.netty.handler.ClientAuthenticationHandler`:

```java
IdleStateHandler idleStateHandler = new IdleStateHandler(NettyUtils.hashedWheelTimer,
    readTimeout, writeTimeout, 0, TimeUnit.MILLISECONDS);
ctx.getPipeline().addBefore(SessionHandler.class.getName(),
    IdleStateHandler.class.getName(), idleStateHandler);

IdleStateAwareChannelHandler idleStateAwareChannelHandler = new IdleStateAwareChannelHandler() {

    public void channelIdle(ChannelHandlerContext ctx, IdleStateEvent e) throws Exception {
        logger.warn("channel:{} idle timeout exceeds, close channel to save server resources...",
            ctx.getChannel());
        ctx.getChannel().close();
    }
};
ctx.getPipeline().addBefore(SessionHandler.class.getName(),
    IdleStateAwareChannelHandler.class.getName(), idleStateAwareChannelHandler);
```
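Given that the server closes idle channels, the error after a pause is expected, and the client-side remedy is to reconnect. Below is a minimal reconnect-with-backoff sketch using only the JDK, not canal's own code: `dial` is a hypothetical stand-in for the client's connection setup (in the canal client, roughly `connector.connect()` plus `connector.subscribe(...)`).

```java
import java.util.concurrent.Callable;

public class Reconnector {
    /** Retries `dial` with exponential backoff until it succeeds or attempts run out. */
    static <T> T connectWithBackoff(Callable<T> dial, int maxAttempts, long baseDelayMs)
            throws Exception {
        Exception last = null;
        long delay = baseDelayMs;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return dial.call();
            } catch (Exception e) {
                last = e; // remember the failure, wait, then try again
                Thread.sleep(delay);
                delay *= 2; // exponential backoff between attempts
            }
        }
        throw last; // all attempts exhausted
    }

    public static void main(String[] args) throws Exception {
        // Simulated dial that fails twice (as an idle-closed channel would)
        // before succeeding on the third attempt.
        final int[] calls = {0};
        String conn = connectWithBackoff(() -> {
            if (++calls[0] < 3) throw new java.io.IOException("Connection reset by peer");
            return "connected";
        }, 5, 10);
        System.out.println(conn + " after " + calls[0] + " attempts");
    }
}
```

The backoff keeps a flapping server from being hammered with reconnect attempts; the same loop can wrap the fetch call itself so that a mid-stream close is also recovered.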

mjjian0 commented 6 years ago

@agapple On version 1.0.26, instances restart at random times. One of my canal servers runs about 20 instances; is there a limit on this?

agapple commented 6 years ago

There is no limit on the number of instances. Did you enable scan=true so that changed files are detected?

jingshenbusi6530 commented 6 years ago

"Version 1.0.25 used NIO for binlog reading, which is not very stable; it is no longer recommended." @agapple Then which version do you recommend for stable binlog reading?

agapple commented 6 years ago

1.0.26

kenmanoy commented 6 years ago

1.0.26 also occasionally hits "Connection reset by peer"; at that point binlog changes can still be captured. The client in use is the official example.

agapple commented 6 years ago

fa34eb0e59f257574996cb03da58cd2e8353e94d adjusts the default idle-connection timeout: it was previously only 1 minute, and is now 1 hour.


agapple commented 6 years ago

Most users don't manage timeouts very finely, so the only option is to raise the default and manage them coarsely.

fangchunsheng commented 6 years ago

I'm on version 1.0.25 and the channel also keeps getting closed. Once it is closed, the position stops advancing; but if I delete meta.dat and restart, everything is OK again. If the channel was closed because of a timeout, the eventParser should re-establish the connection to MySQL, and in principle the position should keep advancing.

agapple commented 6 years ago

@fangchunsheng Your position issue may be a different problem. 1.0.25 used Netty NIO mode for the MySQL interaction; version 1.0.26 switched to BIO. Try 1.0.26.

13581999682 commented 2 years ago

Flink reports this error while writing data. How should I fix it, OP?

```
2022-02-16 07:30:38,166 WARN org.apache.hadoop.hdfs.DFSClient [] - DataStreamer Exception
java.nio.channels.ClosedByInterruptException: null
    at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202) ~[?:1.8.0_221]
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:478) ~[?:1.8.0_221]
    at org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:63) ~[hadoop-common-2.6.0-cdh5.13.0.jar:?]
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142) ~[hadoop-common-2.6.0-cdh5.13.0.jar:?]
    at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159) ~[hadoop-common-2.6.0-cdh5.13.0.jar:?]
    at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117) ~[hadoop-common-2.6.0-cdh5.13.0.jar:?]
    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) ~[?:1.8.0_221]
    at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) ~[?:1.8.0_221]
    at java.io.DataOutputStream.flush(DataOutputStream.java:123) ~[?:1.8.0_221]
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:856) [hadoop-hdfs-2.6.0-cdh5.13.0.jar:?]
```

zm52hz commented 10 months ago

@agapple A question: the canal client and server use a long-lived connection, so why would request timeouts occur at all?