thingsboard / thingsboard

Open-source IoT Platform - Device management, data collection, processing and visualization.
https://thingsboard.io
Apache License 2.0

[Question] RPC Call Reply Failure #6308

Open cowherdboy opened 2 years ago

cowherdboy commented 2 years ago

Component

Description

We have about 10 devices connected to ThingsBoard via MQTT. All were working fine until we had to do some work on the network, which was down for about 6 hours. When the network came back up, the devices sent data for a few minutes, and then 9 of the 10 devices stopped sending telemetry at about the same time. Connectivity is OK, as the devices still send RPC requests to ThingsBoard. To get telemetry going again, we need to send an RPC reply with the timestamp to the device. It seems to be failing at this step, with the following error in the log:

{"deviceType":"default","requestId":"1","sessionId":"769566ac-46c4-4a37-a1df-24d4d01d0eb7","serviceId":"INTRAIOT02","deviceName":"G5_RHT_BE88"}
2022-03-23 22:05:47,052 [sql-log-1-thread-1] INFO o.t.s.dao.sql.TbSqlBlockingQueue - Queue-1 [TS] queueSize [0] totalAdded [30] totalSaved [30] totalFailed [0]
2022-03-23 22:05:47,053 [sql-log-1-thread-1] INFO o.t.s.dao.sql.TbSqlBlockingQueue - Queue-2 [TS] queueSize [0] totalAdded [56] totalSaved [56] totalFailed [0]
2022-03-23 22:05:47,130 [sql-log-1-thread-1] INFO o.t.s.dao.sql.TbSqlBlockingQueue - Queue-1 [TS Latest] queueSize [0] totalAdded [30] totalSaved [30] totalFailed [0]
2022-03-23 22:05:47,131 [sql-log-1-thread-1] INFO o.t.s.dao.sql.TbSqlBlockingQueue - Queue-2 [TS Latest] queueSize [0] totalAdded [56] totalSaved [56] totalFailed [0]
2022-03-23 22:05:50,454 [sql-log-1-thread-1] INFO o.t.s.dao.sql.TbSqlBlockingQueue - Queue-0 [Attributes] queueSize [0] totalAdded [11] totalSaved [11] totalFailed [0]
2022-03-23 22:05:50,454 [sql-log-1-thread-1] INFO o.t.s.dao.sql.TbSqlBlockingQueue - Queue-1 [Attributes] queueSize [0] totalAdded [16] totalSaved [24] totalFailed [0]
2022-03-23 22:05:50,454 [sql-log-1-thread-1] INFO o.t.s.dao.sql.TbSqlBlockingQueue - Queue-2 [Attributes] queueSize [0] totalAdded [1] totalSaved [1] totalFailed [0]
2022-03-23 22:05:51,200 [nioEventLoopGroup-4-4] ERROR o.t.s.t.mqtt.MqttTransportHandler - [769566ac-46c4-4a37-a1df-24d4d01d0eb7] Unexpected Exception
java.io.IOException: Connection reset by peer
    at java.base/sun.nio.ch.FileDispatcherImpl.read0(Native Method)
    at java.base/sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
    at java.base/sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:276)
    at java.base/sun.nio.ch.IOUtil.read(IOUtil.java:233)
    at java.base/sun.nio.ch.IOUtil.read(IOUtil.java:223)
    at java.base/sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:356)
    at io.netty.buffer.PooledByteBuf.setBytes(PooledByteBuf.java:253)
    at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1132)
    at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:350)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:151)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:722)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:658)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:584)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:496)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.base/java.lang.Thread.run(Thread.java:829)
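For anyone sifting through a large number of these errors, the log excerpt above can be split into entries to find which sessions hit the exception. This is only a minimal sketch, assuming the log layout seen here (timestamp, [thread], level, logger, then the message); `split_entries` and `error_sessions` are hypothetical helper names, not part of ThingsBoard.

```python
import re

# Zero-width match in front of each entry timestamp, e.g. "2022-03-23 22:05:51,200 [".
ENTRY_START = re.compile(r"(?=\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} \[)")

# Timestamp, thread, level, logger and message of a single entry.
ENTRY = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"\[(?P<thread>[^\]]+)\] (?P<level>[A-Z]+) +(?P<logger>\S+) - (?P<message>.*)",
    re.DOTALL,  # the message may span several lines (stack trace)
)

# Session id as it appears at the start of the transport handler message.
SESSION_ID = re.compile(r"\[([0-9a-f-]{36})\]")


def split_entries(raw):
    """Split a log string into one dict per timestamped entry."""
    entries = []
    for chunk in ENTRY_START.split(raw):
        match = ENTRY.match(chunk.strip())
        if match:  # text before the first timestamp is skipped
            entries.append(match.groupdict())
    return entries


def error_sessions(entries):
    """Return the session ids found at the start of ERROR messages."""
    ids = []
    for entry in entries:
        if entry["level"] != "ERROR":
            continue
        found = SESSION_ID.match(entry["message"])
        if found:
            ids.append(found.group(1))
    return ids
```

Comparing the returned ids with the `sessionId` of the devices that stopped sending telemetry would show whether the resets line up with the affected devices.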

What could be the issue? We have rebooted our nodes and gateway. We also tried restarting ThingsBoard, but no luck. Could the network outage have caused any adverse effects?
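For reference, the request/reply exchange described above can be sketched from the device side. The topics follow the ThingsBoard MQTT device API for client-side RPC (`v1/devices/me/rpc/request/$request_id` and `v1/devices/me/rpc/response/$request_id`). The method name `getTime`, the `time` field in the reply, and the `PendingRequests` helper are assumptions for illustration only; the real names depend on the device firmware and the rule chain.

```python
import json
import time

REQUEST_TOPIC = "v1/devices/me/rpc/request/{request_id}"
RESPONSE_PREFIX = "v1/devices/me/rpc/response/"
RESPONSE_FILTER = RESPONSE_PREFIX + "+"  # topic filter the device subscribes to


def build_time_request(request_id, method="getTime"):
    """Return (topic, payload) for a client-side RPC request.

    "getTime" is a hypothetical method name, not taken from the issue.
    """
    topic = REQUEST_TOPIC.format(request_id=request_id)
    payload = json.dumps({"method": method, "params": {}})
    return topic, payload


def parse_time_response(topic, payload):
    """Return (request_id, timestamp_ms) from an RPC reply.

    Assumes the reply body looks like {"time": <epoch millis>}.
    """
    if not topic.startswith(RESPONSE_PREFIX):
        raise ValueError("not an RPC response topic: " + topic)
    request_id = int(topic[len(RESPONSE_PREFIX):])
    body = json.loads(payload)
    return request_id, int(body["time"])


class PendingRequests:
    """Track outstanding request ids so a missing reply shows up as a timeout."""

    def __init__(self, timeout_s=10.0, clock=time.monotonic):
        self._timeout_s = timeout_s
        self._clock = clock
        self._sent_at = {}

    def sent(self, request_id):
        self._sent_at[request_id] = self._clock()

    def answered(self, request_id):
        self._sent_at.pop(request_id, None)

    def timed_out(self):
        """Return the sorted ids of requests that are older than the timeout."""
        now = self._clock()
        return sorted(
            rid for rid, sent in self._sent_at.items()
            if now - sent > self._timeout_s
        )
```

With any MQTT client, the device would subscribe to `RESPONSE_FILTER`, publish the pair returned by `build_time_request`, and call `answered` when a reply parses. Requests that stay in `timed_out()` point to the reply never reaching the device, which matches the symptom described here.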

Environment

baigod commented 2 years ago

Hi, how can this problem be solved? I have run into the same issue: there are a large number of these errors on the server.