apache / plc4x

PLC4X The Industrial IoT adapter
https://plc4x.apache.org/
Apache License 2.0
1.26k stars 402 forks source link

[Bug]: `CachedPlcConnectionManager` get stuck if the connection with the PLC has a temporary interruption #1204

Open foxpluto opened 11 months ago

foxpluto commented 11 months ago

What happened?

HI,

I am testing the CachedPlcConnectionManager and I found a bug. I am using PLC Siemens S7 1200, I am testing the reliability of the communication between PLC4x and PLC in case of drop of the connection. This is quite important for me because the PLC is not always connected to the server I am using to query data to the PLC. The code I wrote to distribute the CachedPlcConnectionManager is:

    /**
     * Build up a new {@link CachedPlcConnectionManager}
     */
    private CachedPlcConnectionManager getNew() {

        CachedPlcConnectionManager cachedPlcConnectionManagerTemp = CachedPlcConnectionManager.getBuilder()
                                                                                              .withMaxLeaseTime(Duration.ofMillis(700))
                                                                                              .withMaxWaitTime(Duration.ofMillis(1000))
                                                                                              .build();
        log.warn("New Driver Manager Found: {}", cachedPlcConnectionManagerTemp);
        return cachedPlcConnectionManagerTemp;
    }

This method is used to get the connection in various part of my code in this way:

try (var plcConnection = cachedPlcConnectionManager.getConnection(plcConnectionString)) {
.... do stuff
}

Everything is fine since I don't plug the cable for testing the reliability, the errors are correctly trapped and handled but as soon as the connection is back I got this error:

[DEBUG] 22:26:28.089 org.apache.plc4x.java.utils.cache.CachedPlcConnectionManager.getConnection() - Reusing exising connection
[WARN ] 22:26:29.422 org.apache.plc4x.java.s7.readwrite.protocol.S7ProtocolLogic.lambda$0() - Timeout during Connection establishing, closing channel...
[WARN ] 22:26:29.426 io.netty.channel.AbstractChannelHandlerContext.tryFailure() - Failed to mark a promise as failure because it has succeeded already: DefaultChannelPromise@50a28038(success)
 java.lang.IllegalStateException: HashedWheelTimer.stop() cannot be called from TimerTask
        at io.netty.util.HashedWheelTimer.stop(HashedWheelTimer.java:393) ~[netty-common-4.1.99.Final.jar:4.1.99.Final]
        at org.apache.plc4x.java.spi.netty.NettyHashTimerTimeoutManager.stop(NettyHashTimerTimeoutManager.java:63) ~[plc4j-spi-0.11.0.jar:0.11.0]
        at org.apache.plc4x.java.spi.Plc4xNettyWrapper.close(Plc4xNettyWrapper.java:130) ~[plc4j-spi-0.11.0.jar:0.11.0]
        at io.netty.channel.AbstractChannelHandlerContext.invokeClose(AbstractChannelHandlerContext.java:751) ~[netty-transport-4.1.99.Final.jar:4.1.99.Final]
        at io.netty.channel.AbstractChannelHandlerContext.close(AbstractChannelHandlerContext.java:727) ~[netty-transport-4.1.99.Final.jar:4.1.99.Final]
        at io.netty.channel.DefaultChannelPipeline.close(DefaultChannelPipeline.java:994) ~[netty-transport-4.1.99.Final.jar:4.1.99.Final]
        at io.netty.channel.AbstractChannel.close(AbstractChannel.java:280) ~[netty-transport-4.1.99.Final.jar:4.1.99.Final]
        at io.netty.channel.embedded.EmbeddedChannel.close(EmbeddedChannel.java:569) ~[netty-transport-4.1.99.Final.jar:4.1.99.Final]
        at io.netty.channel.embedded.EmbeddedChannel.close(EmbeddedChannel.java:556) ~[netty-transport-4.1.99.Final.jar:4.1.99.Final]
        at org.apache.plc4x.java.s7.readwrite.protocol.S7ProtocolLogic.lambda$0(S7ProtocolLogic.java:156) ~[plc4j-driver-s7-0.11.0.jar:0.11.0]
        at org.apache.plc4x.java.spi.Plc4xNettyWrapper$3.accept(Plc4xNettyWrapper.java:245) ~[plc4j-spi-0.11.0.jar:0.11.0]
        at org.apache.plc4x.java.spi.Plc4xNettyWrapper$3.accept(Plc4xNettyWrapper.java:1) ~[plc4j-spi-0.11.0.jar:0.11.0]
        at org.apache.plc4x.java.spi.netty.NettyHashTimerTimeoutManager.lambda$0(NettyHashTimerTimeoutManager.java:55) ~[plc4j-spi-0.11.0.jar:0.11.0]
        at io.netty.util.HashedWheelTimer$HashedWheelTimeout.run(HashedWheelTimer.java:715) [netty-common-4.1.99.Final.jar:4.1.99.Final]
        at io.netty.util.concurrent.ImmediateExecutor.execute(ImmediateExecutor.java:34) [netty-common-4.1.99.Final.jar:4.1.99.Final]
        at io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:703) [netty-common-4.1.99.Final.jar:4.1.99.Final]
        at io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:790) [netty-common-4.1.99.Final.jar:4.1.99.Final]
        at io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:503) [netty-common-4.1.99.Final.jar:4.1.99.Final]
        at java.lang.Thread.run(Thread.java:833) [?:?]

Since that error all the request made to have a connection from the CachedPlcConnectionManager got stuck, never exit and my entire application is dead waiting forever a connection !!!!

As you can see from the log the problem is inside the driver:

java.lang.IllegalStateException: HashedWheelTimer.stop() cannot be called from TimerTask

That's a very scary error !!!

What happen here?

Thanks, S.

Version

V0.11.0

Programming Languages

Protocols

chrisdutz commented 8 months ago

I'll try and look into this.