steve-community / steve

SteVe - OCPP server implementation in Java
GNU General Public License v3.0
796 stars 389 forks source link

Exceptions and websocket closures on StopTransaction #1141

Open minhtrietdiep opened 1 year ago

minhtrietdiep commented 1 year ago

Checklist

Specifications

SteVe Version     : 3.5.0
Operating system  : ...
JDK               : ...
Database          : MariaDB

Expected Behavior

SteVe responds to StopTransaction normally.

Actual Behavior

SteVe takes some time after receiving a StopTransaction, then proceeds to get a database access error :

org.jooq.exception.DataAccessException: SQL [insert into `stevedb`.`transaction_stop` (`transaction_pk`, `event_timestamp`, `event_actor`, `stop_timestamp`, `stop_value`, `stop_reason`) values (?, ?, ?, ?, ?, ?)]; Cannot add or update a child row: a foreign key constraint fails (`stevedb`.`transaction_stop`, CONSTRAINT `FK_transaction_stop_transaction_pk` FOREIGN KEY (`transaction_pk`) REFERENCES `transaction_start` (`transaction_pk`) ON DELETE CASCADE ON UPDATE NO AC)

Then, somehow seems to accept the tag, sends the message, but doesn't manage to do that and the websocket connection closes.

Steps to Reproduce the Problem

N/A, hard to reproduce from a clean slate.

Additional context

12 chargers had their configuration settings changed, and were set to do a soft reset to apply these changes. Some vehicles were in a charging session. The StopTransaction messages seem to be received by SteVe, but nothing happens. The implementation of the charge points causes this StopTransaction to be re-sent after an apparent no-acknowledge.

This just keeps repeating until SteVe crashes and restarts, with the log being filled with exceptions such as these:

[ERROR] 2023-05-09 17:00:07,600 de.rwth.idsg.steve.ocpp.ws.WebSocketLogger - [chargeBoxId=CHARGER-XXX, sessionId=12f2762b-b39e-5988-7457-1ec64a5d85c0] Transport error
org.eclipse.jetty.io.EofException: null
    at org.eclipse.jetty.io.SocketChannelEndPoint.flush(SocketChannelEndPoint.java:116) ~[jetty-io-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.io.WriteFlusher.flush(WriteFlusher.java:422) ~[jetty-io-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.io.WriteFlusher.write(WriteFlusher.java:275) ~[jetty-io-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.io.WriteFlusher.write(WriteFlusher.java:254) ~[jetty-io-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.io.AbstractEndPoint.write(AbstractEndPoint.java:386) ~[jetty-io-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.websocket.core.internal.FrameFlusher.process(FrameFlusher.java:330) ~[websocket-core-common-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.util.IteratingCallback.processing(IteratingCallback.java:232) ~[jetty-util-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.util.IteratingCallback.iterate(IteratingCallback.java:214) ~[jetty-util-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.websocket.core.internal.WebSocketConnection.enqueueFrame(WebSocketConnection.java:640) ~[websocket-core-common-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.websocket.core.internal.WebSocketCoreSession$OutgoingAdaptor.sendFrame(WebSocketCoreSession.java:724) ~[websocket-core-common-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.websocket.core.internal.ExtensionStack.sendFrame(ExtensionStack.java:258) ~[websocket-core-common-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.websocket.core.internal.WebSocketCoreSession$Flusher.forwardFrame(WebSocketCoreSession.java:815) ~[websocket-core-common-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.websocket.core.internal.FragmentingFlusher.onFrame(FragmentingFlusher.java:50) ~[websocket-core-common-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.websocket.core.internal.TransformingFlusher$Flusher.process(TransformingFlusher.java:163) ~[websocket-core-common-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.util.IteratingCallback.processing(IteratingCallback.java:232) ~[jetty-util-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.util.IteratingCallback.iterate(IteratingCallback.java:214) ~[jetty-util-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.websocket.core.internal.TransformingFlusher.sendFrame(TransformingFlusher.java:77) ~[websocket-core-common-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.websocket.core.internal.WebSocketCoreSession.sendFrame(WebSocketCoreSession.java:526) ~[websocket-core-common-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.websocket.common.JettyWebSocketRemoteEndpoint.sendPong(JettyWebSocketRemoteEndpoint.java:188) ~[websocket-jetty-common-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.websocket.common.JettyWebSocketFrameHandler.onPingFrame(JettyWebSocketFrameHandler.java:373) ~[websocket-jetty-common-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.websocket.common.JettyWebSocketFrameHandler.onFrame(JettyWebSocketFrameHandler.java:234) ~[websocket-jetty-common-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.websocket.core.internal.WebSocketCoreSession$IncomingAdaptor.lambda$onFrame$1(WebSocketCoreSession.java:671) ~[websocket-core-common-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.server.handler.ContextHandler.handle(ContextHandler.java:1463) ~[jetty-server-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.server.handler.ContextHandler.handle(ContextHandler.java:1482) ~[jetty-server-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.websocket.core.server.internal.AbstractHandshaker$1.handle(AbstractHandshaker.java:212) ~[websocket-core-server-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.websocket.core.internal.WebSocketCoreSession$IncomingAdaptor.onFrame(WebSocketCoreSession.java:671) ~[websocket-core-common-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.websocket.core.internal.ExtensionStack.onFrame(ExtensionStack.java:120) ~[websocket-core-common-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.websocket.core.internal.WebSocketCoreSession.onFrame(WebSocketCoreSession.java:481) ~[websocket-core-common-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.websocket.core.internal.WebSocketConnection.onFrame(WebSocketConnection.java:262) ~[websocket-core-common-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.websocket.core.internal.WebSocketConnection.fillAndParse(WebSocketConnection.java:455) ~[websocket-core-common-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.websocket.core.internal.WebSocketConnection.onFillable(WebSocketConnection.java:340) ~[websocket-core-common-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:319) ~[jetty-io-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:100) ~[jetty-io-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.io.SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.java:53) ~[jetty-io-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.runTask(AdaptiveExecutionStrategy.java:412) ~[jetty-util-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.consumeTask(AdaptiveExecutionStrategy.java:381) ~[jetty-util-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(AdaptiveExecutionStrategy.java:268) ~[jetty-util-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.lambda$new$0(AdaptiveExecutionStrategy.java:138) ~[jetty-util-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:407) ~[jetty-util-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:894) ~[jetty-util-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1038) ~[jetty-util-10.0.11.jar:10.0.11]
    at java.lang.Thread.run(Thread.java:829) ~[?:?]
Caused by: java.io.IOException: Broken pipe
    at sun.nio.ch.FileDispatcherImpl.writev0(Native Method) ~[?:?]
    at sun.nio.ch.SocketDispatcher.writev(SocketDispatcher.java:51) ~[?:?]
    at sun.nio.ch.IOUtil.write(IOUtil.java:182) ~[?:?]
    at sun.nio.ch.IOUtil.write(IOUtil.java:130) ~[?:?]
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:493) ~[?:?]
    at java.nio.channels.SocketChannel.write(SocketChannel.java:507) ~[?:?]
    at org.eclipse.jetty.io.SocketChannelEndPoint.flush(SocketChannelEndPoint.java:110) ~[jetty-io-10.0.11.jar:10.0.11]
    ... 41 more
[WARN ] 2023-05-09 17:00:07,600 de.rwth.idsg.steve.ocpp.ws.WebSocketLogger - [chargeBoxId=CHARGER-XXX, sessionId=12f2762b-b39e-5988-7457-1ec64a5d85c0] Connection is closed, status: CloseStatus[code=1006, reason=null]
[WARN ] 2023-05-09 17:00:07,601 org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy - Task run failed
java.lang.IllegalStateException: already released RetainableByteBuffer@e2e9a75{DirectByteBuffer@75e1cfee[p=0,l=0,c=8388608,r=0]={<<<>>>\x89\x90\x11\x1a\xAb\x9eocp...\x00\x00\x00\x00\x00\x00\x00},r=0}
    at org.eclipse.jetty.io.RetainableByteBuffer.lambda$release$2(RetainableByteBuffer.java:109) ~[jetty-io-10.0.11.jar:10.0.11]
    at java.util.concurrent.atomic.AtomicInteger.updateAndGet(AtomicInteger.java:279) ~[?:?]
    at org.eclipse.jetty.io.RetainableByteBuffer.release(RetainableByteBuffer.java:106) ~[jetty-io-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.websocket.core.internal.WebSocketConnection.releaseNetworkBuffer(WebSocketConnection.java:330) ~[websocket-core-common-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.websocket.core.internal.WebSocketConnection.fillAndParse(WebSocketConnection.java:504) ~[websocket-core-common-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.websocket.core.internal.WebSocketConnection.onFillable(WebSocketConnection.java:340) ~[websocket-core-common-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:319) ~[jetty-io-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:100) ~[jetty-io-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.io.SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.java:53) ~[jetty-io-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.runTask(AdaptiveExecutionStrategy.java:412) ~[jetty-util-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.consumeTask(AdaptiveExecutionStrategy.java:381) ~[jetty-util-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(AdaptiveExecutionStrategy.java:268) ~[jetty-util-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.lambda$new$0(AdaptiveExecutionStrategy.java:138) ~[jetty-util-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:407) ~[jetty-util-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:894) ~[jetty-util-10.0.11.jar:10.0.11]
    at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1038) ~[jetty-util-10.0.11.jar:10.0.11]
    at java.lang.Thread.run(Thread.java:829) ~[?:?]

The 12 offending chargers have been removed in SteVe, but the issue still occurs for other chargers that send a StopTransaction, resulting in the same behavior of SteVe getting sluggish and finally seemingly crashing.

Even tasks like GetConfiguration will not successfully execute, if one gets through to their respective pages.

From our IT department, SteVe has already been assigned more CPU and RAM, and more connections to MariaDB, but it wasn't hitting those limits in the first place, and shouldn't be access-limited by the DB software. They also do not know what could be the issue. SteVe of course has been restarted several times.

What could this have been caused by and how do we recover from this, without nuking the database?

minhtrietdiep commented 1 year ago

Additional information:

So we've figured out more: SteVe was slow with Authorization/StartTransaction accepts (in the order of 10 seconds). The moment our multiple chargers come back online, SteVe seems to get overwhelmed, and isn't able to acknowledge the StartTransaction - which causes the chargers to keep retrying.

It does appear SteVe can still send some messages (to get diagnostics), it just can't respond in time to messages from the chargers...

fnkbsi commented 1 year ago

Hi @minhtrietdiep, would you please ask your IT department if there is a slow sql-query time correlated to StartTransaction? thx