Open minhtrietdiep opened 1 year ago
Additional information:
So we've figured out more: SteVe was slow to accept Authorization/StartTransaction requests (on the order of 10 seconds). The moment our chargers come back online together, SteVe seems to get overwhelmed and isn't able to acknowledge the StartTransaction requests, which causes the chargers to keep retrying.
SteVe does appear to still be able to send some messages (to get diagnostics); it just can't respond in time to messages from the chargers.
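To illustrate why slow acknowledgements combined with retries can snowball, here is a minimal back-of-the-envelope sketch. It is a toy model, not SteVe's actual behavior. It assumes requests are handled strictly one at a time (for example because they serialize on the database), that every un-acknowledged charger re-sends on a fixed timer, and that duplicates are not discarded. The function name `simulate_retry_storm` and the 30-second retry timeout are hypothetical. Only the roughly 10-second handling time and the 12 chargers come from this thread.

```python
from collections import deque


def simulate_retry_storm(num_chargers, service_time, retry_timeout, horizon):
    """Toy model: one worker, FIFO queue, chargers re-send until acknowledged.

    All times are whole seconds. Assumptions (not taken from SteVe itself):
    requests are processed strictly one at a time, and duplicates are queued
    and processed like any other message.
    """
    queue = deque()   # pending messages, identified by charger id
    acked = set()     # chargers that have received an acknowledgement
    sent = 0          # total messages sent by all chargers
    current = None    # charger id of the message being processed
    remaining = 0     # seconds of work left on the current message

    for t in range(horizon):
        # Every retry interval, each un-acknowledged charger (re)sends.
        if t % retry_timeout == 0:
            for cid in range(num_chargers):
                if cid not in acked:
                    queue.append(cid)
                    sent += 1

        # An idle server picks up the oldest pending message.
        if current is None and queue:
            current = queue.popleft()
            remaining = service_time

        # Do one second of work on the current message.
        if current is not None:
            remaining -= 1
            if remaining == 0:
                acked.add(current)
                current = None

    return {"acked": len(acked), "sent": sent, "backlog": len(queue)}


if __name__ == "__main__":
    # Fast server: 1 s per message, so no charger ever needs to retry.
    print(simulate_retry_storm(12, 1, 30, 60))
    # Slow server: 10 s per message, as observed in this thread.
    print(simulate_retry_storm(12, 10, 30, 120))
```

Under these assumptions the fast case ends with 12 messages sent and nothing queued. The slow case ends with 30 messages sent for the same 12 transactions and 18 stale duplicates still queued after two minutes, which is another three minutes of work before anything new can be handled. A shorter retry timeout, or more chargers reconnecting at once, makes the backlog grow faster.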
Hi @minhtrietdiep, could you please ask your IT department whether there are slow SQL queries that correlate with StartTransaction? Thanks.
Expected Behavior
SteVe responds to StopTransaction normally.
Actual Behavior
SteVe takes some time after receiving a StopTransaction, then runs into a database access error:
Then it somehow seems to accept the tag and tries to send the response, but fails to do so, and the WebSocket connection closes.
Steps to Reproduce the Problem
N/A, hard to reproduce from a clean slate.
Additional context
12 chargers had their configuration settings changed and were set to do a soft reset to apply these changes. Some vehicles were in a charging session. The StopTransaction messages seem to be received by SteVe, but nothing happens. The charge points' implementation re-sends the StopTransaction when it appears not to have been acknowledged.
This just keeps repeating until SteVe crashes and restarts, with the log being filled with exceptions such as these:
The 12 offending chargers have been removed from SteVe, but the issue still occurs for other chargers that send a StopTransaction, with the same result: SteVe gets sluggish and eventually appears to crash.
Even tasks like GetConfiguration do not execute successfully, assuming one can reach their respective pages at all.
Our IT department has already assigned SteVe more CPU and RAM, and more connections to MariaDB, but it wasn't hitting those limits in the first place, and it shouldn't be access-limited by the DB software. They also do not know what the issue could be. SteVe has of course been restarted several times.
What could have caused this, and how do we recover without nuking the database?