OpenEVSE / openevse_esp32_firmware

OpenEVSE V4 WiFi gateway using ESP32

OCPP - ConnectorId 1 gets stuck in 'Preparing' - DEV BUILD #864

Open Dicion opened 2 weeks ago

Dicion commented 2 weeks ago

NOTE: This is in the new OCPP-merged dev build with the recent modifications to OCPP from @matth-x. I have no idea if it happens in the production build; I haven't been able to reproduce it on demand yet, but I do have logs and data from when it was stuck in this status.

The dev build I was using was the one built right after my merge #854. I don't think anything significant has changed since then, but if there have been OCPP updates that might fix this, please let me know.

Interesting issue here. Somehow, my ConnectorId 1 got stuck in the 'Preparing' state, which caused the EVSE to show as 'Preparing' and in use in my OCPP backend indefinitely.

Hard power cycles, WiFi and controller resets didn't clear it; plugging into and unplugging from a vehicle didn't clear it; absolutely nothing cleared it UNTIL I reverted to the prod WiFi firmware. That cleared it. Then I went back to the dev build, and it stayed cleared (so far). So something that's supposed to return the status to 'Available' isn't doing so.

Here is the EVSE with OCPP enabled and absolutely nothing connected to it, while in this stuck state:

[screenshots: EVSE status while stuck in 'Preparing']

Here is after I reverted back to the prod firmware:

[screenshot: EVSE status after reverting to prod firmware]

And THEN I went BACK to the dev firmware, and it stayed 'Available':

[screenshot: EVSE status back on dev firmware, showing 'Available']

Things I THINK may have caused this, which I need to try to test/replicate:

My OCPP backend went down in the middle of a charge session (server shutdown). The EVSE may have been in a charging or preparing state when the backend went down; I am not sure which it was at the time. The charging session ended, and I disconnected the vehicle before the OCPP backend was restored, so when it came back up, the vehicle was already disconnected. The EVSE may have been rebooted or power cycled between any of these states. The vehicle was on a scheduled charge timer, so it may have just been connected and preparing, but not actually charging, when the backend went down.

Start OCPP Charging Session -> Preparing or Charging Status -> OCPP Backend Goes down -> Vehicle Disconnected from EVSE -> EVSE resets/reboots/power cycles possible -> OCPP Backend comes back up.

Some sort of state tracking may have fallen apart in the above sequence of events; a rough sketch of how that could happen follows.
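To make that concrete, here is a hypothetical C++ sketch (all names invented, not the actual MicroOcpp API) of how a transaction journaled to flash at session start, whose stop record never gets written because the backend is down, could pin the connector in 'Preparing' across reboots:

```cpp
#include <cstdio>

// Hypothetical on-flash transaction journal entry.
struct TxRecord {
    int  connectorId;
    bool started;
    bool stopped;   // never written if the backend vanished mid-session
};

// Stand-in for reading the journal from flash (assumption).
bool loadTxRecord(TxRecord &rec) {
    rec = {1, true, false};   // stale open transaction from before the outage
    return true;
}

// If the stale journal entry wins over the live controller state, the
// reported status never returns to Available, no matter how often the
// board reboots or the vehicle is unplugged.
const char *deriveOcppStatus(bool vehicleConnected) {
    TxRecord rec;
    if (loadTxRecord(rec) && rec.started && !rec.stopped) {
        return "Preparing";
    }
    return vehicleConnected ? "Preparing" : "Available";
}

int main() {
    printf("reported: %s (vehicle disconnected)\n", deriveOcppStatus(false));
    return 0;
}
```

If something like this is happening, it would explain why power cycles and unplugging changed nothing, while reflashing (which rewrote the flash layout) cleared it.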

The interesting part is that the general EVSE status messages shown in the logs appear to show proper state information. They show no vehicle connected (state 254, vehicle 0, pilot at 24, etc.), so it looks like the OpenEVSE controller is properly tracking the actual state, but the OCPP side is out of sync with it.

Status message example while OCPP was stuck in the 'Preparing' state:

{"evse_connected":1,"amp":0,"voltage":240,"power":0,"pilot":24,"max_current":24,"temp":353,"temp_max":353,"temp1":false,"temp2":353,"temp3":false,"temp4":false,"state":254,"status":"disabled","flags":512,"vehicle":0,"colour":5,"manual_override":0,"freeram":166444,"divertmode":1,"srssi":-37,"time":"2024-06-19T12:34:18Z","local_time":"2024-06-19T07:34:18-0500","offset":"-0500","uptime":101,"session_elapsed":0,"session_energy":0,"total_energy":583.5667184,"total_day":0,"total_week":33.47784064,"total_month":339.1071385,"total_year":552.9589872,"total_switches":127,"elapsed":0,"wattsec":0,"watthour":583566.7184}

Status message once cleared via downgrade:

{"evse_connected":1,"amp":0,"voltage":240,"power":0,"pilot":24,"max_current":24,"temp":349,"temp_max":349,"temp1":false,"temp2":349,"temp3":false,"temp4":false,"state":254,"status":"disabled","flags":512,"vehicle":0,"colour":5,"manual_override":0,"freeram":132680,"divertmode":1,"srssi":-37,"time":"2024-06-19T12:45:44Z","local_time":"2024-06-19T07:45:44-0500","offset":"-0500","uptime":31,"session_elapsed":0,"session_energy":0,"total_energy":0,"total_day":0,"total_week":0,"total_month":0,"total_year":0,"total_switches":2,"elapsed":0,"wattsec":0,"watthour":0}

So it looks like the OCPP system isn't updating its status to match that of the controller; they're stuck disagreeing.
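As an aside, a desync like this seems easy to flag mechanically. A minimal hypothetical cross-check (the field names mirror the status JSON above; everything else is assumed) could look like:

```cpp
#include <cstdio>

enum class OcppStatus { Available, Preparing, Charging, Faulted };

// Snapshot of the controller's live state, as in the /status JSON.
struct EvseSnapshot {
    int  state;    // OpenEVSE controller state, e.g. 254 in the log above
    bool vehicle;  // "vehicle":0 -> no EV attached
};

// Log a warning whenever the OCPP-reported status contradicts the
// controller: no vehicle attached but still 'Preparing'.
void checkConsistency(const EvseSnapshot &evse, OcppStatus reported) {
    if (!evse.vehicle && reported == OcppStatus::Preparing) {
        printf("OCPP desync: vehicle=0, state=%d, but status=Preparing\n",
               evse.state);
    }
}

int main() {
    checkConsistency({254, false}, OcppStatus::Preparing);  // the stuck case
    return 0;
}
```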

Again, I will update here if I can reliably reproduce it.

matth-x commented 2 weeks ago

Hi @Dicion, thank you for the detailed report! I will take a deeper look as soon as I can.

The current dev version upgrades the OCPP library from version 0.3 to 1.1. My first idea is that this infinite Preparing state could be due to storage layout changes: the new 1.1 version may be seeing an unfinished transaction on the flash partition which can no longer be ended. It wouldn't be hard to fix that.

If you're trying to reproduce this, can you erase the flash partition and start with the dev build? Alternatively, you could send a "ClearCache" command to the charger, which also clears the transaction "cache" in this implementation.
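For reference, ClearCache in OCPP 1.6-J is just a CALL frame with an empty payload, sent by the backend over the existing WebSocket. A minimal sketch, with `send` standing in for whatever WebSocket write function the backend exposes (an assumption):

```cpp
#include <cstdio>
#include <functional>
#include <string>

// OCPP-J frame layout: [MessageTypeId, UniqueId, Action, Payload],
// where 2 = CALL. ClearCache.req carries an empty payload object.
void sendClearCache(const std::function<void(const std::string &)> &send) {
    send(R"([2,"100","ClearCache",{}])");
    // Expected reply from the charger (CALLRESULT):
    // [3,"100",{"status":"Accepted"}]
}

int main() {
    sendClearCache([](const std::string &frame) {
        printf("-> %s\n", frame.c_str());   // pretend WebSocket write
    });
    return 0;
}
```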

Dicion commented 2 weeks ago

Just to clarify, when you say:

> If you're trying to reproduce this, can you erase the flash partition and start with the dev build? Alternatively, you could send a "ClearCache" command to the charger, which also clears the transaction "cache" in this implementation.

You mean as a method of clearing the stuck state? If I manage to get it stuck again, try that to clear it? Or do you mean as a fresh start prior to trying to reproduce?

Just to clarify, I am already back on the 1.1 version dev build, and it is no longer stuck. Simply downgrading to the 0.3 version build cleared it; then I went back to the 1.1 dev build, and it stayed cleared. So as of right now, it is working properly. I've already done a few sessions on and off without interruptions and it works fine. Next will be simulating those interruptions and trying to get it stuck again.

matth-x commented 1 week ago

Perfect, no need to clear anything then. If you don't have any stuck transactions anymore, then the planned fix (clearing the transaction storage across version upgrades) will be sufficient.
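For illustration, one shape that fix could take (a sketch with assumed key and file names, not the actual patch) is to stamp the transaction-store layout version in NVS and wipe the store whenever a boot sees a different version:

```cpp
#include <Preferences.h>   // ESP32 Arduino NVS wrapper
#include <LittleFS.h>

static const uint32_t TX_STORE_VERSION = 2;   // hypothetical: bump on layout changes

// Wipe the persisted transaction journal once after a layout change,
// so a record written by the old 0.3 format can't pin the status.
void clearTxStoreOnUpgrade() {
    Preferences prefs;
    prefs.begin("ocpp", false);                 // read-write namespace (assumed name)
    if (prefs.getUInt("txver", 0) != TX_STORE_VERSION) {
        LittleFS.remove("/ocpp/txstore.jsn");   // hypothetical journal file
        prefs.putUInt("txver", TX_STORE_VERSION);
    }
    prefs.end();
}

void setup() {
    LittleFS.begin();
    clearTxStoreOnUpgrade();   // run before the OCPP library initializes
}

void loop() {}
```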

Dicion commented 1 week ago

> Perfect, no need to clear anything then. If you don't have any stuck transactions anymore, then the planned fix (clearing the transaction storage across version upgrades) will be sufficient.

Just to clarify, I was running on the OCPP 1.1 dev build for about two weeks, since around June 5th, before this occurred. I don't think it was the result of the upgrade. It was working perfectly fine on the dev build for dozens of charge sessions, until the one session where the OCPP backend went down mid-session. Only then did it get stuck, and I had no way of unsticking it without downgrading to the prod build.

So again, I don't believe it to be an artifact of the upgrade from prod to dev. I haven't had a chance to try to reproduce it recently due to being quite busy, but I will attempt to do so this week. I have had half a dozen charging sessions on the 1.1 dev build since I reported this issue, and it's been working fine, but again, the OCPP backend didn't go down at all during them.

matth-x commented 3 days ago

Okay, thanks for clarifying; I had hoped the issues with stuck transactions would be gone. A better diagnostics function is already on its way, which will help reconstruct such failures and fix them (I will update this issue once it's ready).