stuartpittaway / diyBMSv4ESP32

diyBMS v4 code for the ESP32 and new controller hardware

Randomly disabling relay when battery is almost full, no rules seem to be affected. #282

Open DrBit opened 3 months ago

DrBit commented 3 months ago

Describe the bug

The controller disables the relay providing power to the inverter even when no rules have been triggered. This usually happens when the battery is at or close to 95% - 100%. No rule is triggered and the relay never recovers from this state, so a reset is needed to make it work again. It happens approximately 3 or 4 times a week. Is there a way to track exactly what triggered the disconnection of the relay? Charge/Discharge configuration is disabled. No CAN bus connection. Inverter is a Victron MultiPlus-II, MQTT disabled, Influx disabled, no current monitor, no storage attached and no logging to SD.

This has been happening for a while, including with older versions of the software; updates didn't help. I'm attaching screenshots of the controller after the error occurred. Approximate time of error: 17:05 (check history). Clue: cell 22 is suddenly the highest, but checking the voltages everything looks normal.

Could it be that, due to high voltages (DC and/or AC), the emergency stop is somehow triggered even though nothing is connected at J1 and it is disabled in the rules?

Hardware/Software Versions

Controller version (from PCB): 4.2 - 25 Feb 2021
(below can be obtained from the "About" page in the controller web interface)
Processor: ESP32
Version: [0f7f03c1c13382ed7031464baa92671fe778afc3]
Compiled: 2023-10-30T10:40:07.498Z

To Reproduce

Steps to reproduce the behavior:

  1. The battery is almost fully charged
  2. The relay randomly disconnects
  3. A reset is needed to restore normal function

Screenshots: battery state after fail; history when error happened; modules state after fail; rules state after fail.

stuartpittaway commented 3 months ago

Ok - I can see 2 modules appear to have "bad packet counts"; this indicates poor communication between the cell modules.

It's possible that the BMS triggers the "internal BMS error" if communication is lost for a period of time. You would need to enable some sort of logging to see that - either MQTT or the SD card.

Those both publish the rule and relay states so you can see what was triggered.
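Once such logs are being collected, spotting what changed and when can be sketched roughly as follows. This is only an illustration: the record layout (a timestamp plus a dictionary of named states) and the names `relay1` and `rule_a` are hypothetical and do not reflect the controller's actual MQTT topics or SD log format.

```python
def find_transitions(samples):
    """Return (timestamp, name, old, new) for every logged value that changes.

    samples: iterable of (timestamp, {name: value}) in chronological order.
    The layout is an assumption made for this sketch.
    """
    previous = {}
    changes = []
    for timestamp, states in samples:
        for name, value in states.items():
            # Only report a change once a previous value is known.
            if name in previous and previous[name] != value:
                changes.append((timestamp, name, previous[name], value))
            previous[name] = value
    return changes


# Hypothetical samples: timestamps in seconds, one relay and one rule.
samples = [
    (0, {"relay1": "on", "rule_a": False}),
    (5, {"relay1": "on", "rule_a": True}),
    (10, {"relay1": "off", "rule_a": True}),
    (15, {"relay1": "off", "rule_a": False}),
]
changes = find_transitions(samples)
for change in changes:
    print(change)
```

Listing rule and relay changes side by side makes it easier to see which rule changed state just before a relay did.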

DrBit commented 3 months ago

A bit of follow up on this issue.

Steps done:

Resoldered the 2 faulty modules; no error packets found since, so for now I would rule out the internal BMS error.
Started logging from MQTT to Node-RED.

Discovered:

No rules triggered (at least none that I could see).
The controller reports the 2 relays as ON (I added the second one just as a test).
Both relays started as ON, but at a certain point they both went off. They are now OFF and the LED is off.
After a reset they both go ON again.

How can this be possible? If there were an internal error the logging would still show it, right? How can I investigate this issue further? Any ideas?

Screenshots: MQTT log; no errors on modules.

stuartpittaway commented 3 months ago

Ok, further logging is available using a USB cable to the ESP32 and logging its serial output to a file.
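Because the fault only shows up a few times a week, it helps if every captured line carries a timestamp. A minimal sketch, assuming the serial output is already available as a text stream (for example piped in from a terminal program); how the USB serial port itself is opened is not shown here, and `log_with_timestamps` is a name made up for this example.

```python
import io
from datetime import datetime


def log_with_timestamps(source, sink, now=datetime.now):
    """Copy non-empty lines from source to sink, prefixing each with a timestamp.

    source: any iterable of text lines (a file, sys.stdin, ...).
    sink:   any object with a write() method.
    Returns the number of lines written.
    """
    count = 0
    for line in source:
        text = line.rstrip("\r\n")
        if not text:
            continue  # skip blank lines
        sink.write(f"{now().isoformat(timespec='seconds')} {text}\n")
        count += 1
    return count


# Demonstration with in-memory streams and a fixed clock.
demo_in = io.StringIO("first line\r\n\r\nsecond line\n")
demo_out = io.StringIO()
written = log_with_timestamps(
    demo_in, demo_out, now=lambda: datetime(2000, 1, 1, 12, 0, 0)
)
print(demo_out.getvalue(), end="")
```

In real use, `sink` would be a file opened in append mode so the capture survives a restart of the logging script.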

DrBit commented 3 months ago

Hi there,

I finally got the chance to capture the serial output when the error occurred; here is a screenshot. It looks like a rule is triggering, but strangely I don't see it in the web interface.

Also, from the log file it seems that the relay should be turned on again right after, but this never happens. Is it a bug or is it supposed to be like this? Can anybody confirm this?

Just as a reminder: when the battery is close to or at 100%, the relay goes off. No triggered rule is visible in the web interface, and MQTT does not show any triggered rule either. Bug? The only thing I can see is what I marked in the serial capture. It is quite difficult to spot if it's a brief one-time event. Maybe logging these events, as in the history window, would help with troubleshooting?

Anyway, I really do not understand how this rule can be triggered. Max voltage per cell is 4.1 and max voltage for the pack is 57.4 (14 cells), so in theory to trigger this rule all cells would have to be perfectly balanced, and even if that were the case both max cell and max pack would trigger at the same time.
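The arithmetic behind that reasoning can be checked with a short sketch. It uses only the numbers quoted above (14 cells, 4.1 V per cell, 57.4 V per pack); the 10 mV imbalance is an arbitrary value chosen for illustration.

```python
CELLS_IN_SERIES = 14
MAX_CELL_V = 4.1   # per-cell limit quoted above
MAX_PACK_V = 57.4  # pack limit quoted above


def pack_voltage(cell_voltages):
    """The voltage of a series string is the sum of its cell voltages."""
    return sum(cell_voltages)


# Every cell exactly at the per-cell limit: the pack sits exactly at its limit.
balanced = [MAX_CELL_V] * CELLS_IN_SERIES
pack_limit_reached = round(pack_voltage(balanced), 3) >= MAX_PACK_V

# One cell 10 mV low (illustrative value): the pack limit is out of reach
# unless some other cell goes above 4.1 V first.
unbalanced = [MAX_CELL_V] * (CELLS_IN_SERIES - 1) + [MAX_CELL_V - 0.010]
shortfall_v = round(MAX_PACK_V - pack_voltage(unbalanced), 3)

print(pack_limit_reached, shortfall_v)
```

With these limits the pack threshold is simply 14 times the cell threshold, so in a real pack with any imbalance the cell limit would be expected to trip first.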

Any ideas? Could a small interference from an external source be messing around with the measurements?

Screenshot: serial capture.

red0909 commented 3 months ago

I never had this problem with the 4.4 and 4.5 controllers, but I don't have relays for the BMS internal error rule.

stuartpittaway commented 3 months ago

Sorry for the slow reply, I've had a few days holiday @DrBit

The tcaXXXX_icr messages are interrupt triggers when those chips receive an external signal (generally the I/O header or the eSTOP) but they could also be caused by static if those pins are not used.

You can see in the serial dump that "bank range" rule gets triggered, and then 4 seconds later is "untriggered".

I think this may be related to an "old" bug where a bank voltage is only partially updated as the controller is waiting for the other half of the bank to return cell voltage data.

The controller talks to the cells 16 at a time - so in your case 2 cells from bank 1 are read before all the others.
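To illustrate the kind of effect described here, the following sketch mixes fresh and stale readings while the pack voltage is rising. It is not the firmware's logic: the 28-cell layout (2 banks of 14), the millivolt values and the max-minus-min spread used as a stand-in for a range check are all assumptions, and the voltage step between polls is exaggerated for clarity.

```python
BATCH_SIZE = 16  # cells read per request, as described above


def spread_mv(readings_mv):
    """Difference between the highest and lowest reading, in millivolts."""
    return max(readings_mv) - min(readings_mv)


def apply_batch(stored_mv, fresh_mv, start, batch_size=BATCH_SIZE):
    """Return a copy of stored_mv with one batch of fresh readings applied."""
    updated = list(stored_mv)
    end = min(start + batch_size, len(stored_mv))
    updated[start:end] = fresh_mv[start:end]
    return updated


# Hypothetical installation: 2 banks of 14 cells, perfectly balanced.
previous_poll = [4050] * 28  # readings stored from the last poll
current_poll = [4090] * 28   # true values now, while charging

# After the first batch, cells 0-15 are fresh: all of the first bank
# plus the first 2 cells of the second bank.
after_first = apply_batch(previous_poll, current_poll, start=0)
second_bank = after_first[14:28]
spread_partial = spread_mv(second_bank)

# After the second batch the stored values are consistent again.
after_second = apply_batch(after_first, current_poll, start=16)
spread_full = spread_mv(after_second[14:28])

print(spread_partial, spread_full)
```

A check evaluated between the two batches would see a spread that does not exist in the real pack, and that spread would vanish once the remaining cells report.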

jetronic18s commented 2 months ago

Hello everyone, I think I was able to observe the same thing at the weekend.

My configuration is an 18s1p battery with a controller board v4.5

Unfortunately I don't have a serial recording and can't debug any further at the moment.

Connections are all good, no errors and no wrong packet count. All parameters were in the green range. Then the relay briefly switched off and on again when the battery was almost full.


DrBit commented 2 months ago

Hi stuart

You can see in the serial dump that "bank range" rule gets triggered, and then 4 seconds later is "untriggered".

The problem I have is that the relay never "untriggers", even though the log says it does.

I think I've overcome the issue by increasing the pack voltage limit (and relying on the cell voltage max), and for good measure I've also increased the max deviation voltage.

The problem now seems to have disappeared; I will keep an eye on it.

The issue remains that a rule is triggered without any indication in the web interface, and that the relay won't come back on.