LibreSolar / bms-firmware

Firmware for LibreSolar BMS boards based on bq769x0, bq769x2 or ISL94202
https://libre.solar/bms-firmware/
Apache License 2.0

BMS communication error #67

Open Retro-Fitt opened 5 months ago

Retro-Fitt commented 5 months ago

Hi again,

It has been a long time, but as you might remember, I rolled my own hardware based on your "15s80" and "switch n sense" designs (see https://github.com/LibreSolar/bms-15s80-sc/issues/6). As far as I am aware I have solved all my hardware problems, but I am having issues with the recent BMS firmware. Here are some 3D pictures for playing along at home:

3D

Pack1

Pack2

Pack3

I will open-source all of this work on my GitHub once I have tested the pack enough.

Anyway, first of all I set up a new virtual machine, created a new environment to compile the new firmware, and started fresh. I will use this battery pack in an electric go-kart (for my kid) with a hoverboard motor and ESC. It is in a 9s2p configuration; my battery specs are:

Each cell capacity: 2900 mAh

Battery pack total capacity: 5.8 Ah

Battery pack voltages (max / nominal / min): 37.8 V / 33.12 V / 22.5 V

Chemistry: Lithium nickel manganese cobalt oxide (Li-NMC)
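As a quick sanity check of these figures, here is a minimal sketch of the 9s2p arithmetic. The per-cell voltages are an assumption: they are simply the quoted pack voltages divided by 9, not values from a cell datasheet. Integer millivolts and milliamp-hours are used to avoid floating-point rounding.

```python
SERIES = 9            # cells in series (the "9s" part)
PARALLEL = 2          # cells in parallel (the "2p" part)
CELL_CAPACITY_MAH = 2900

# Assumed per-cell voltages in mV, back-calculated from the pack figures.
CELL_MAX_MV = 4200
CELL_NOMINAL_MV = 3680
CELL_MIN_MV = 2500


def pack_voltage_mv(cell_mv: int, series: int = SERIES) -> int:
    """Series cells add their voltages."""
    return cell_mv * series


def pack_capacity_mah(cell_mah: int = CELL_CAPACITY_MAH,
                      parallel: int = PARALLEL) -> int:
    """Parallel cells add their capacities."""
    return cell_mah * parallel


if __name__ == "__main__":
    for label, mv in (("max", CELL_MAX_MV),
                      ("nominal", CELL_NOMINAL_MV),
                      ("min", CELL_MIN_MV)):
        print(label, pack_voltage_mv(mv) / 1000, "V")
    print("capacity", pack_capacity_mah() / 1000, "Ah")
```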

I cloned the "main" repository and built the firmware with the following changes to fit my hardware design, which is almost identical to the "15s80" hardware design: same MCU, same pinout, etc.

I changed "Max number of cells" to 10 (I believe bms firmware auto detects cell count.)

"Max number of thermistors" to 2,

"Battery nominal capacity" to 10, (I cannot set it to 5.8 Ah for my battery; the firmware doesn't seem to allow that value.)

"Cell type" to NMC/Graphite 3.7V nominal 4.2 max, (I chose this only for testing. My cells support a wider range than this, but I believe it is enough for testing.)

"Default period (s) for live metrics" to 2 seconds,

Then I changed the following:

SS4

board-max-current to 10

shunt-resistor-uohm to 4000

used-cell-channels to 0x3FF (my hardware has a bq76930):

bq76920 (3-5s): 0b0000_0000_0001_1111 = 0x001F

bq76930 (6-10s): 0b0000_0011_1111_1111 = 0x03FF

bq76940 (9-15s): 0b0111_1111_1111_1111 = 0x7FFF
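The three masks quoted for `used-cell-channels` are simply the lowest N bits set, one bit per cell channel. A minimal sketch of that arithmetic follows; the function name `contiguous_cell_mask` is hypothetical and not part of the firmware. Whether a 9s pack on a bq76930 should really set all ten bits depends on which cell inputs are wired on the board, which the thread does not settle.

```python
def contiguous_cell_mask(num_channels: int) -> int:
    """Return a bitmask with the lowest `num_channels` bits set.

    Hypothetical helper for illustration: it only reproduces the
    full-range masks quoted above and does not model skipped inputs.
    """
    if not 1 <= num_channels <= 15:
        raise ValueError("bq769x0 parts have between 1 and 15 cell channels")
    return (1 << num_channels) - 1


if __name__ == "__main__":
    for part, channels in (("bq76920", 5), ("bq76930", 10), ("bq76940", 15)):
        print(f"{part}: 0x{contiguous_cell_mask(channels):04X}")
```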

I also changed some parameters for my EEPROM model.

I also changed the thermistor beta value to 3950.

The problem is that when I compile the firmware, upload it to the board and test it, I get the errors below:

"Loading data from storage failed" and "BMS communication error". After that I cannot see any voltages from the battery, and obviously it doesn't work.

Here is a screenshot of this error: SS2

I believe the MCU cannot talk to the bq chip, but as you can see in the screenshot it reports battery temperatures. How is this possible if it cannot communicate with the chip?

I tested with an older firmware which I compiled earlier for the same hardware:

SS3

This one works OK. But I want to use a more recent version because of, for example, "cell temps", "ic temps", "mosfet temps" etc., and because it compiles and works without modifying the prj.conf file for the "newlibc nano" and "heap" size parameters.

Another question, and you might call me a newbie: I cannot decode the "error flags", "bms state" or "balancingstatus" codes. For example, what does error code 0x00000280 mean, or what does BMS state 3 mean? I am not good at coding, but I have read the source code (bms_common.h, bms_ic.h, bq769x0.c) many times and cannot figure it out. Is there an easy way to decode these codes?

Regards all.

martinjaeger commented 5 months ago

The pack looks really nice! And the firmware should also be almost ready.

  1. EEPROM: Do you actually have one on your board? This looks like it's reading garbage from the bus, as the header and CRC information is always different in each attempt. Also looks like this doesn't work with the older firmware either.

  2. BMS communication failure: Which exact part number do you have? Did you double-check that the I2C address is correct (see below table from the datasheet). The previous firmware auto-detected the I2C address and whether or not to use CRC, whereas the new version only detects the CRC and the address specified in devicetree must be correct.

image

If that's all correct, this may be a bug in the driver. I have tested it on my board after the refactoring, but maybe I missed something. Do you have a logic analyzer to check the bus traffic?

I can't really explain right now why it seems to be able to read some temperature values, but it's not able to read anything else.

Also the state machine shouldn't go into discharging state if it can't read proper voltages. I'll need to have a closer look why this is happening.

  3. Error flags:

They can be found in the bms_common.h: https://github.com/LibreSolar/bms-firmware/blob/fb078cc4b56179d12973b2ab40afc57f59054fa0/include/bms/bms_common.h#L28

The error flags are defined as individual bits. In your example 0x280 the bits 7 and 9 are set, which means discharge and charge overtemp.

I just realized that these defines are not rendered in the docs. Will have a look into that and fix it.

  4. BMS state:

This is defined in bms.h: https://github.com/LibreSolar/bms-firmware/blob/fb078cc4b56179d12973b2ab40afc57f59054fa0/include/bms/bms.h#L32

C enums start at 0 by default, so a BMS state of 3 means BMS_STATE_NORMAL.

  5. Balancing status:

Same as error flags, but the individual bits indicate the balancing status of each cell. As an example, 0x0004 (in binary 0b0000_0000_0000_0100) means that cell 2 (starting to count from cell 0) is currently being balanced.
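The bit-by-bit reading described above for error flags and balancing status can be sketched as a small host-side helper. This is an illustration only: the names `set_bits`, `KNOWN_ERROR_BITS` and `describe_error_flags` are hypothetical, and the table contains just the two bit meanings stated in this thread. The full list lives in bms_common.h.

```python
def set_bits(value: int) -> list[int]:
    """Return the positions of all set bits, counting from bit 0 (LSB)."""
    return [bit for bit in range(value.bit_length()) if value & (1 << bit)]


# Only the two meanings mentioned in this thread; every other bit must be
# looked up in bms_common.h.
KNOWN_ERROR_BITS = {
    7: "discharge overtemp",
    9: "charge overtemp",
}


def describe_error_flags(flags: int) -> list[str]:
    """Translate an error-flags value into readable labels."""
    return [
        KNOWN_ERROR_BITS.get(bit, f"bit {bit} (see bms_common.h)")
        for bit in set_bits(flags)
    ]


if __name__ == "__main__":
    print(set_bits(0x280))              # [7, 9]
    print(describe_error_flags(0x280))  # ['discharge overtemp', 'charge overtemp']
    print(set_bits(0x0004))             # [2] -> cell 2 is balancing
```

The same `set_bits` call works for the balancing status, where each returned position is a cell index counted from 0.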

Retro-Fitt commented 5 months ago

@martinjaeger

Many, many thanks for the reply.

1-EEPROM: Yes, I do have one on my hardware. I believe it is not the same brand and chip as in your hardware design. Here is its datasheet. I modified the devicetree according to this datasheet, but I might be wrong about "size", "pagesize", "address-width" and "timeout". I double-checked my hardware design and it seems OK, so I believe I have a software issue rather than a hardware issue. I don't know if it is related to the EEPROM issue, but the State of Charge value starts from 0 if I turn the BMS off and on again. In a nutshell, I am not sure whether my EEPROM works properly with either the old or the new firmware.

2-BMS communication failure: I have a BQ7693003DBTR, which I believe has I2C address 0x08 and uses CRC. I have a cheap Cypress EZ-USB FX2LP-based logic analyzer with sigrok and a 4-channel Rigol DS1054Z scope with I2C decoding capabilities. (I am not sure the scope can decode the I2C traffic between the MCU and the bq chip, since it is not the exact I2C protocol.) So I can check the bus activity and post it here.

Here is my devicetree and schematic summary: ss9

3-Error flags: I am sorry, but I couldn't understand how you converted 0x280 to bits 7 and 9. Could you please explain in detail?

4-BMS state: Understood very well.

5-Balancing status: I think I understand this. For example, in my previous tests the balancing value was 258. I converted decimal to binary and got 0000000100000010, which indicates that CELL 1 and CELL 8 (counting from cell 0) are balancing, right?

t1

EDIT: I have now tested with address 0x18 to see how it behaves. It gives:

E: Error reading current measurement
E: Failed to read data from BMS IC: -5

0x18

So I assume the 0x08 address is correct and the MCU can communicate with the bq chip, but not in the right way.

Side note: it also behaves like this when I disconnect the cell balance and main terminals and supply the MCU from the ST-LINK for programming. That is understandable, because the bq chip is not powered from the battery and so cannot respond to the MCU.
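When comparing the two addresses on a logic analyzer capture, it can help to know what the first byte on the wire should look like and to check the CRC byte by hand. The sketch below rests on two assumptions that should be verified against the bq769x0 datasheet: the CRC variants use a CRC-8 with polynomial x^8 + x^2 + x + 1 (0x07) and an initial value of 0, and a write CRC covers the address byte, register and data. The helper names are hypothetical and this is not the firmware's driver code.

```python
def crc8(data: bytes, poly: int = 0x07, init: int = 0x00) -> int:
    """Bitwise CRC-8, MSB first, no reflection, no final XOR."""
    crc = init
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def address_byte(addr7: int, read: bool) -> int:
    """First byte on the I2C bus: 7-bit address shifted left, plus the R/W bit."""
    return (addr7 << 1) | int(read)


def write_crc(addr7: int, register: int, value: int) -> int:
    """CRC expected after a single-register write (assumed coverage, see above)."""
    return crc8(bytes([address_byte(addr7, read=False), register, value]))


if __name__ == "__main__":
    print(hex(address_byte(0x08, read=False)))  # 0x10
    print(hex(address_byte(0x08, read=True)))   # 0x11
    print(hex(address_byte(0x18, read=False)))  # 0x30
```

If the capture shows the device acknowledging 0x10/0x11 but the trailing CRC bytes do not match, that would point at the CRC handling rather than at the address.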

martinjaeger commented 5 months ago

1. and 2. OK, the EEPROM and BMS devicetree settings all look good then. So it must be a software issue, indeed. The communication is "normal" I2C and you should be able to analyze it with a logic analyzer.

3. It's the same as for the balancing status. Here is a screenshot from the Linux calculator I usually use:

image

5. Yes, the balancing status is correct.
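For readers without a programmer's calculator, the same conversion can be sketched in a few lines of Python. Each hex digit expands to exactly four bits (2 is 0010, 8 is 1000, 0 is 0000), and bit positions are counted from the right starting at 0. The helper name `grouped_binary` is hypothetical.

```python
def grouped_binary(value: int, width: int = 16) -> str:
    """Render `value` as zero-padded binary in groups of four bits."""
    if width % 4 != 0:
        raise ValueError("width must be a multiple of 4")
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit in the requested width")
    bits = format(value, f"0{width}b")
    return "_".join(bits[i:i + 4] for i in range(0, width, 4))


if __name__ == "__main__":
    # Error flags 0x280: reading from the right, the ones sit at positions 7 and 9.
    print(grouped_binary(0x280))  # 0000_0010_1000_0000
    # Balancing status 258 (decimal): ones at positions 1 and 8.
    print(grouped_binary(258))    # 0000_0001_0000_0010
```

For the 32-bit error flags value as printed by the firmware (0x00000280), pass `width=32`.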