syssi / esphome-jk-bms

ESPHome component to monitor and control a Jikong Battery Management System (JK-BMS) via UART-TTL or BLE
Apache License 2.0
403 stars 134 forks source link

Component jk_modbus took a long time for an operation.... and cause esp to halt after sometime #493

Open paulsteigel opened 2 months ago

paulsteigel commented 2 months ago

Hi @syssi I am using your jk-bms components for quite sometimes and I discovered the following (may be related, I am not quite sure) My setting was: esp32S2Lolin mini uart on Gpio9 and Gpio10 I used Tja1050 for can bus on gpio16 and gpio17 with a 4.7K resistor connected in series to RX pin on esp32. I used Interval to send can message (every 2 seconds)

I observed the following symptom: Esphome reported series of warning like this

[20:36:08][W][component:237]: Component jk_modbus took a long time for an operation (215 ms).
[20:36:08][W][component:238]: Components should block for at most 30 ms.

Every time this happens, CAN bus stop working, can message is no longer sent out. I tried setting logger to verbose for checking modbus log and found out that command send/receive is okay, the only thing reported was the modbus take long time...etc.

[21:12:21][D][text_sensor:064]: 'nktienvphuc manufacturer': Sending state 'Input UserdaJK_B1A20S15P'
[21:12:21][V][mqtt:474]: Publish(topic='nktienvphuc/sensor/nktienvphuc_manufacturer/state' payload='Input UserdaJK_B1A20S15P' retain=1 qos=0)
[21:12:21][W][component:237]: Component jk_modbus took a long time for an operation (790 ms).
[21:12:21][W][component:238]: Components should block for at most 30 ms.
[21:12:21][D][uart_debug:114]: <<< A0:00:46:A1:00:3C:A2:00:14:A3:00:46:A4:00:46:A5:FF:EC:A6:FF:F6:A7:FF:EC:A8:FF:F6:A9:10:AA:00:00:01:18:AB:01:AC:01:AD:03:FF:AE:01:AF:00:B0:00:0A:B1:14:B2:32:31:31:30:31:39:39:31:00:00:B3:00:B4:49:6E:70:75:74:20:55:73:B5:32:34:30:34:B6:00:00:51:84:B7:31:31:2E:58:41:5F:53:31:31:2E:34:37:5F:5F:5F:B8:00:B9:00:00:01:18:BA:49:6E:70:75:74:20:55:73:65:72:64:61:4A:4B:5F:42:31:41:32:30:53:31:35:50:C0:01:00:00:00:00:68:00:00:58:64
[21:12:21][V][mqtt.idf:119]: Event dispatched from event loop event_id=3
[21:12:21][V][mqtt.idf:138]: MQTT_EVENT_SUBSCRIBED, msg_id=48985
[21:12:21][C][logger:166]: Logger:
[21:12:21][C][logger:167]:   Level: VERBOSE
[21:12:21][C][logger:169]:   Log Baud Rate: 115200

and it took another 1-4 minutes (not fixed) to start can bus messaging.

[21:16:17][D][light:036]: 'Blue LED' Setting:
[21:16:17][D][light:047]:   State: OFF
[21:16:17][V][text_sensor:013]: 'nktienvphuc CANBUS Status': Received new state ON
[21:16:17][D][text_sensor:064]: 'nktienvphuc CANBUS Status': Sending state 'ON'
[21:16:17][V][mqtt:474]: Publish(topic='nktienvphuc/sensor/nktienvphuc_canbus_status/state' payload='ON' retain=1 qos=0)
[21:16:17][I][main:614]: received can id: 0x305 ACK
[21:16:17][D][uart_debug:114]: >>> 4E:57:00:13:00:00:00:00:06:02:00:00:00:00:00:00:68:00:00:01:28
[21:16:17][D][uart_debug:114]: <<< 4E:57:01:21:00:00:00:00:06:00:01:79:30:01:0C:D3:02:0C:D2:03:0C:D2:04:0C:D3:05:0C:D1:06:0C:D3:07:0C:D2:08:0C:D3:09:0C:D2:0A:0C:D3:0B:0C:D2:0C:0C:D3:0D:0C:D3:0E:0C:D3:0F:0C:D3:10:0C:D3:80:00:22:81:00:1E:82:00:1F:83:14:83:84:03:6F:85:48:86:02:87:00:05:89:00:00:06:36:8A:00:10:8B:00:00:8C:00:03:8E:15:60:8F:11:D0:90:0D:5C:91:0D:52:92:00:03:93:0B:22:94:0B:86:95:00:03:96:01:2C:97:00:78:98:01:2C:99:00:41:9A:00:1E:9B:0B:B8:9C:00:03:9D:01:9E:00:64:9F:00:50
[21:16:17][I][jk_bms:071]: Status frame received

The problem does not happen repeatedly, some success and some not (above log). It often apear 5 times then ok then back with 5 times of problem. Can you please be so kind advising me how to deal with that? Are there any routine to check for problem for modbus (such as heartbeat) so when it has problem, it would trigger esp to restart? or changing the interval of requesting for BMS data or something else! Best, Ngoc!

syssi commented 2 months ago

Could you provide your complete YAML? I'm a bit confused because you mentioned CAN bus communication and a crash/reboot which isn't part of the log above.

The warning doesn't cause reboots normally. Could you remove the jk_bms and fake some required measurements? I guess there will be the same drop-outs also if the component isn't in place. In other words: I guess your issue is unrelated to the jk_bms component.

paulsteigel commented 2 months ago

Hi @syssi I am sorry for posting on the wrong area, I tried reposting on JK-BMS. But you are so responsive so, let me post it here for you!. Please be so kind to help with thanks!

external_components_source: github://syssi/esphome-jk-bms@main
  # components
  # github://syssi/esphome-jk-bms@main  

esphome:
  #includes: wifi_config.cpp
  name: ${name}
  on_boot:
    then:
      - switch.turn_on: switch_charging
      - switch.turn_on: switch_discharging
      - switch.turn_on: switch_chg_float

# +--------------------------------------+
# | ESP32 settings                       |
# +--------------------------------------+
# For a stable Bluetooth connection keep the "esp-idf" framework
esp32:
  board: lolin_s2_mini
#  board: esp32doit-devkit-v1
#  framework:
#    type: esp-idf

# for automatic updating new ip on changes
time:
  - platform: sntp
    # ...

external_components:
  - source: ${external_components_source}
    refresh: 0s
  - source: github://pr#5586
    components: [ ota_http ]

logger:
  level: DEBUG

packages:
  extrapkg: !include ./commons/dyndns.yaml

# Please use the native `api` component instead of the `mqtt` section.
# If you use Home Assistant, the native API is more lightweight.
# If there is no HA server connected to this API, the ESP32 reboots every 15 minutes to try to resolve the problem.
# If you don't use Home Assistant please uncomment the "reboot_timeout: 0s" option.
#api:
#  reboot_timeout: 0s

# If you don't want to use ESPHome's native API you can use MQQT instead.
# In this case don't forget to remove the 'api:' section.
mqtt:
  broker: myhome.sfdp.net.com
  username: mqtt_user
  password: abcxya
  client_id: ${name}_mqtt

# In the event of problems with the WiFi network, the ESP32 will reboot every 15 minutes to try to resolve the problem.
# If we don't want to connect the ESP32 to the WiFi network please remove the 4 lines below.

wifi:
  use_address: ${name}.freemyip.com
  ap:
    ssid: "ESP32S2Lolin"
#    password: !secret wifi_ap_password
    ap_timeout: 10min

captive_portal:

ota:
  password: abcxyz
  # Forward this TCP port to do OTA updates remotely
  port: 8266

<<: !include ./commons/ota_http.yaml

#web_server:
#  port: 80
#  log: false
#  ota: false

<<: !include ./commons/globals.yaml

output:
  - platform: gpio
    pin: 15
    id: led
    inverted: true

light:
  - platform: binary
    output: led
    id: blue_led
    name: "Blue LED"
    internal: true

# +--------------------------------------+
# | JK-BMS UART connection               |
# +--------------------------------------+

uart:
  id: uart_0
  baud_rate: 115200
  rx_buffer_size: 384
  tx_pin: ${tx_pin}
  rx_pin: ${rx_pin}
  debug:
    direction: BOTH

jk_modbus:
  id: modbus0
  uart_id: uart_0

jk_bms:
  id: bms0
  jk_modbus_id: modbus0
  update_interval:  2s
  # enable_fake_traffic: true

.....
# +--------------------------------------+
# | CAN bus script                       |
# +--------------------------------------+
canbus:
  - platform: esp32_can
    tx_pin: ${can_tx_pin}
    rx_pin: ${can_rx_pin}
    can_id: 4
    bit_rate: 500kbps
    on_frame:
    - can_id: 0x305 # Inverter ACK - SMA/LG/Pylon/Goodwe reply
      then:
        - light.toggle:
            id: blue_led
        - lambda: |-
            id(can_ack_counter) = 0;                              // Reset ACK counter
            id(can_status) = "ON";                                // Set CANBUS Status to ON
            id(canbus_status).publish_state(id(can_status));      // Publish text sensor
            ESP_LOGI("main", "received can id: 0x305 ACK");

interval:
  # activate the limits check
  #- interval: 60s
  #  then:
  #    - lambda: |-
  #        ESP_LOGI("main", "Trigger limit_check_script...");
  #        if (!id(limit_check_script).is_running()) id(limit_check_script).execute();          

  - interval: 120s
    then:
      - lambda: id(can_ack_counter) = 0;                          // Reset ACK counter for test inverter ACK

  - interval: 100ms
    then:
......

When i changed the update_interval to 2s, I saw the problem is link to MQTT also

paulsteigel commented 2 months ago

Could you provide your complete YAML? I'm a bit confused because you mentioned CAN bus communication and a crash/reboot which isn't part of the log above.

The warning doesn't cause reboots normally. Could you remove the jk_bms and fake some required measurements? I guess there will be the same drop-outs also if the component isn't in place. In other words: I guess your issue is unrelated to the jk_bms component.

When I change Logger back to DEBUG, the following log is printed:

[22:07:49][W][component:237]: Component jk_modbus took a long time for an operation (406 ms).
[22:07:49][W][component:238]: Components should block for at most 30 ms.
[22:07:49][D][uart_debug:114]: <<< A0:00:46:A1:00:3C:A2:00:14:A3:00:46:A4:00:46:A5:FF:EC:A6:FF:F6:A7:FF:EC:A8:FF:F6:A9:10:AA:00:00:01:18:AB:01:AC:01:AD:03:FF:AE:01:AF:00:B0:00:0A:B1:14:B2:32:31:31:30:31:39:39:31:00:00:B3:00:B4:49:6E:70:75:74:20:55:73:B5:32:34:30:34:B6:00:00:51:BC:B7:31:31:2E:58:41:5F:53:31:31:2E:34:37:5F:5F:5F:B8:00:B9:00:00:01:18:BA:49:6E:70:75:74:20:55:73:65:72:64:61:4A:4B:5F:42:31:41:32:30:53:31:35:50:C0:01:00:00:00:00:68:00:00:57:64
[22:07:49][D][switch:012]: 'nktienvphuc Restart_ESP' Turning ON.
[22:07:49][D][switch:055]: 'nktienvphuc Restart_ESP': Sending state OFF
[22:07:49][I][restart:016]: Restarting device...
[22:07:49][I][app:132]: Rebooting safely...
[22:07:49][D][esp32.preferences:114]: Saving 2 preferences to flash...
[22:07:57][W][component:170]: Component mqtt cleared Warning flag
[22:07:57][I][mqtt:274]: MQTT Connected!
[22:08:24][W][component:170]: Component mqtt cleared Warning flag
[22:08:24][I][mqtt:274]: MQTT Connected!
[22:08:39][W][component:170]: Component mqtt cleared Warning flag
[22:08:39][I][mqtt:274]: MQTT Connected!
[22:08:52][W][component:170]: Component mqtt cleared Warning flag
[22:08:52][I][mqtt:274]: MQTT Connected!
[22:09:05][W][component:170]: Component mqtt cleared Warning flag
[22:09:05][I][mqtt:274]: MQTT Connected!

If I restart Mosquito broker, the above log disappeared and the next rounds of modbus appear and after few minutes, Can bus message starts and stops after a few times of modbus blocking message. In short: the esp32 took long time to get connected to MQTT and starts to send can messages in few minutes afterwards. After sometimes when modbus blocks occured, esp32 starts to act very strange and stop sending can message and halt! As I am away from the device, I could not check whether it reset/reboot or not! I can share the web so you can observer the config and log directly on my site. Can you share your email so I can send you the details of my HA server for you to check directly (or can you be so kind sending me an email to ngocdd@sfdp.net)! Best, Ngoc

paulsteigel commented 2 months ago

Could you provide your complete YAML? I'm a bit confused because you mentioned CAN bus communication and a crash/reboot which isn't part of the log above.

The warning doesn't cause reboots normally. Could you remove the jk_bms and fake some required measurements? I guess there will be the same drop-outs also if the component isn't in place. In other words: I guess your issue is unrelated to the jk_bms component.

Hi @Syssi This is a long log i captured from the esp32, I have to set it to Info instead of debug/verbose

[16:43:49][I][main:722]: send can id: 0x359 hex: 0 0 0 0 1 0 0 0
[16:43:49][I][main:807]: Alarm Status : NoAlarm
[16:43:49][I][main:837]: send can id: 0x351 hex: 0 4d 20 3 bc 2 1c c
[16:43:49][I][main:838]: Charge Status : Wait
[16:43:49][I][main:865]: send can id: 0x355 hex: 2e 0 0 0 f8 11 64 5
[16:43:49][I][main:890]: send can id: 0x356 hex: b7 14 ff ff 22 1 0 0
[16:43:49][I][main:912]: send can id: 0x35C hex: c0 0
[16:43:49][I][main:933]: send can id: 0x70 hex: f2 c f3 c 22 1 22 1
[16:43:50][I][main:1127]: send can id: 0x379 hex: 59 2 0 0 0 0 0 0
[16:43:50][I][jk_bms:071]: Status frame received
[16:43:50][W][component:237]: Component jk_modbus took a long time for an operation (219 ms).
[16:43:50][W][component:238]: Components should block for at most 30 ms.
[16:43:50][I][main:1171]: send can id: 0x35E ASCII : PYLON
[16:43:50][I][main:722]: send can id: 0x359 hex: 0 0 0 0 1 0 0 0
[16:43:50][I][main:807]: Alarm Status : NoAlarm
[16:43:50][I][main:837]: send can id: 0x351 hex: 0 4d 20 3 bc 2 1c c
[16:43:50][I][main:838]: Charge Status : Wait
[16:43:50][I][main:865]: send can id: 0x355 hex: 2e 0 0 0 f8 11 64 5
[16:43:51][I][main:890]: send can id: 0x356 hex: b7 14 ff ff 22 1 0 0
[16:43:51][I][main:912]: send can id: 0x35C hex: c0 0
[16:43:51][I][main:933]: send can id: 0x70 hex: f2 c f3 c 22 1 22 1
[16:43:55][I][jk_bms:071]: Status frame received
[16:43:55][W][component:237]: Component jk_modbus took a long time for an operation (237 ms).
[16:43:55][W][component:238]: Components should block for at most 30 ms.
[16:44:00][I][jk_bms:071]: Status frame received
[16:44:00][W][component:237]: Component jk_modbus took a long time for an operation (230 ms).
[16:44:00][W][component:238]: Components should block for at most 30 ms.
[16:44:05][I][jk_bms:071]: Status frame received
[16:44:05][W][component:237]: Component jk_modbus took a long time for an operation (230 ms).
[16:44:05][W][component:238]: Components should block for at most 30 ms.
[16:44:10][I][jk_bms:071]: Status frame received
[16:44:10][W][component:237]: Component jk_modbus took a long time for an operation (235 ms).
[16:44:10][W][component:238]: Components should block for at most 30 ms.
[16:44:15][I][jk_bms:071]: Status frame received
[16:44:15][W][component:237]: Component jk_modbus took a long time for an operation (231 ms).
[16:44:15][W][component:238]: Components should block for at most 30 ms.
[16:44:20][I][jk_bms:071]: Status frame received
[16:44:20][W][component:237]: Component jk_modbus took a long time for an operation (234 ms).
[16:44:20][W][component:238]: Components should block for at most 30 ms.
[16:44:25][I][jk_bms:071]: Status frame received
[16:44:25][W][component:237]: Component jk_modbus took a long time for an operation (236 ms).
[16:44:25][W][component:238]: Components should block for at most 30 ms.
[16:44:30][I][jk_bms:071]: Status frame received
[16:44:30][W][component:237]: Component jk_modbus took a long time for an operation (233 ms).
[16:44:30][W][component:238]: Components should block for at most 30 ms.
[16:44:35][I][jk_bms:071]: Status frame received
[16:44:35][W][component:237]: Component jk_modbus took a long time for an operation (233 ms).
[16:44:35][W][component:238]: Components should block for at most 30 ms.
[16:44:40][I][jk_bms:071]: Status frame received
[16:44:40][W][component:237]: Component jk_modbus took a long time for an operation (234 ms).
[16:44:40][W][component:238]: Components should block for at most 30 ms.
[16:44:45][I][jk_bms:071]: Status frame received
[16:44:45][W][component:237]: Component jk_modbus took a long time for an operation (232 ms).
[16:44:45][W][component:238]: Components should block for at most 30 ms.
[16:44:50][I][jk_bms:071]: Status frame received
[16:44:50][W][component:237]: Component jk_modbus took a long time for an operation (232 ms).
[16:44:50][W][component:238]: Components should block for at most 30 ms.
[16:44:55][I][jk_bms:071]: Status frame received
[16:44:55][W][component:237]: Component jk_modbus took a long time for an operation (231 ms).
[16:44:55][W][component:238]: Components should block for at most 30 ms.
[16:45:00][I][jk_bms:071]: Status frame received
[16:45:00][W][component:237]: Component jk_modbus took a long time for an operation (233 ms).
[16:45:00][W][component:238]: Components should block for at most 30 ms.
[16:45:05][I][jk_bms:071]: Status frame received
[16:45:05][W][component:237]: Component jk_modbus took a long time for an operation (235 ms).
[16:45:05][W][component:238]: Components should block for at most 30 ms.
[16:45:10][I][jk_bms:071]: Status frame received
[16:45:10][W][component:237]: Component jk_modbus took a long time for an operation (232 ms).
[16:45:10][W][component:238]: Components should block for at most 30 ms.
[16:45:15][I][jk_bms:071]: Status frame received
[16:45:15][W][component:237]: Component jk_modbus took a long time for an operation (233 ms).
[16:45:15][W][component:238]: Components should block for at most 30 ms.
[16:45:20][I][jk_bms:071]: Status frame received
[16:45:20][W][component:237]: Component jk_modbus took a long time for an operation (232 ms).
[16:45:20][W][component:238]: Components should block for at most 30 ms.
[16:45:25][I][jk_bms:071]: Status frame received
[16:45:25][W][component:237]: Component jk_modbus took a long time for an operation (234 ms).
[16:45:25][W][component:238]: Components should block for at most 30 ms.
[16:45:30][I][jk_bms:071]: Status frame received
[16:45:30][W][component:237]: Component jk_modbus took a long time for an operation (232 ms).
[16:45:30][W][component:238]: Components should block for at most 30 ms.
[16:45:35][I][jk_bms:071]: Status frame received
[16:45:35][W][component:237]: Component jk_modbus took a long time for an operation (235 ms).
[16:45:35][W][component:238]: Components should block for at most 30 ms.
[16:45:40][I][jk_bms:071]: Status frame received
[16:45:40][W][component:237]: Component jk_modbus took a long time for an operation (231 ms).
[16:45:40][W][component:238]: Components should block for at most 30 ms.
[16:45:45][I][jk_bms:071]: Status frame received
[16:45:45][W][component:237]: Component jk_modbus took a long time for an operation (234 ms).
[16:45:45][W][component:238]: Components should block for at most 30 ms.
[16:45:49][I][main:1127]: send can id: 0x379 hex: 59 2 0 0 0 0 0 0
[16:45:49][I][main:1171]: send can id: 0x35E ASCII : PYLON
[16:45:49][I][main:722]: send can id: 0x359 hex: 0 0 0 0 1 0 0 0
[16:45:49][I][main:807]: Alarm Status : NoAlarm
[16:45:49][I][main:837]: send can id: 0x351 hex: 0 4d 20 3 bc 2 1c c
[16:45:49][I][main:838]: Charge Status : Wait
[16:45:49][I][main:865]: send can id: 0x355 hex: 2e 0 0 0 f8 11 64 5
[16:45:50][I][main:890]: send can id: 0x356 hex: b7 14 fd ff 22 1 0 0
[16:45:50][I][main:912]: send can id: 0x35C hex: c0 0
[16:45:50][I][jk_bms:071]: Status frame received
[16:45:50][W][component:237]: Component jk_modbus took a long time for an operation (232 ms).
[16:45:50][W][component:238]: Components should block for at most 30 ms.
[16:45:50][I][main:933]: send can id: 0x70 hex: f3 c f3 c 22 1 22 1
[16:45:50][I][main:1127]: send can id: 0x379 hex: 59 2 0 0 0 0 0 0
[16:45:51][I][main:1171]: send can id: 0x35E ASCII : PYLON
[16:45:51][I][main:722]: send can id: 0x359 hex: 0 0 0 0 1 0 0 0
[16:45:51][I][main:807]: Alarm Status : NoAlarm
[16:45:51][I][main:837]: send can id: 0x351 hex: 0 4d 20 3 bc 2 1c c
[16:45:51][I][main:838]: Charge Status : Wait
[16:45:51][I][main:865]: send can id: 0x355 hex: 2e 0 0 0 f8 11 64 5
[16:45:51][I][main:890]: send can id: 0x356 hex: b7 14 ff ff 22 1 0 0
[16:45:55][I][jk_bms:071]: Status frame received
[16:45:55][W][component:237]: Component jk_modbus took a long time for an operation (233 ms).
[16:45:55][W][component:238]: Components should block for at most 30 ms.
[16:46:00][I][jk_bms:071]: Status frame received
[16:46:00][W][component:237]: Component jk_modbus took a long time for an operation (235 ms).
[16:46:00][W][component:238]: Components should block for at most 30 ms.
[16:46:05][I][jk_bms:071]: Status frame received
[16:46:05][W][component:237]: Component jk_modbus took a long time for an operation (232 ms).
[16:46:05][W][component:238]: Components should block for at most 30 ms.
[16:46:10][I][jk_bms:071]: Status frame received
[16:46:10][W][component:237]: Component jk_modbus took a long time for an operation (234 ms).
[16:46:10][W][component:238]: Components should block for at most 30 ms.
[16:46:15][I][jk_bms:071]: Status frame received
[16:46:15][W][component:237]: Component jk_modbus took a long time for an operation (240 ms).
[16:46:15][W][component:238]: Components should block for at most 30 ms.
[16:46:20][I][jk_bms:071]: Status frame received
[16:46:20][W][component:237]: Component jk_modbus took a long time for an operation (231 ms).
[16:46:20][W][component:238]: Components should block for at most 30 ms.
[16:46:25][I][jk_bms:071]: Status frame received
[16:46:25][W][component:237]: Component jk_modbus took a long time for an operation (232 ms).
[16:46:25][W][component:238]: Components should block for at most 30 ms.

At the last log, the device stop responding and I have to restart by pressing reset. The status frame is received but there may be something else causing the chip be stopped and CAN bus can not be activated for sending can message with interval setting! Can you please be so kind in helping me with thanks! Best, Ngoc