libretiny-eu / libretiny

PlatformIO development platform for IoT modules
http://docs.libretiny.eu/
MIT License

RTL8710BX - authentication error when updating OTA from 2023.5.0-dev to 2023.5.0-dev #142

Closed mihsu81 closed 11 months ago

mihsu81 commented 1 year ago

Hi @kuba2k2,

I'm getting the below error when trying to update OTA from libretiny 2023.5.0-dev to libretiny 2023.5.0-dev:

INFO Reading configuration /config/libretuya-esphome/plug-ezviz.yaml...
INFO Detected timezone 'Europe/Bucharest'
INFO Generating C++ source...
INFO Compiling app...
Processing plug-ezviz (board: generic-rtl8710bx-4mb-980k; framework: arduino; platform: https://github.com/kuba2k2/libretiny.git)
--------------------------------------------------------------------------------
HARDWARE: RTL8710BX 62MHz, 256KB RAM, 980KB Flash
 - framework-arduino-api @ 2022.8.24+sha.237b10a 
 - framework-realtek-amb1 @ 0.0.0+v2022.06.21.sha.c4e44ef 
 - library-flashdb @ 1.2.0+sha.d5c892f 
 - library-freertos @ 8.1.2+sha.776ae6c 
 - library-freertos-port @ 2023.3.13+sha.bd96e82 
 - library-lwip @ 2.1.3-amb1+sha.6297b80 
 - library-printf @ 6.1.0+sha.28a79bd 
 - tool-ltchiptool @ 4.0.0+sha.788ba4e 
PLATFORM VERSIONS:
 - libretiny @ 1.0.2+sha.5c4da6e
 - ltchiptool @ 4.2.2
CUSTOM OPTIONS:
 - fw_name = esphome
 - fw_version = 2023.5.0-dev
Dependency Graph
|-- ESPAsyncWebServer-esphome @ 3.0.0+sha.9f822c0
|   |-- AsyncTCP-esphome @ 2.0.0+sha.aab1fe4
|-- DNSServer @ 1.1.0
|-- noise-c @ 0.1.4
|   |-- libsodium @ 1.10018.1
|-- ArduinoJson @ 6.18.5
RAM:   [==        ]  20.6% (used 54068 bytes from 262144 bytes)
Flash: [=====     ]  52.3% (used 524804 bytes from 1003520 bytes)
========================= [SUCCESS] Took 9.34 seconds =========================
INFO Successfully compiled program.
INFO Resolving IP address of plug-ezviz.local
INFO  -> 192.168.45.123
INFO Uploading /data/plug-ezviz/.pioenvs/plug-ezviz/firmware.uf2 (1050624 bytes)

ERROR Error auth result: Error: Authentication invalid. Is the password correct?
ERROR Backend error code: 0x0000

The OTA update from libretuya 2023.1.0-dev to libretiny 2023.5.0-dev succeeded. With the libretiny 2023.5.0-dev firmware, the device also resets whenever I try to access its web server. HA still receives data from it, but I can't control the relay.

I'm attaching the working libretuya 2023.1.0-dev firmware and the faulty libretiny 2023.5.0-dev firmware: plug-ezviz_2023.1.0-dev.uf2.zip, plug-ezviz_2023.5.0-dev.uf2.zip. Here is the config file:

substitutions:
  devicename: plug_ezviz
  friendly_name: Plug EZVIZ
  device_description: EZVIZ T31 Smart Plug CS-T31-16B-EU - RTL8710BX-A0-CG
  current_res: "0.001" #'R001 - shunt resistor'
  voltage_div: "1891" # ('4703'x4+'01C'+'01B')/'01B' - (470Kohmx4+10Kohm+1Kohm)/1Kohm - (R_upstream + R_downstream) / R_downstream
  current_cal_meas1: "0.131"
  current_cal_real1: "0.052"
  voltage_cal_meas1: "208.8"
  voltage_cal_real1: "235.0"
  power_cal_meas1: "30.83"
  power_cal_real1: "33.9"
  power_cal_meas2: "12.28"
  power_cal_real2: "18.0"

esphome:
  name: plug-ezviz
#  friendly_name: ${friendly_name}
#  platformio_options:
#    monitor_speed: 115200
#    monitor_filters: rtl_hard_fault_decoder

#socket:
#  implementation: lwip_tcp

libretiny:
  board: generic-rtl8710bx-4mb-980k
  framework:
    version: dev

preferences:
  flash_write_interval: 1min

logger:
  level: INFO

api:
  reboot_timeout: 1h
  encryption:
    key: !secret api_encryption_key

ota:
  safe_mode: true
  password: !secret ota_password

web_server:
  port: 80
  auth:
    username: !secret username
    password: !secret password
  include_internal: true

time:
  - platform: homeassistant
    id: homeassistant_time

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  fast_connect: true
  reboot_timeout: 10s 
  ap:
    ssid: "${friendly_name} Fallback"
    #password: !secret esphome_fallback_password

captive_portal:

text_sensor:
  - platform: wifi_info
    ip_address:
      name: ${friendly_name} IP Address
    ssid:
      name: ${friendly_name} Connected SSID
    bssid:
      name: ${friendly_name} Connected BSSID
    mac_address:
      name: ${friendly_name} MAC Wifi Address

  - platform: version
    name: ${friendly_name} ESPHome Version

button:
  - platform: restart
    name: ${friendly_name} Restart
    id: ${devicename}_reset
    entity_category: diagnostic

  - platform: factory_reset
    name: ${friendly_name} Factory Reset
    id: ${devicename}_fatory_reset
    internal: true
    entity_category: diagnostic

  - platform: safe_mode
    name: ${friendly_name} Safe Mode Boot
    id: ${devicename}_safe_mode_boot
    entity_category: diagnostic

sensor:
  - platform: uptime
    name: ${friendly_name} Device Uptime
    id: ${devicename}_device_uptime

  - platform: wifi_signal
    name: ${friendly_name} Wifi Signal
    update_interval: 60s
    id: ${devicename}_wifi_signal

  - platform: hlw8012
#    model: HLW8012
    current_resistor: ${current_res}
    voltage_divider: ${voltage_div}
    sel_pin:
      number: PA22
      inverted: false
    cf_pin:
      number: PA18
      inverted: false
#      mode:
#        input: true
#        pullup: true
    cf1_pin:
      number: PA19
      inverted: false
#      mode:
#        input: true
#        pullup: true  
    change_mode_every: 1
    update_interval: 5s
    current:
      id: ${devicename}_amperage
      name: ${friendly_name} Amperage
      unit_of_measurement: A
      filters:
        - offset: -9.0
        - calibrate_linear:
          - 0.0 -> 0.0
          - ${current_cal_meas1} -> ${current_cal_real1}
      accuracy_decimals: 3
      icon: mdi:alpha-a-circle-outline
    voltage:
      id: ${devicename}_voltage
      name: ${friendly_name} Voltage
      filters:
        - calibrate_linear:
          - 0.0 -> 0.0
          - ${voltage_cal_meas1} -> ${voltage_cal_real1}
      unit_of_measurement: V
      accuracy_decimals: 2
      icon: mdi:alpha-v-circle-outline
    power:
      id: ${devicename}_wattage
      name: ${friendly_name} Wattage
      unit_of_measurement: W
      filters:
        - calibrate_linear:
          - 0.0 -> 0.0
#          - ${power_cal_meas1} -> ${power_cal_real1}
          - ${power_cal_meas2} -> ${power_cal_real2}
      accuracy_decimals: 2
      icon: mdi:alpha-w-circle-outline
    energy:
      id: ${devicename}_energy
      name: ${friendly_name} Energy
      unit_of_measurement: Wh
      accuracy_decimals: 3
      icon: mdi:circle-slice-2

  - platform: total_daily_energy
    id: ${devicename}_total_daily_energy
    name: ${friendly_name} Total Daily Energy
    icon: mdi:circle-slice-3
    power_id: ${devicename}_wattage
    filters:
      - multiply: 0.001
    unit_of_measurement: kWh
    accuracy_decimals: 3
    restore: true

binary_sensor:
  - platform: status
    name: "${friendly_name} Status"

  - platform: gpio
    device_class: power
    pin:
      number: PA14
      mode:
        input: true
        pullup: true
      inverted: true
    name: ${friendly_name} Power Button
    id: ${devicename}_button1
    disabled_by_default: true
    on_multi_click:
      - timing:
          - ON for at most 1s
          - OFF for at least 0.2s
        then:
          - switch.toggle: ${devicename}_led2
          - switch.toggle: ${devicename}_relay
      - timing:
          - ON for at least 5s
        then:
          - button.press: ${devicename}_reset
  - platform: gpio
    pin:
      number: PA00
      mode:
        input: true
        pullup: false
      inverted: false
    name: ${friendly_name} Reset Button
    id: ${devicename}_button2
#    on_press:
#      - switch.toggle: ${devicename}_reset

switch:
  - platform: restart
    name: "${friendly_name} Restart"
    id: ${devicename}_restart
    internal: true

  - platform: gpio
    name: ${friendly_name} Power LED
    pin: PA23
    id: ${devicename}_led2
    inverted: false
    restore_mode: RESTORE_DEFAULT_ON
    internal: true
  - platform: gpio
    name: ${friendly_name}
    pin: PA12
    id: ${devicename}_relay
    restore_mode: RESTORE_DEFAULT_ON
    on_turn_on:
      - switch.turn_on: ${devicename}_led2
    on_turn_off:
      - switch.turn_off: ${devicename}_led2

status_led:
  pin:
    number: PA05
    inverted: false

I was able to update a CB2S just fine.

kuba2k2 commented 1 year ago

Well, are you sure the password is correct? This error means that the firmware currently running on the device has a different OTA password than the one you're trying to upload.

mihsu81 commented 1 year ago

I'm very sure the password is correct. I'm using the same secrets.yaml for both ESPHome and LibreTiny ESPHome, and this is the only device where I have this issue. Also, when I upgraded from libretuya to libretiny, OTA worked fine. And a wrong OTA password wouldn't explain why the device reboots when I try to access its web server or toggle the relay.

kuba2k2 commented 1 year ago

You're right, it doesn't explain that. I'm afraid your best (only?) option would be to serial-flash it, possibly checking the logs before doing that (to find the cause of these issues). I've just upgraded my RTL8710BN from LibreTuya (built in April) to LibreTiny (built today), and then did a few following OTA updates, with no issues. However, I don't have OTA password set. But the web server (and toggling relays) works just fine, as it did before.

mihsu81 commented 1 year ago

I did flash the uf2 file (the initial one and a recreated one) over serial and the issue persists. Could it have something to do with the bootloader? My device is an EZVIZ plug, not a Tuya one. You previously had to make a few changes to make it flashable.

kuba2k2 commented 1 year ago

Did you also flash the LibreTuya one (old version)? The one that you had no problems with (from what I understand).

mihsu81 commented 1 year ago

Yes. Flashing the old one over serial works fine; flashing the new one over serial causes the issues described above.

kuba2k2 commented 1 year ago

Do this:

mihsu81 commented 1 year ago

Unfortunately I'm not able to get UART logs because it's a power plug on 220V. Attached: plug-ezviz_firmware.zip. Even setting the logger to VERY_VERBOSE didn't produce any logs related to the issue (logs.txt).

kuba2k2 commented 1 year ago

Don't grab logs when it's on 220V, do it just like you're flashing it.

Power it from the flasher (or some external power supply), it should work just like it does on 220V (well, maybe the relay won't click, but apart from this there should be no difference).

mihsu81 commented 1 year ago

Thank you for the suggestion :) Trying to access the web server produces the error below, after which the device reboots:

[11:14:43]RTL8195A[HAL]: Hard Fault Error!!!!
[11:14:43]RTL8195A[HAL]: R0 = 0x1
[11:14:43]RTL8195A[HAL]: R1 = 0x50
[11:14:43]RTL8195A[HAL]: R2 = 0x0
[11:14:43]RTL8195A[HAL]: R3 = 0x10005bbc
[11:14:43]RTL8195A[HAL]: R12 = 0x80000000
[11:14:43]RTL8195A[HAL]: LR = 0x812f30b
[11:14:43]RTL8195A[HAL]: PC = 0x0
[11:14:43]RTL8195A[HAL]: PSR = 0x20000000
[11:14:43]RTL8195A[HAL]: BFAR = 0xe000ed38
[11:14:43]RTL8195A[HAL]: CFSR = 0x20000
[11:14:43]RTL8195A[HAL]: HFSR = 0x40000000
[11:14:43]RTL8195A[HAL]: DFSR = 0x0
[11:14:43]RTL8195A[HAL]: AFSR = 0x0
[11:14:43]RTL8195A[HAL]: PriMask 0x0
[11:14:43]RTL8195A[HAL]: BasePri 0x0
[11:14:43]RTL8195A[HAL]: SVC priority: 0x00
[11:14:43]RTL8195A[HAL]: PendSVC priority: 0xf0
[11:14:43]RTL8195A[HAL]: Systick priority: 0xf0

Here are the full logs: crash_log.txt

I'm also reattaching the ELF and UF2: uf2 + elf.zip

P.S. Removing the OTA password allowed me to flash more firmware OTA.
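As a reading aid for the hard fault dump above, here is a minimal sketch that decodes the CFSR and HFSR values. It assumes the standard ARMv7-M (Cortex-M3/M4) bit assignments and that the fault handler prints the registers unmodified; the tables and the `decode` helper are illustrative and only cover the common flags.

```python
# Bit names follow the ARMv7-M fault status registers. Assumption: the
# RTL8710BX fault handler prints CFSR/HFSR exactly as read from the core.
CFSR_BITS = {
    0: "IACCVIOL", 1: "DACCVIOL", 3: "MUNSTKERR", 4: "MSTKERR", 7: "MMARVALID",
    8: "IBUSERR", 9: "PRECISERR", 10: "IMPRECISERR", 11: "UNSTKERR",
    12: "STKERR", 15: "BFARVALID",
    16: "UNDEFINSTR", 17: "INVSTATE", 18: "INVPC", 19: "NOCP",
    24: "UNALIGNED", 25: "DIVBYZERO",
}
HFSR_BITS = {1: "VECTTBL", 30: "FORCED", 31: "DEBUGEVT"}


def decode(value, table):
    """Return the names of all flags from `table` that are set in `value`."""
    return [name for bit, name in sorted(table.items()) if value & (1 << bit)]


# Values copied from the crash log above.
print("CFSR:", decode(0x20000, CFSR_BITS))
print("HFSR:", decode(0x40000000, HFSR_BITS))
```

With the logged values this reports INVSTATE and FORCED. Together with PC = 0x0, that pattern usually points to a branch through a null or corrupted function pointer, though the dump alone does not prove it. The logged LR (0x812f30b) could probably be resolved to a function name with a tool such as arm-none-eabi-addr2line against the attached ELF.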

hn commented 1 year ago

@kuba2k2 It looks to me like the two Solis S3 stick users with OTA problems did not initially flash the UF2 image (due to the ltchiptool problems for EMW3080) but the individual 0xB000-OTA1 and 0x100000-OTA2 images.

Could it be that in this case one has to zero out a certain memory area (the one for passwords?) so that it is initialised correctly?

kuba2k2 commented 1 year ago

There shouldn't be any difference, as the UF2 is really just a wrapper for the underlying OTA1/2 images. However, only one of them is flashed with the UF2, not both (as that would overwrite the currently running binary, in case of OTA). That shouldn't cause any issues, as the OTA1/2 firmware doesn't care about the other region.

There's also no specific memory area for passwords. It's just hardcoded somewhere in the program's code.
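To make the "UF2 is just a wrapper" point concrete, here is a rough sketch of reading one block header as laid out in the generic UF2 specification (512-byte blocks, 32-byte header, up to 476 payload bytes, trailing magic). The helper names `make_uf2_block` and `parse_uf2_block` are hypothetical, and how LibreTiny tags the OTA1/OTA2 payloads inside those blocks is not covered here.

```python
import struct

UF2_MAGIC_START0 = 0x0A324655  # ASCII "UF2\n", little-endian
UF2_MAGIC_START1 = 0x9E5D5157
UF2_MAGIC_END = 0x0AB16F30
BLOCK_SIZE = 512
MAX_PAYLOAD = 476


def make_uf2_block(address, payload, block_no, num_blocks, family_id=0):
    """Build a synthetic block for demonstration (hypothetical helper)."""
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload too large for one UF2 block")
    header = struct.pack(
        "<8I", UF2_MAGIC_START0, UF2_MAGIC_START1, 0,
        address, len(payload), block_no, num_blocks, family_id,
    )
    data = payload.ljust(MAX_PAYLOAD, b"\x00")
    return header + data + struct.pack("<I", UF2_MAGIC_END)


def parse_uf2_block(block):
    """Validate the magic numbers and return the header fields of one block."""
    if len(block) != BLOCK_SIZE:
        raise ValueError("a UF2 block is exactly 512 bytes")
    m0, m1, flags, address, size, block_no, num_blocks, family_id = \
        struct.unpack_from("<8I", block, 0)
    (m_end,) = struct.unpack_from("<I", block, BLOCK_SIZE - 4)
    if (m0, m1, m_end) != (UF2_MAGIC_START0, UF2_MAGIC_START1, UF2_MAGIC_END):
        raise ValueError("bad UF2 magic")
    return {
        "flags": flags, "address": address, "block_no": block_no,
        "num_blocks": num_blocks, "family_id": family_id,
        "payload": block[32:32 + size],
    }


info = parse_uf2_block(make_uf2_block(0x1000, b"\x01\x02", 0, 1))
print(hex(info["address"]), info["payload"])
# The upload in the first post was 1050624 bytes, a whole number of blocks.
print(1050624 // BLOCK_SIZE, 1050624 % BLOCK_SIZE)
```

Each block carries its own target address, so a flasher can write only the blocks that belong to the region it wants to update.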

mihsu81 commented 11 months ago

Hi @kuba2k2, any idea how to solve this issue?

kuba2k2 commented 11 months ago

Sorry, no idea. It works for me, but I'm not using the OTA password. If you can, try simply removing the password or updating via web_server.

mihsu81 commented 11 months ago

> Sorry, no idea. It works for me, but I'm not using the OTA password. If you can, try simply removing the password or updating via web_server.

Yeah, I removed the OTA password. My question was about the web server crashing when I try to access it. I posted the crash logs earlier.

hn commented 11 months ago

The OTA component does challenge-response auth using an MD5 hash of pw-nonce-cnonce.

ESPHome Python CLI log:

DEBUG Auth: Nonce is 9c5b9e83a10ca30fb7daf79ad439cc55
DEBUG Auth: CNonce is f608dd3ff1d7ae65b0541c8f59bdc486
DEBUG Auth: Result is fd4cc756e588ff804e62808cdd895f20

MCU log:

[D][ota:147]: Starting OTA Update ...
[D][ota:178]: OTA features is 0x01
[D][ota:199]: Auth: Nonce is 9c5b9e83a10ca30fb7daf79ad439cc55
[D][ota:209]: Auth: Password is e6d69b938c8b96d8436ef30eb219a7e6 with length 32
[D][ota:220]: Auth: CNonce is f608dd3ff1d7ae65b0541c8f59bdc486
[D][ota:227]: Auth: Result is b1a7a51c425723ddae43a77d33c9f525
[D][ota:235]: Auth: Response is fd4cc756e588ff804e62808cdd895f20
[W][ota:242]: Auth failed! Passwords do not match!

The MCU does not calculate the resulting hash correctly (Result is b1a7a... is wrong). I double-checked with a small Python script:

import hashlib

m = hashlib.md5()
m.update(b"e6d69b938c8b96d8436ef30eb219a7e6") #pw
m.update(b"9c5b9e83a10ca30fb7daf79ad439cc55") #nonce
m.update(b"f608dd3ff1d7ae65b0541c8f59bdc486") #cnonce
r = m.hexdigest()
print(r)        # = fd4cc756e588ff804e62808cdd895f20

I traced the md5.add steps within ota_component.cpp: the hash of the password alone is correct (step 1), but once the nonce is added (step 2) the hash differs, and so step 3 differs as well.
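The step-by-step comparison described above can be reproduced on the host side with a short sketch like the one below. The `md5_trace` helper is hypothetical; it prints the digest after each part is added, so each intermediate value can be compared against whatever the firmware logs at the same step.

```python
import hashlib


def md5_trace(parts):
    """Return the hex digest after each successive update (hypothetical helper)."""
    m = hashlib.md5()
    digests = []
    for part in parts:
        m.update(part)
        # hashlib allows reading the digest and then continuing to update.
        digests.append(m.hexdigest())
    return digests


# Values copied from the logs above: password hash, nonce, cnonce.
parts = [
    b"e6d69b938c8b96d8436ef30eb219a7e6",
    b"9c5b9e83a10ca30fb7daf79ad439cc55",
    b"f608dd3ff1d7ae65b0541c8f59bdc486",
]
steps = md5_trace(parts)
for label, digest in zip(("pw", "pw+nonce", "pw+nonce+cnonce"), steps):
    print(f"{label:16s} {digest}")
```

The first step that disagrees with the device output narrows down where the MCU-side MD5 context goes wrong.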

hn commented 11 months ago

If I use a hammer (I don't know how to do it via a configure option) to replace LT_ARD_MD5_POLARSSL with LT_ARD_MD5_MBEDTLS in various places, hash calculation and OTA work:

[D][ota:178]: OTA features is 0x01
[D][ota:200]: Auth: Nonce is 4cc9b59675dcc12f450cb02531a70649
[D][ota:210]: Auth: Password is e6d69b938c8b96d8436ef30eb219a7e6 with length 32
[D][ota:225]: Auth: CNonce is f608dd3ff1d7ae65b0541c8f59bdc486
[D][ota:232]: Auth: Result is a20fcc5cbd78886d5517aabd368753f5
[D][ota:240]: Auth: Response is a20fcc5cbd78886d5517aabd368753f5
[D][ota:268]: OTA size is 1263104 bytes

I am still confused why this error occurs. Calculating the hash value of the password or nonce alone seems to work, also different consecutive combinations of md5.add give a correct result, but the OTA password check does not.

kuba2k2 commented 11 months ago

I suspected a problem with the MD5 implementation, but I haven't had time today to try to fix it. The only place you need to change that define is in lt_defs.h. If changing it there from polarssl to mbedtls works, it's a fix.

It might be caused by memory allocation of the md5 context... I'm not sure how that works exactly.

hn commented 11 months ago

I pushed https://github.com/hn/ginlong-solis/commit/2b9af93930eaaf06a4d3d1bbe940f7a23969b7d5 as a preliminary fix. Would be nice to know why Polar SSL does not work correctly.

kuba2k2 commented 11 months ago

It's probably because parts of it belong in the ROM area (which probably uses an old version or something). In LT, both implementations call the same functions from PolarSSL and mbedTLS, respectively (PolarSSL was renamed to mbed TLS a long time ago).

Since it fixes the issue (we don't want to use polar anyway; everything else like WiFiClientSecure uses mbed already), you can create a PR for that if you'd like.

hn commented 11 months ago

Thanks, I'll double-check with a clean install and create a PR then.

kuba2k2 commented 11 months ago

Fixed in #156. Thanks @hn for the fix!