espressif / esp-idf

Espressif IoT Development Framework. Official development framework for Espressif SoCs.
Apache License 2.0
Unable To Join ZigBee network (IDFGH-10161) #11432

Closed Mikeyu1234 closed 1 year ago

Mikeyu1234 commented 1 year ago

IDF version.

v5.2-dev-544-g54576b7528

Operating System used.

macOS

How did you build your project?

Command line with idf.py

If you are using Windows, please specify command line type.

None

Development Kit.

ESP32-C6-WROOM-1

Power Supply used.

USB

What is the expected behavior?

I have two boards running the HA_on_off_light example; they should be able to join a Zigbee network.

What is the actual behavior?

The Zigbee hub cannot find the ESP32, and the log shows "Network steering was not successful". I did a full flash erase before reflashing the firmware, but it still does not work.

Steps to reproduce.

  1. cd examples/zigbee/light_sample/HA_on_off_light
  2. idf.py set-target esp32c6
  3. idf.py build
  4. idf.py erase-flash flash monitor

Debug Logs.

I (23) boot: ESP-IDF v5.2-dev-544-g54576b7528 2nd stage bootloader
I (23) boot: compile time May 18 2023 21:43:47
I (24) boot: chip revision: v0.0
I (27) boot.esp32c6: SPI Speed      : 40MHz
I (32) boot.esp32c6: SPI Mode       : DIO
I (37) boot.esp32c6: SPI Flash Size : 2MB
I (42) boot: Enabling RNG early entropy source...
W (47) bootloader_random: bootloader_random_enable() has not been implemented yet
I (55) boot: Partition Table:
I (59) boot: ## Label            Usage          Type ST Offset   Length
I (66) boot:  0 nvs              WiFi data        01 02 00009000 00006000
I (73) boot:  1 phy_init         RF data          01 01 0000f000 00001000
I (81) boot:  2 factory          factory app      00 00 00010000 00089800
I (88) boot:  3 zb_storage       Unknown data     01 81 0009a000 00004000
I (96) boot:  4 zb_fct           Unknown data     01 81 0009e000 00000400
I (103) boot: End of partition table
I (108) esp_image: segment 0: paddr=00010020 vaddr=42058020 size=0d200h ( 53760) map
I (127) esp_image: segment 1: paddr=0001d228 vaddr=40800000 size=02df0h ( 11760) load
I (131) esp_image: segment 2: paddr=00020020 vaddr=42000020 size=56158h (352600) map
I (205) esp_image: segment 3: paddr=00076180 vaddr=40802df0 size=0c2f8h ( 49912) load
I (220) boot: Loaded app from partition at offset 0x10000
I (220) boot: Disabling RNG early entropy source...
W (221) bootloader_random: bootloader_random_enable() has not been implemented yet
I (240) cpu_start: Unicore app
I (240) cpu_start: Pro cpu up.
W (249) clk: esp_perip_clk_init() has not been implemented yet
I (255) cpu_start: Pro cpu start user code
I (256) cpu_start: cpu freq: 160000000 Hz
I (256) cpu_start: Application information:
I (259) cpu_start: Project name:     light_bulb
I (264) cpu_start: App version:      v5.2-dev-544-g54576b7528
I (270) cpu_start: Compile time:     May 18 2023 21:43:43
I (276) cpu_start: ELF file SHA256:  11781f1521001807...
I (282) cpu_start: ESP-IDF:          v5.2-dev-544-g54576b7528
I (289) cpu_start: Min chip rev:     v0.0
I (293) cpu_start: Max chip rev:     v0.99 
I (298) cpu_start: Chip rev:         v0.0
I (303) heap_init: Initializing. RAM available for dynamic allocation:
I (310) heap_init: At 408153B0 len 00067260 (412 KiB): D/IRAM
I (317) heap_init: At 4087C610 len 00002F54 (11 KiB): STACK/DIRAM
I (323) heap_init: At 50000010 len 00003FF0 (15 KiB): RTCRAM
I (330) spi_flash: detected chip: generic
I (334) spi_flash: flash io: dio
W (338) spi_flash: Detected size(8192k) larger than the size in the binary image header(2048k). Using the size in the binary image header.
I (351) sleep: Configure to isolate all GPIO pins in sleep state
I (358) sleep: Enable automatic switching of GPIO sleep configuration
I (365) coexist: coex firmware version: ebddf30
I (371) coexist: coexist rom version 5b8dcfa
I (376) app_start: Starting scheduler on CPU0
I (380) main_task: Started on CPU0
I (380) main_task: Calling app_main()
I (390) gpio: GPIO[8]| InputEn: 0| OutputEn: 1| OpenDrain: 0| Pullup: 1| Pulldown: 0| Intr:0 
I (390) phy_init: phy_version 200,d1caf30,Apr 10 2023,17:19:22
W (400) phy_init: failed to load RF calibration data (0x1102), falling back to full calibration
I (470) main_task: Returned from app_main()
I (570) ESP_ZB_ON_OFF_LIGHT: ZDO signal: 23, status: -1
I (570) ESP_ZB_ON_OFF_LIGHT: Zigbee stack initialized
I (570) ESP_ZB_ON_OFF_LIGHT: Start network steering
I (3230) ESP_ZB_ON_OFF_LIGHT: Network steering was not successful (status: -1)
I (6900) ESP_ZB_ON_OFF_LIGHT: Network steering was not successful (status: -1)
I (10570) ESP_ZB_ON_OFF_LIGHT: Network steering was not successful (status: -1)
I (14240) ESP_ZB_ON_OFF_LIGHT: Network steering was not successful (status: -1)

More Information.

This issue is similar to #10662, which occurred with the ON_OFF_SWITCH demo; this one occurs with the LIGHT demo.

Mikeyu1234 commented 1 year ago

Additionally, I found this issue: "ESP32-C6 HA_on_off_light example not connecting to zigbee network. (IDFGH-10027)" #11304. It states that there is a driver issue. But I used the latest IDF, v5.2-dev-544-g54576b7528. Has the problem been reintroduced?

chshu commented 1 year ago

Which coordinator are you using for your test?

In our examples, HA_on_off_switch is the coordinator. Could you try running HA_on_off_switch on one C6 board and HA_on_off_light on a second C6? The light should then be able to join the Zigbee network formed by the switch.

Mikeyu1234 commented 1 year ago

Hi chshu, thanks for replying. I'm trying to use a third-party Zigbee hub as the coordinator. Does this log suggest that the C6 currently does not support Zigbee channel steering, and that I have to set the channel manually?

chshu commented 1 year ago

@Mikeyu1234 Do you know whether the third-party Zigbee hub supports the pre-configured global link key ZigBeeAlliance09? Our examples use this link key by default.

Some Zigbee 3.0 hubs only support the install code method, which is more secure. In that case, Zigbee devices cannot join these hubs without some preliminary steps, such as scanning a QR code.
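For reference, the default global link key mentioned above is just the 16 ASCII characters of the string `ZigBeeAlliance09`. The sketch below is a host-side illustration only, not code from the examples: the helper names `default_tc_link_key` and `link_key_to_hex` are hypothetical and it does not call any ESP Zigbee API. It shows the byte values a hub would need to accept for this kind of join.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define ZB_LINK_KEY_LEN 16

/* Well-known default global trust center link key, as ASCII text. */
static const char DEFAULT_TC_LINK_KEY_ASCII[] = "ZigBeeAlliance09";

/* Hypothetical helper: copy the key into a 16-byte buffer.
 * Returns 0 on success, -1 if the string is not exactly 16 characters. */
int default_tc_link_key(uint8_t out[ZB_LINK_KEY_LEN])
{
    if (strlen(DEFAULT_TC_LINK_KEY_ASCII) != ZB_LINK_KEY_LEN) {
        return -1;
    }
    memcpy(out, DEFAULT_TC_LINK_KEY_ASCII, ZB_LINK_KEY_LEN);
    return 0;
}

/* Hypothetical helper: format a key as colon-separated upper-case hex.
 * buf must hold at least 48 bytes (32 hex digits, 15 colons, terminator).
 * Returns 0 on success, -1 if buf is too small. */
int link_key_to_hex(const uint8_t key[ZB_LINK_KEY_LEN], char *buf, size_t buf_len)
{
    if (buf_len < ZB_LINK_KEY_LEN * 3) {
        return -1;
    }
    for (size_t i = 0; i < ZB_LINK_KEY_LEN; i++) {
        /* No separator after the last byte. */
        const char *sep = (i + 1 < ZB_LINK_KEY_LEN) ? ":" : "";
        snprintf(buf + i * 3, buf_len - i * 3, "%02X%s", (unsigned)key[i], sep);
    }
    return 0;
}
```

Comparing the hex string produced by `link_key_to_hex` with the key configured on the hub (if the hub exposes it) is one way to check whether both sides use the same pre-configured key. A hub that only accepts install codes will reject the join regardless.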

ildus commented 1 year ago

I have the same problem now on the master branch. Checked using two esp32-h2-mini-1 modules. Operating system is Linux.

chshu commented 1 year ago

@ildus Could you try the HA_on_off_switch and HA_on_off_light examples on two H2 modules? The two examples should work as is.

To join a third-party Zigbee hub, you need to check which security policy the coordinator supports:

ildus commented 1 year ago

@chshu Yes, the problem was with HA_on_off_switch and HA_on_off_light. But eventually it started to work (I think erase-flash helped). There is still an error in the light example: it restarts the first time, but after the reboot it connects to the second module. The error info:

I (318) spi_flash: detected chip: generic
I (322) spi_flash: flash io: dio
W (326) spi_flash: Detected size(4096k) larger than the size in the binary image header(2048k). Using the size in the binary image header.
I (339) app_start: Starting scheduler on CPU0
I (344) main_task: Started on CPU0
I (344) main_task: Calling app_main()
W (354) rmt: channel resolution loss, real=10666666
I (354) gpio: GPIO[8]| InputEn: 0| OutputEn: 1| OpenDrain: 0| Pullup: 1| Pulldown: 0| Intr:0
I (384) phy: phy_version: 200,0, 1cef4f4, May 22 2023, 11:57:13

assert failed: ieee802154_isr esp_ieee802154_dev.c:463 (s_ieee802154_state == IEEE802154_STATE_RX || s_ieee802154_state == IEEE802154_STATE_RX_ACK || s_ieee802154_state == IEEE802154_STATE_TX || s_ie
Core  0 register dump:
Stack dump detected
MEPC    : 0x4080043c  RA      : 0x40806b2e  SP      : 0x4080e130  GP      : 0x4080cd80
0x4080043c: panic_abort at /Users/mkabilov/esp/esp-idf/components/esp_system/panic.c:452

0x40806b2e: __ubsan_include at /Users/mkabilov/esp/esp-idf/components/esp_system/ubsan.c:313

TP      : 0x40842578  T0      : 0x54535f34  T1      : 0x65656569  T2      : 0x35313230
S0/FP   : 0x00000000  S1      : 0x0000008f  A0      : 0x4080e16c  A1      : 0x42064923
A2      : 0x73207c7c  A3      : 0x00000065  A4      : 0x00000001  A5      : 0x40814000
A6      : 0x00000020  A7      : 0x31323038  S2      : 0x00000007  S3      : 0x4080e307
S4      : 0x42064894  S5      : 0x00000000  S6      : 0x00000000  S7      : 0x00000000
S8      : 0x00000000  S9      : 0x4084f4fc  S10     : 0x00000000  S11     : 0x00000000
T3      : 0x5f73207c  T4      : 0x7c204b43  T5      : 0x415f5852  T6      : 0x5f455441
MSTATUS : 0x00001881  MTVEC   : 0x40800001  MCAUSE  : 0x00000007  MTVAL   : 0x00000000
0x40800001: _vector_table at ??:?

MHARTID : 0x00000000

Backtrace:

/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/install_name_tool: warning: changes being made to the file will invalidate the code signature in: /Users/mkabilov/.espressif/tools/riscv32-esp-elf-gdb/12.1_20221002/riscv32-esp-elf-gdb/bin/riscv32-esp-elf-gdb-3.11.CmtshaK
panic_abort (details=details@entry=0x4080e16c <xIsrStack+1244> "assert failed: ieee802154_isr esp_ieee802154_dev.c:463 (s_ieee802154_state == IEEE802154_STATE_RX || s_ieee802154_state == IEEE802154_STATE_RX_ACK || s_ieee802154_state == IEEE802154_STATE_TX || s_ie") at /Users/mkabilov/esp/esp-idf/components/esp_system/panic.c:452
452     *((volatile int *) 0) = 0; // NOLINT(clang-analyzer-core.NullDereference) should be an invalid operation on targets
#0  panic_abort (details=details@entry=0x4080e16c <xIsrStack+1244> "assert failed: ieee802154_isr esp_ieee802154_dev.c:463 (s_ieee802154_state == IEEE802154_STATE_RX || s_ieee802154_state == IEEE802154_STATE_RX_ACK || s_ieee802154_state == IEEE802154_STATE_TX || s_ie") at /Users/mkabilov/esp/esp-idf/components/esp_system/panic.c:452
#1  0x40806b2e in esp_system_abort (details=details@entry=0x4080e16c <xIsrStack+1244> "assert failed: ieee802154_isr esp_ieee802154_dev.c:463 (s_ieee802154_state == IEEE802154_STATE_RX || s_ieee802154_state == IEEE802154_STATE_RX_ACK || s_ieee802154_state == IEEE802154_STATE_TX || s_ie") at /Users/mkabilov/esp/esp-idf/components/esp_system/port/esp_system_chip.c:90
#2  0x4080b5ca in __assert_func (file=file@entry=0x42064707 "", line=line@entry=463, func=<optimized out>, func@entry=0x42064d24 <__func__.3> "", expr=expr@entry=0x42064894 "") at /Users/mkabilov/esp/esp-idf/components/newlib/assert.c:81
#3  0x4080ae34 in ieee802154_isr (arg=arg@entry=0x0) at /Users/mkabilov/esp/esp-idf/components/ieee802154/driver/esp_ieee802154_dev.c:463
#4  0x4080b6d4 in _global_interrupt_handler (sp=<optimized out>, mcause=<optimized out>) at /Users/mkabilov/esp/esp-idf/components/riscv/interrupt.c:57
#5  0x408001ec in _interrupt_handler ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
ELF file SHA256: d39e17e56

chshu commented 1 year ago

@ildus Erasing the flash does help to remove the previous network info, as we recommend in the example README.

Regarding the assert issue, we will try to reproduce it. Did you encounter it every time, or randomly?

nomis commented 1 year ago

I'm now getting this assert repeatedly on startup. There are several places where s_ieee802154_state is updated without preventing the interrupt handler from running.

I've created #12024 for this.
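The kind of race described above can be illustrated with a small host-side mock. This is only a sketch under stated assumptions, not the driver's code and not the fix proposed in #12024: the two-value state enum, the mock interrupt mask, and every function name here are hypothetical. It shows an interrupt firing between starting the hardware and updating the state variable, so the handler sees a stale state, and how masking the interrupt around the update avoids that.

```c
#include <stdbool.h>

/* Hypothetical, simplified radio states; not the driver's real enum. */
typedef enum { RADIO_IDLE, RADIO_RX } radio_state_t;

static volatile radio_state_t s_state = RADIO_IDLE;
static bool s_irq_masked = false;   /* mock interrupt mask */
static bool s_irq_pending = false;  /* interrupt raised while masked */
static int s_isr_runs = 0;
static int s_isr_bad_state = 0;     /* times the ISR saw an unexpected state */

/* Mock ISR: expects the radio to be in RX when it runs. */
static void radio_isr(void)
{
    s_isr_runs++;
    if (s_state != RADIO_RX) {
        s_isr_bad_state++;          /* the real driver asserts here */
    }
}

/* Mock hardware event: runs the ISR at once unless interrupts are masked. */
void mock_raise_irq(void)
{
    if (s_irq_masked) {
        s_irq_pending = true;
    } else {
        radio_isr();
    }
}

static void enter_critical(void) { s_irq_masked = true; }

static void exit_critical(void)
{
    s_irq_masked = false;
    if (s_irq_pending) {            /* deliver the deferred interrupt */
        s_irq_pending = false;
        radio_isr();
    }
}

/* Unguarded: the interrupt fires before the state is updated. */
void start_rx_unguarded(void)
{
    mock_raise_irq();               /* hardware started, IRQ fires early */
    s_state = RADIO_RX;
}

/* Guarded: the interrupt is deferred until the state is consistent. */
void start_rx_guarded(void)
{
    enter_critical();
    mock_raise_irq();
    s_state = RADIO_RX;
    exit_critical();
}

void mock_reset(void)
{
    s_state = RADIO_IDLE;
    s_irq_masked = false;
    s_irq_pending = false;
    s_isr_runs = 0;
    s_isr_bad_state = 0;
}

int isr_runs(void) { return s_isr_runs; }
int isr_bad_state(void) { return s_isr_bad_state; }
```

After `start_rx_unguarded()` the mock ISR has run once and recorded an unexpected state; after `mock_reset()` and `start_rx_guarded()` it has also run once, but only after the state was updated. On the target, the mask would presumably be a real critical section rather than a flag.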

chshu commented 1 year ago

The original Zigbee join issue has been clarified. Please follow the 802.15.4 assert issue in #12024.