espressif / ESP8266_RTOS_SDK

Latest ESP8266 SDK based on FreeRTOS, esp-idf style.
http://bbs.espressif.com
Apache License 2.0

Watchdog is triggered unexpectedly; afterwards the system hangs instead of rebooting. (GIT8266O-72) #538

Closed huozhe0930 closed 5 years ago

huozhe0930 commented 5 years ago

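With the latest RTOS_SDK release (V3.2), I start only two tasks:

  1. i2c: reads five I2C devices in a loop, then delays 1 s
  2. sntp: fetches the network time in a loop, then delays 1 s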

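The Wi-Fi settings are as follows:

    ESP_ERROR_CHECK( esp_wifi_set_mode(WIFI_MODE_STA) );
    ESP_ERROR_CHECK( esp_wifi_set_ps(WIFI_PS_MAX_MODEM) );
    esp_wifi_set_max_tx_power(8);

I have two questions:

  1. Up to the failure everything runs normally. With a 1 s delay in each task, the idle task has plenty of time to feed the watchdog, so why does the system hang?
  2. A watchdog is supposed to restart the system after a fault. Why does the system hang instead of restarting?

After the stack dump is printed, nothing more arrives on the serial port and the serial activity LED stays off, so I conclude that the system has hung. The watchdog triggered but did not restart the system.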
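
The reasoning in question 1 (a task that sleeps for 1 s leaves the idle task time to feed the watchdog) can be illustrated with a small host-side model. This is only a sketch of the general idea, assuming a software watchdog that is fed from the idle task and checked against a tick counter; the type `soft_wdt_t` and all function names are hypothetical and are not the SDK's implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical host-side model of a software watchdog; not SDK code. */
typedef struct {
    uint32_t timeout_ticks;   /* maximum allowed gap between two feeds */
    uint32_t last_feed_tick;  /* tick count at the most recent feed */
} soft_wdt_t;

static void soft_wdt_init(soft_wdt_t *wdt, uint32_t timeout_ticks, uint32_t now)
{
    wdt->timeout_ticks = timeout_ticks;
    wdt->last_feed_tick = now;
}

/* In this model the idle task calls this whenever it gets to run. */
static void soft_wdt_feed(soft_wdt_t *wdt, uint32_t now)
{
    wdt->last_feed_tick = now;
}

/* Unsigned subtraction keeps the comparison correct across tick wraparound. */
static bool soft_wdt_expired(const soft_wdt_t *wdt, uint32_t now)
{
    return (uint32_t)(now - wdt->last_feed_tick) >= wdt->timeout_ticks;
}
```

In this model the watchdog can only expire if the idle task is kept from running for the whole timeout, for example because another task or an interrupt handler never yields.
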

The serial output is as follows:

    I (407777) sntp: The current date/time in Shanghai is: Mon Apr 1 00:14:29 2019
    I (407787) sntp: Free heap size: 71012
    I (408767) i2c_task: ======measurement result======
    I (408767) i2c_task: port:[0] type:[TMP112] temp[249]
    I (408777) i2c_task: port:[0] type:[TMP112] temp[249]
    I (408777) i2c_task: port:[0] type:[TMP112] temp[249]
    I (408787) i2c_task: port:[0] type:[TMP112] temp[251]
    I (408797) i2c_task: port:[0] type:[HDC1080] temp[258], humi[690]
    I (408807) i2c_task: start time:[73443567]->[73444035] measure time:[73449051]->[73449664]
    I (408827) sntp: The current date/time in Shanghai is: Mon Apr 1 00:14:30 2019
    I (408837) sntp: Free heap size: 71012
    I (409817) i2c_task: ======measurement result======
    I (409817) i2c_task: port:[0] type:[TMP112] temp[249]
    I (409827) i2c_task: port:[0] type:[TMP112] temp[249]
    I (409827) i2c_task: port:[0] type:[TMP112] temp[249]
    I (409837) i2c_task: port:[0] type:[TMP112] temp[251]
    I (409847) i2c_task: port:[0] type:[HDC1080] temp[258], humi[690]
    I (409857) i2c_task: start time:[73624470]->[73624940] measure time:[73630442]->[73631070]
    I (409877) sntp: The current date/time in Shanghai is: Mon Apr 1 00:14:31 2019
    I (409887) sntp: Free heap size: 71012

Task watchdog got triggered.

Task stack [IDLE] stack from [0x3ffef6f0] to [0x3ffef9ec], total [768] size

                   0          4          8          c         10         14         18         1c         20         24         28         2c         30         34         38         3c
    3ffef980 0x40254f0b 0x00017c09 0x3ffe9cd4 0x40254ecd 0x00000142 0x000041d7 0x00000001 0x4024a4bc 0x000000e5 0x0000001e 0x000001e0 0x00000000 0x00000000 0x00000000 0x00000000 0x4025549c
    3ffef9c0 0x000000e5 0x0000001e 0x000001e0 0x40226f7c 0x00000000 0x00000000 0x00000000 0x40227e69 0x00000000 0x00000000 0x00000000 0x00000000

        PC: 0x401051c1        PS: 0x00000030        A0: 0x40254f0b        A1: 0x3ffef980
        A2: 0x00493b57        A3: 0x3ffe9bcc        A4: 0x00000000        A5: 0x00000009
        A6: 0x00001eb0        A7: 0x00000023        A8: 0x4024a4bc        A9: 0x000000e5
       A10: 0x0000001e       A11: 0x000001e0       A12: 0x3ffe9cd4       A13: 0x000c3257
       A14: 0x00017c09       A15: 0x3ffe9bd4       SAR: 0x00000018  EXCCAUSE: 0x40254f0b

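
The dump above gives the bounds of the IDLE stack and the value of A1, so the stack depth at the moment of the dump can be worked out by hand. The sketch below does that arithmetic, assuming that the stack grows toward lower addresses and that A1 is the stack pointer (the usual Xtensa convention). The type and function names are hypothetical.

```c
#include <stdint.h>

/* Stack usage at the moment of a dump, assuming a stack that grows downward
 * and a stack pointer that lies inside [bottom, top]. */
typedef struct {
    uint32_t bytes_in_use;  /* from the stack pointer up to the top of the stack */
    uint32_t bytes_free;    /* from the bottom of the stack up to the stack pointer */
} stack_usage_t;

static stack_usage_t stack_usage(uint32_t bottom, uint32_t top, uint32_t sp)
{
    stack_usage_t usage;
    usage.bytes_in_use = top - sp;
    usage.bytes_free = sp - bottom;
    return usage;
}
```

With the values from the dump, `stack_usage(0x3ffef6f0, 0x3ffef9ec, 0x3ffef980)` gives 108 bytes in use and 656 bytes free. This is only the depth at the instant the watchdog fired; it says nothing about the peak depth the idle task reached earlier.
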
huozhe0930 commented 5 years ago

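`ESP_ERROR_CHECK( esp_wifi_set_ps(WIFI_PS_MAX_MODEM) ); esp_wifi_set_max_tx_power(8);` These two lines were added today. Before I added them, the longest continuous run was more than 12 hours and the watchdog never triggered. When the watchdog is triggered, why doesn't the system restart?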

huozhe0930 commented 5 years ago

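It ran all night. When I turned my computer on just now, I found that it had hung again at some point. To restate my questions:

  1. Why is the watchdog triggered?
  2. Once the watchdog is triggered, why is there no restart?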

huozhe0930 commented 5 years ago

It hung again. The problem is reproducible; this time it hung after about 10 minutes of running.

    I (4360917) sntp: The current date/time in Shanghai is: Mon Apr 1 10:58:27 2019
    I (436092?) sntp: Free heap size: 71012
    I (4361917) i2c_task: ======measurement result======
    I (4361917) i2c_task: port:[0] type:[TMP112] temp[233]
    I (4361927) i2c_task: port:[0] type:[TMP112] temp[233]
    I (4361927) i2c_task: port:[0] type:[TMP112] temp[233]
    I (4361937) i2c_task: port:[0] type:[TMP112] temp[234]
    I (4361947) i2c_task: port:[0] type:[HDC1080] temp[242], humi[680]
    I (4361957) i2c_task: start time:[791410722]->[791411196] measure time:[791416268]->[791416888]
    I (4361977) sntp: The current date/time in Shanghai is: Mon Apr 1 10:58:28 2019
    I (4361987) sntp: Free heap size: 71012

Task watchdog got triggered.

Task stack [IDLE] stack from [0x3ffef6f0] to [0x3ffef9ec], total [768] size

                   0          4          8          c         10         14         18         1c         20         24         28         2c         30         34         38         3c
    3ffef980 0x40254f0b 0x00017c3a 0x3ffe9cd4 0x40254ecd 0x00000143 0x000042a7 0x00000001 0x00000000 0x000c3500 0x0000001e 0x000001e0 0x00000000 0x00000000 0x00000000 0x00000000 0x4025549c
    3ffef9c0 0x00000000 0x00000000 0x00000000 0x40226f7c 0x00000000 0x00000000 0x00000000 0x40227e69 0x00000000 0x00000000 0x00000000 0x00000000

        PC: 0x401051c1        PS: 0x00000030        A0: 0x40254f0b        A1: 0x3ffef980
        A2: 0x00492a28        A3: 0x3ffe9bcc        A4: 0x00000000        A5: 0x00000009
        A6: 0x000081f2        A7: 0x00000023        A8: 0x00000000        A9: 0x000c3500
       A10: 0x0000001e       A11: 0x000001e0       A12: 0x3ffe9cd4       A13: 0x000c2128
       A14: 0x00017c3a       A15: 0x3ffe9bd4       SAR: 0x00000018  EXCCAUSE: 0x40254f0b

huozhe0930 commented 5 years ago

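The Wi-Fi connection stays up the whole time. I log every disconnect event, and no disconnect message ever appears.

    case SYSTEM_EVENT_STA_DISCONNECTED:
        ESP_LOGI(TAG, "!!connect is lost!\n");
        if (reconnectCounter++ > 50) {
            /* Too many failed reconnects: restart the Wi-Fi driver. */
            esp_wifi_stop();
            reconnectCounter = 0;
            esp_wifi_start();
            break;  /* note: this path skips the xEventGroupClearBits() call below */
        } else {
            esp_wifi_connect();
        }
        xEventGroupClearBits(wifi_event_group, CONNECTED_BIT);
        break;
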
huozhe0930 commented 5 years ago

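The problem reproduces 100% of the time, although the time it takes varies. I checked the documentation: this watchdog is not a hardware watchdog. But I never use this watchdog at all; is it started automatically? And isn't the purpose of a watchdog to restart the system? Can anyone who has run into something similar help explain this?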

huozhe0930 commented 5 years ago

This time the output printed at the hang is different:

Core 0 was running in ISR context:

                   0          4          8          c         10         14         18         1c         20         24         28         2c         30         34         38         3c
    3ffe84a4 0x00000001 0x0000000b 0x401002cc 0x40100490 0x00000003 0x4025bd34 0x401004a4 0x00000004 0x00000003 0x4025bd34 0x000c3500 0x3ffe9bf4 0x000c3392 0xffffffc0 0x40105234 0x3ffefa40

        PC: 0x00000000        PS: 0x00000033        A0: 0x401004a4        A1: 0x3ffe84c0
        A2: 0x00000000        A3: 0x00000000        A4: 0xffc2f96e        A5: 0x001212d7
        A6: 0x00000000        A7: 0x00000023        A8: 0x4024a9c4        A9: 0x000000eb
       A10: 0x0000001e       A11: 0x000001e0       A12: 0x3ffee764       A13: 0x00000100
       A14: 0x0000ffc0       A15: 0x00000001       SAR: 0x00000018  EXCCAUSE: 0x00000014

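
The register dump above reports EXCCAUSE 0x00000014 together with PC 0x00000000. On Xtensa cores, cause 20 (0x14) is InstFetchProhibited, and combined with a zero PC it is consistent with a jump through a null or corrupted code pointer. That reading is an interpretation, not something confirmed in this thread. The sketch below is a small lookup helper covering only a subset of the cause codes; the function name is hypothetical.

```c
#include <stdint.h>

/* Map a subset of Xtensa EXCCAUSE codes to their names.
 * Codes not listed here are reported as "unknown". */
static const char *exccause_name(uint32_t exccause)
{
    switch (exccause) {
    case 0:  return "IllegalInstruction";
    case 2:  return "InstructionFetchError";
    case 3:  return "LoadStoreError";
    case 6:  return "IntegerDivideByZero";
    case 9:  return "LoadStoreAlignment";
    case 20: return "InstFetchProhibited";
    case 28: return "LoadProhibited";
    case 29: return "StoreProhibited";
    default: return "unknown";
    }
}
```

In the earlier watchdog dumps the EXCCAUSE field holds 0x40254f0b, the same value as A0, which is not a cause code; this helper returns "unknown" for it.
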
donghengqaz commented 5 years ago

Try changing configMINIMAL_STACK_SIZE to 1280.
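
In stock FreeRTOS, `configMINIMAL_STACK_SIZE` is the stack size given to the idle task, which is why it matters for a dump of the [IDLE] stack. A minimal sketch of the suggested change follows; the header it belongs in for this SDK version and the cast are assumptions based on stock FreeRTOS, so check them against the SDK sources.

```c
/* Assumed location: the FreeRTOS configuration header
 * (FreeRTOSConfig.h in stock FreeRTOS). The cast is also an assumption. */
#define configMINIMAL_STACK_SIZE ((unsigned short)1280)
```
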

huozhe0930 commented 5 years ago

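I can try that. However, I do not allocate or free any memory dynamically. I also have very few variables: just one array of less than 500 bytes, and the rest are a handful of scattered variables.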

huozhe0930 commented 5 years ago

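This 500-byte array has file scope (a global variable declared with the static qualifier). My understanding is that it should be allocated on the heap rather than on a task's stack. Is that right?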
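
In standard C, a file-scope array, with or without `static`, has static storage duration. Toolchains typically place it in `.bss` (or `.data` if it has a non-zero initializer), so it takes space from neither the heap nor any task's stack, although on a small device it still reduces the RAM left over for the heap. The sketch below is a host-side illustration; `sample_buffer` and the helper functions are hypothetical stand-ins for the array described above.

```c
#include <stddef.h>
#include <stdint.h>

/* File-scope object with static storage duration: zero-initialized before
 * main() runs and alive for the whole program, independent of any call stack. */
static uint8_t sample_buffer[500];

static size_t sample_buffer_size(void)
{
    return sizeof sample_buffer;
}

/* Writes are ignored when the index is out of range. */
static void sample_store(size_t index, uint8_t value)
{
    if (index < sizeof sample_buffer) {
        sample_buffer[index] = value;
    }
}

/* Out-of-range reads return 0. */
static uint8_t sample_load(size_t index)
{
    return index < sizeof sample_buffer ? sample_buffer[index] : 0;
}
```

Because the buffer is not on any stack, values written by one function call remain visible to later calls, and the array does not count toward a task's stack budget.
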

huozhe0930 commented 5 years ago

Changing configMINIMAL_STACK_SIZE to 1280 has no effect either; it still hangs.

    I (890691) sntp: The current date/time in Shanghai is: Thu Apr 4 00:22:47 2019
    I (890701) sntp: Free heap size: 60040
    I (892691) i2c_task: ======measurement result======
    I (892691) i2c_task: port:[0] type:[TMP112] temp[252]
    I (892701) i2c_task: port:[0] type:[TMP112] temp[252]
    I (892701) i2c_task: port:[0] type:[TMP112] temp[252]
    I (892711) i2c_task: port:[0] type:[TMP112] temp[253]
    I (892721) i2c_task: port:[0] type:[HDC1080] temp[261], humi[690]
    I (892731) i2c_task: start time:[159911564]->[159912047] measure time:[159917073]->[159917746]
    I (892751) sntp: The current date/time in Shanghai is: Thu Apr 4 00:22:49 2019
    I (892761) sntp: Free heap size: 60040

Core 0 was running in ISR context:

                   0          4          8          c         10         14         18         1c         20         24         28         2c         30         34         38         3c
    3ffe84a4 0x00000001 0x0000000a 0x401002cc 0x40100490 0x00000003 0x4025e5a8 0x401004a4 0x00000004 0x00000003 0x4025e5a8 0x000c3500 0x3ffe9bfc 0x000c311c 0xffffffc0 0x40105234 0x3ffefda0

        PC: 0x00000000        PS: 0x00000033        A0: 0x401004a4        A1: 0x3ffe84c0
        A2: 0x00000000        A3: 0x00000000        A4: 0xffc2fbe4        A5: 0x00015cdd
        A6: 0x00000000        A7: 0x00000023        A8: 0x4024d1f4        A9: 0x000000ea
       A10: 0x0000001e       A11: 0x000001e0       A12: 0x3ffee798       A13: 0x00000100
       A14: 0x0000ffc0       A15: 0x00000001       SAR: 0x00000018  EXCCAUSE: 0x00000014

donghengqaz commented 5 years ago

Can you provide the .elf file and the corresponding log?

donghengqaz commented 5 years ago

The idle task stack overflow issue has been fixed in the latest update.