espressif / arduino-esp32

Arduino core for the ESP32
GNU Lesser General Public License v2.1

ESP32 + LAN8720 random reboots (arp_table[i].q == NULL) #6182

Open bongoo1 opened 2 years ago

bongoo1 commented 2 years ago

Board

ESP32 Dev Module + LAN8720

Device Description

ESP32 Dev Module with LAN8720 attached to RMII and TFT display attached to SPI.

Hardware Configuration

no

Version

v1.0.6

IDE Name

platformIO

Operating System

windows10

Flash frequency

default

PSRAM enabled

no

Upload speed

115200

Description

my application uses the LAN8720 to connect to other ethernet devices. WiFi is not used.

my application is a TCP client to 5 devices. the 5 devices are talkers only, i.e. there is only incoming traffic from them. in addition, i run NTP over 1 UDP connection. there is a second UDP connection i use for communication: on it i send outgoing packets generated from the data i get over TCP, and it also (very rarely) carries incoming traffic, as i use it to apply settings to my application. my application also uses ping to check whether all expected devices are available in the subnet.

so my total of ethernet connections is: 5x TCP, 2x UDP, 1x ICMP.

i know that there is a limitation of not more than 8 TCP connections when running ESP32+LAN8720. what is the limitation when using TCP + UDP + ICMP? is it a total of 8 ethernet connections?

when running in the above configuration, i get a SW_CPU_RESET (see below) every few hours.

this looks to me like some kind of resource issue, but i have no idea how to solve it. all my code is within the setup and loop functions, so i would assume that only core1 is used. nevertheless, it looks like the reset is initiated from core0, which seems quite strange to me.

btw: i first had a configuration with 5x TCP, 3x UDP, 1x ICMP, i.e. a total of 9 connections. the result was the same crash (exactly the same crash message), but already after about 10 minutes. so if the limit is a total of 8 ethernet connections (i don't know if this is true), then this was probably the first time all 9 connections were active at the same time.

as my current setup can never use more than 8 ethernet connections (as far as i understand), i would not expect it to ever have more than 8 active connections. but the error message is the same.

Sketch

my setup is:

[env:esp32dev]
platform = espressif32
board = esp32dev
framework = arduino
monitor_speed = 115200
lib_deps = 
    bodmer/TFT_eSPI@^2.3.73
    sstaub/NTP@^1.4
    marian-craciunescu/ESP32Ping@^1.7

instantiated:

#include <Arduino.h>
#include <WiFi.h>
#include <TFT_eSPI.h> // Hardware-specific library
#include <SPI.h>
#include "Free_Fonts.h" // Include the header file attached to this sketch
#include <WiFiUdp.h> 
#include <ETH.h>
#include "NTP.h"
#include <ESP32Ping.h>
#include "mbsConfStorage.h"
#include <rom/rtc.h>

Debug Message

assertion "arp_table[i].q == NULL" failed: file "/home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/lwip/lwip/src/core/ipv4/etharp.c", line 383, function: etharp_find_entry
abort() was called at PC 0x400f7c17 on core 0

ELF file SHA256: 0000000000000000

Backtrace: 0x40088808:0x3ffb3cf0 0x40088a85:0x3ffb3d10 0x400f7c17:0x3ffb3d30 0x4012fef9:0x3ffb3d60 0x401304ed:0x3ffb3d90 0x40130811:0x3ffb3db0 0x40121b56:0x3ffb3de0 0x40121b99:0x3ffb3e20 0x40121bc6:0x3ffb3e50 0x4012dbad:0x3ffb3e80 0x4012dc57:0x3ffb3eb0 0x40129516:0x3ffb3ee0 0x4012956f:0x3ffb3f00 0x4012a033:0x3ffb3f20 0x40129f21:0x3ffb3f40 0x4012a0d4:0x3ffb3f60 0x40126ad8:0x3ffb3f80 0x40089a96:0x3ffb3fb0

Rebooting...
ets Jun 8 2016 00:22:57

rst:0xc (SW_CPU_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0018,len:4
load:0x3fff001c,len:1044
load:0x40078000,len:10124
load:0x40080400,len:5828
entry 0x400806a8

Other Steps to Reproduce

with a maximum total of 8 ethernet connections, the crash occurs about 1-2 times a day, always immediately after calling the ping function (as this is probably the 8th ethernet connection. at least i assume that ping always opens and closes the connection, while the TCP and UDP connections are once opened and are then kept open until the crash happens). so there are always 7 connections with the ping adding/removing the 8th.

with a maximum total of 9 ethernet connections, the crash may happen while program execution is anywhere in my code. this then happens about 6 times per hour.

I have checked existing issues, online documentation and the Troubleshooting Guide

me-no-dev commented 2 years ago

could you please try with 2.0.2. 1.0.6 is quite old now :)

bongoo1 commented 2 years ago

hm. maybe i misunderstand something: when looking at the available updates for platformio, i see that i have espressif 32 framework 3.3.2 while there is a 3.4 available. but according to the release notes, arduino framework 1.0.6 was introduced with espressif 32 framework 3.2. but as far as i see in the release notes, also the newest 3.4 still uses 1.0.6. so how should i switch to 2.0.2 without running into new troubles? i also can't see how to configure something like that per project.

me-no-dev commented 2 years ago

Check this out: https://github.com/espressif/arduino-esp32/pull/5540/files#diff-63fe35f38f92e121494311c5010a4244d71d596a084875160cce6cd7f540fef5R82-R113 You will be running the latest code, not the release, but this is as much as we can do until PIO guys add support.

bongoo1 commented 2 years ago

ok, i tried to do so. on compilation, it now looks like (see below).

don't know if this is correct.

but what i know is, that my application does not work anymore at all. after downloading the new code to the esp32, i do not get an ethernet connection anymore on the lan8720.

from the network, i can still ping the lan8720 and get an answer, but from the esp32, i just get the information that ethernet cannot connect.

is there anything else i need to change?

Processing esp32dev (platform: https://github.com/platformio/platform-espressif32.git#feature/arduino-upstream; board: esp32dev; framework: arduino)
--------------------------------------------------------------------------------
Verbose mode can be enabled via -v, --verbose option
CONFIGURATION: https://docs.platformio.org/page/boards/espressif32/esp32dev.html
PLATFORM: Espressif 32 (3.3.1+sha.3784198) > Espressif ESP32 Dev Module
HARDWARE: ESP32 240MHz, 320KB RAM, 4MB Flash
DEBUG: Current (esp-prog) External (esp-prog, iot-bus-jtag, jlink, minimodule, olimex-arm-usb-ocd, olimex-arm-usb-ocd-h, olimex-arm-usb-tiny-h, olimex-jtag-tiny, tumpa)
PACKAGES:

bongoo1 commented 2 years ago

with the platformio.ini modified to

[env:esp32dev]
platform = https://github.com/platformio/platform-espressif32.git#feature/arduino-upstream
board = esp32dev
framework = arduino
platform_packages =
    framework-arduinoespressif32 @ https://github.com/espressif/arduino-esp32#master
monitor_speed = 115200
lib_deps =
    bodmer/TFT_eSPI@^2.3.73
    sstaub/NTP@^1.4
    marian-craciunescu/ESP32Ping@^1.7

the ethernet (with lan8720) does not connect anymore. what do i need to change to fix this, or how can i go back to the original platformio version i had before?

bongoo1 commented 2 years ago

ETH CONNECTION: looks like i found the issue with the non-working eth connection after updating to 2.0.2. the names of the events in the connection state machine void WiFiEvent(WiFiEvent_t event) have changed. by replacing the prefix "SYSTEM_EVENT_ETH_" with "ARDUINO_EVENT_ETH_", the LAN8720 connects again.
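As a sketch of the rename described above, the handler below uses the 2.0.x event names. It assumes the ARDUINO_EVENT_ETH_* constants provided by the 2.0.x core headers. The eth_connected flag and the host-side stub enum are hypothetical, added only so the snippet can also be compiled outside the Arduino toolchain.

```cpp
#ifdef ARDUINO
#include <ETH.h>  // arduino-esp32 2.0.x: pulls in WiFiEvent_t and ARDUINO_EVENT_ETH_*
#else
// Hypothetical host-side stand-ins so the handler compiles off-device.
// On the ESP32 the real definitions come from the core headers.
enum arduino_event_id_t {
  ARDUINO_EVENT_ETH_START,
  ARDUINO_EVENT_ETH_STOP,
  ARDUINO_EVENT_ETH_CONNECTED,
  ARDUINO_EVENT_ETH_DISCONNECTED,
  ARDUINO_EVENT_ETH_GOT_IP
};
typedef arduino_event_id_t WiFiEvent_t;
#endif

static bool eth_connected = false;  // hypothetical flag, e.g. polled from loop()

// With 1.0.6 these cases were named SYSTEM_EVENT_ETH_*.
void WiFiEvent(WiFiEvent_t event) {
  switch (event) {
    case ARDUINO_EVENT_ETH_START:         // was SYSTEM_EVENT_ETH_START
      break;
    case ARDUINO_EVENT_ETH_CONNECTED:     // link is up, no IP address yet
      break;
    case ARDUINO_EVENT_ETH_GOT_IP:        // usable from here on
      eth_connected = true;
      break;
    case ARDUINO_EVENT_ETH_DISCONNECTED:
    case ARDUINO_EVENT_ETH_STOP:
      eth_connected = false;
      break;
    default:                              // WiFi and other events are ignored
      break;
  }
}
```

On the device, the handler would typically be registered with WiFi.onEvent(WiFiEvent) before calling ETH.begin().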

STABILITY: the eth connection seems to be much more stable than it has been with 1.0.6. with the code version that rebooted about every 10 minutes with 1.0.6, the application is running now for almost a day without a reboot.

QUESTION: to update to 2.0.2, i had to replace
platform = espressif32
with
platform = https://github.com/platformio/platform-espressif32.git#feature/arduino-upstream
i assume that the former setting just used the preinstalled version, while the new setting takes the latest version from the internet each time i compile. right? and what does the feature/arduino-upstream added to the url mean?

the second change was adding
platform_packages = framework-arduinoespressif32 @ https://github.com/espressif/arduino-esp32#master
i had no platform_packages set before, so i assume that it just took the one installed with platformio, while it now gets the latest version from the internet? and what does the master mean?

as far as i understand, the whole framework and the libraries used are now quite dynamic, i.e. they may change from compilation to compilation whenever changes are made to the framework and libraries in the git repositories. so if i would like to freeze the settings to the current framework / libraries, to avoid any impact from updates if i have to recompile in a few months, how can i change the settings so that the same versions are used as today, i.e. a tagged version?
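For illustration: in platformio.ini, the part after # in a Git URL names a branch, tag, or commit. Both feature/arduino-upstream and master are branches, which move as new commits land, while a tag stays fixed. The fragment below is a sketch of pinning the Arduino core to a tag instead of master. The tag 2.0.2 is the release suggested earlier in this thread; whether this exact combination builds cleanly is an assumption that the thread does not confirm.

```ini
[env:esp32dev]
; still a branch of the PlatformIO platform repository, as used in the thread
platform = https://github.com/platformio/platform-espressif32.git#feature/arduino-upstream
board = esp32dev
framework = arduino
; assumption: pin the Arduino core to the 2.0.2 tag instead of the moving master branch
platform_packages =
    framework-arduinoespressif32 @ https://github.com/espressif/arduino-esp32.git#2.0.2
monitor_speed = 115200
```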

me-no-dev commented 2 years ago

you can not do that currently with PlatformIO as far as I am aware. They need to provide proper support for 2.0.x first. You might be able to go into your framework-arduinoespressif32 folder and use git to select the tag

Jason2866 commented 2 years ago

@bongoo1 For these changes to work, you have to delete the hidden folder .platformio.

khoih-prog commented 2 years ago

You can try my wrapper library WebServer_WT32_ETH01, which takes care of the breaking changes in core v2.0.0+.

ETH CONNECTION: looks like i found the issue with the non-working eth connection after updating to 2.0.2. the names of the events in the connection state machine void WiFiEvent(WiFiEvent_t event) have changed. by replacing the prefix "SYSTEM_EVENT_ETH_" with "ARDUINO_EVENT_ETH_", the LAN8720 connects again.

Check Important Notes

VojtechBartoska commented 2 years ago

Hello, can you please retest this on v2.0.3-rc1?

bongoo1 commented 2 years ago

@VojtechBartoska , what do you mean with "this"?

in the meantime, i played around with the platform and platform_packages settings in the platformio.ini file, still not knowing which settings i really should use. whatever i do there, i always get hundreds of warnings, which i have ignored so far as i do not know if they have any relevance. at least, when compiling, i can see that it seems to use 2.0.3 now, but i don't know if this is rc1. is there a recommendation for how the platform and platform_packages settings should look in order to use a current and stable version?

VojtechBartoska commented 2 years ago

@bongoo1 Take a look here please: https://github.com/espressif/arduino-esp32/issues/6044#issuecomment-1094430541

by this I meant if you can retest your sketch which ends with an issue on v2.0.3-rc1. Probably you can do so after you solve PlatformIO workaround.

bongoo1 commented 2 years ago

hi @VojtechBartoska i configured the platformio.ini according to https://github.com/espressif/arduino-esp32/issues/6044#issuecomment-1094430541

unfortunately, it cannot build the project with those settings. it results in

c:/users/tinu/.platformio/packages/toolchain-xtensa32/bin/../lib/gcc/xtensa-esp32-elf/8.4.0/../../../../xtensa-esp32-elf/bin/ld.exe: final link failed: bad value
collect2.exe: error: ld returned 1 exit status
*** [.pio\build\esp32dev\firmware.elf] Error 1

what's going wrong?

Jason2866 commented 2 years ago

@bongoo1 Delete the hidden folder .platformio. You have old stuff in there which causes trouble. The easiest way to use 2.0.3-rc1 with PlatformIO is my fork. Change the platform entry, and only the platform entry.

platform = https://github.com/Jason2866/platform-espressif32/releases/download/v2.0.3-rc1/platform-espressif32-2.0.3-rc1.zip

bongoo1 commented 2 years ago

@Jason2866 , where can i find the .platformio folder? i just see a folder .pio

bongoo1 commented 2 years ago

@Jason2866 , when pointing to the zip you mentioned, only this warning remains:

"C:\Users\tinu.platformio\packages\framework-arduinoespressif32\tools\sdk\esp32\include\config" was not found.

so this looks much better now, thanx!

so does this still need this setting below?

platform_packages =
    framework-arduinoespressif32 @ https://github.com/espressif/arduino-esp32.git#master
    toolchain-xtensa32@~2.80400.0

or do i need to change something?

and is this zip expected to be persistent, or will it be removed once a newer version is available? i'm just wondering if i will need to change some settings when recompiling in a few months.

Jason2866 commented 2 years ago

@bongoo1 Your setup is bad. Just what i wrote. NOTHING!!! in platform_packages

In a few months there will surely be an official release from PlatformIO, so workarounds will no longer be needed. NEVER put special characters such as . in paths (tinu.platformio). Weird things can happen. You can use an underscore _ instead.
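Combined with the project file from the first post, that advice would give roughly the following platformio.ini. This is a sketch assembled from settings quoted in this thread, not a file posted by either participant.

```ini
[env:esp32dev]
; only this line differs from the original project file
platform = https://github.com/Jason2866/platform-espressif32/releases/download/v2.0.3-rc1/platform-espressif32-2.0.3-rc1.zip
board = esp32dev
framework = arduino
monitor_speed = 115200
; deliberately no platform_packages entry
lib_deps =
    bodmer/TFT_eSPI@^2.3.73
    sstaub/NTP@^1.4
    marian-craciunescu/ESP32Ping@^1.7
```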

bongoo1 commented 2 years ago

@Jason2866 , looks like i just was not patient enough. in the meantime it also compiles when i remove the platform_packages setting. btw i never use a dot in a folder name. looks like copy paste removed the backslash before the dot. thanx for helping!

VojtechBartoska commented 2 years ago

@bongoo1 can I consider this as solved? PlatformIO now also officially supports Arduino ESP32 core v2.0.0.

https://piolabs.com/blog/news/platformio-oss-april-2022-updates.html

bongoo1 commented 2 years ago

using the workaround to get the 2.0.0-based core has helped in getting rid of the frequent random reboots so far. what i see instead is that after about 10 to 14 days, the network connection is lost. disconnecting/reconnecting the network cable usually seems to solve the issue, but only for a few hours, after which it pops up again. so a power cycle is required to fully recover. i assume that there is some kind of memory/resource leak either within the esp or the phy.

as there is now an official release for 2.0.0, i reconfigured the first of my designs to use it, and also added some functionality that is expected to do an automatic reboot once a week. hope that this will help to improve reliability.
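A weekly reboot like the one mentioned above could be sketched as follows. This is not the code used in the application. The helper name rebootDue and the constant kRebootIntervalMs are hypothetical, and the sketch assumes the 32-bit millisecond counter returned by millis(), which wraps after roughly 49.7 days.

```cpp
#include <cstdint>

// One week in milliseconds (604,800,000), which still fits in 32 bits.
constexpr uint32_t kRebootIntervalMs = 7UL * 24UL * 60UL * 60UL * 1000UL;

// Hypothetical helper: true once interval_ms has elapsed since start_ms.
// Unsigned subtraction keeps the result correct across a counter wrap.
bool rebootDue(uint32_t now_ms, uint32_t start_ms,
               uint32_t interval_ms = kRebootIntervalMs) {
  return static_cast<uint32_t>(now_ms - start_ms) >= interval_ms;
}

// Possible use inside loop() on the device (start_ms captured in setup()):
//   if (rebootDue(millis(), start_ms)) { ESP.restart(); }
```

Comparing the elapsed time rather than an absolute deadline avoids a missed or premature reboot when the counter wraps.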

VojtechBartoska commented 2 years ago

@bongoo1 Can I consider this covered?

bongoo1 commented 2 years ago

@VojtechBartoska unfortunately, switching to the official release did not seem to solve the last issue i mentioned. while there are no more random reboots, the lan access seems to become unstable after a few days. tcp connections get lost randomly, and reconnecting does not always work. i suspect that when a tcp connection breaks, not all resources are freed (either on the lan8720 or among the esp32 resources used to handle tcp). so after a few reconnections, i run out of resources and reconnection fails. then all i can do is a power cycle.

VojtechBartoska commented 2 years ago

we will investigate this, adding this to our issue Roadmap