esp8266 / Arduino

ESP8266 core for Arduino
GNU Lesser General Public License v2.1

Implement Esp.getFreeSysStack #5148

Open toomasz opened 6 years ago

toomasz commented 6 years ago

This issue was originally titled "Stack usage in WiFi.onStationModeGotIP too high, probably overflow".

The original report follows. It reflects my incorrect understanding of ESP internals at the time.

Problem Description

I was experiencing random crashes in my project, which uses ESPAsyncTCP and async-mqtt-client, so I decided to check stack usage in the MQTT library's event handlers, especially the publish-received handler. I found that the free stack size was below zero, with results like current free stack = -4000.

I also checked stack usage in the onStationModeGotIP event handler and saw that the free stack is below zero there too.

I'm not sure this method of measuring free stack is reliable. In setup() it shows around 4 KB of free stack, which seems correct.

Is this a bug? Does a 'negative' free stack mean that the stack overlaps the heap region?

MCVE Sketch

#include <ESP8266WiFi.h>
WiFiEventHandler wifiConnectHandler;

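extern "C"
{
#include <cont.h>
    extern cont_t* g_pcont;   // context that holds the Arduino (cont) stack

    // Prints the distance in bytes between the stack pointer and the base of
    // the cont stack.
    void DebugFreeStack()
    {
        register uint32_t *sp asm("a1");            // a1 is the Xtensa stack pointer
        int freestack = 4 * (sp - g_pcont->stack);  // difference in 32-bit words, times 4 for bytes
        Serial.printf("current free stack = %d\n", freestack);
    }
}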

void connectToWifi() 
{
    Serial.println("Connecting to Wi-Fi...");
    WiFi.begin("xxx", "xxx");
}

void onWifiConnect(const WiFiEventStationModeGotIP& event) 
{
    Serial.println("Connected to Wi-Fi.");
    DebugFreeStack();
}

void setup()
{
    Serial.begin(115200);
    delay(2000);
    Serial.println();
    Serial.println();

    Serial.printf("version: %s\n", ESP.getFullVersion().c_str());

    wifiConnectHandler = WiFi.onStationModeGotIP(onWifiConnect);
    Serial.println("Setup");
    DebugFreeStack();

    connectToWifi();
}

void loop() 
{
}

Debug Messages


version: SDK:2.2.1(cfd48f3)/Core:win-2.5.0-dev/lwIP:2.0.3(STABLE-2_0_3_RELEASE/glue:arduino-2.4.1-13-g163bb82)/BearSSL:6d1cefc
Setup
current free stack = 4016
Connecting to Wi-Fi...
Connected to Wi-Fi.
current free stack = -240

earlephilhower commented 6 years ago

There are two stacks in play: the OS one (sys) and the Arduino core one (cont). Callbacks from the OS run on the OS stack, so "getStackFree()" is not valid there and will report nonsense values when called from a callback.

Since this is a callback, I think you're getting an invalid result, and it's not indicative of any problem here.
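
To see why the reading turns negative, the arithmetic from DebugFreeStack() can be replayed on plain integers. This is only a host-side sketch: the addresses are made up for illustration (real values depend on the build), and contFreeBytes is a hypothetical name. It assumes that, inside the callback, the stack pointer sits at a lower address than the base of the cont stack, which is what a negative result implies.

```cpp
#include <cstdint>

// Same arithmetic as DebugFreeStack(), expressed in bytes on plain integers:
// stack pointer address minus the lowest address of the cont stack.
constexpr std::int64_t contFreeBytes(std::uint32_t sp, std::uint32_t contStackBase) {
    return static_cast<std::int64_t>(sp) - static_cast<std::int64_t>(contStackBase);
}

// Hypothetical addresses, chosen only to reproduce the numbers in the log.
constexpr std::uint32_t kContStackBase = 0x3FFFEF00u;             // assumed base of the cont stack
constexpr std::uint32_t kSpInSetup     = kContStackBase + 4016u;  // sp inside the cont stack
constexpr std::uint32_t kSpInCallback  = kContStackBase - 240u;   // sp on the sys stack, below that base
```

With these sample values the function yields 4016 for the setup() case and -240 for the callback case. The negative number only says that the stack pointer is outside the cont stack, not that anything overflowed.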

toomasz commented 6 years ago

I didn't know there were two stacks. So most of my code is probably executing on the OS stack, since it's implemented in callbacks. What is the size limit for the OS stack? I know the Arduino core stack is 4 KB. Also, is there any way to get the current free space on the OS stack?

earlephilhower commented 6 years ago

The way stacks are done now, the Arduino one is actually allocated from the OS stack, so I'm not sure there's one fool-proof way of getting the free stack if you don't know if you're in the OS or in the app.

@d-a-v may have a better way, but if you know you're in the OS, then I believe the stack limit is going to be 0x3FFFC000, and you can subtract that from a1 to see the space left. There's probably a global symbol with the real beginning of heap you could use as well.
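
As a rough sketch of that suggestion, the free space is the distance from the current stack pointer down to the limit. The 0x3FFFC000 value is taken from the comment above and is not verified here. freeSysStackBytes and the sample stack pointer are hypothetical. On the device, the stack pointer would be read from register a1, as in the MCVE.

```cpp
#include <cstdint>

// Lower limit of the sys stack as suggested in this thread (an assumption).
constexpr std::uint32_t kSysStackLimit = 0x3FFFC000u;

// The stack grows downward, so the space left is sp minus the limit.
// A negative result would mean the limit has already been crossed.
constexpr std::int64_t freeSysStackBytes(std::uint32_t sp,
                                         std::uint32_t limit = kSysStackLimit) {
    return static_cast<std::int64_t>(sp) - static_cast<std::int64_t>(limit);
}
```

For a sample stack pointer of 0x3FFFF000 this returns 0x3000, that is 12288 bytes.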

d-a-v commented 6 years ago

@toomasz I currently have no better way.

The API now includes ESP.getFreeContStack() (#5133). We will welcome a PR with ESP.getFreeSysStack(). I believe @earlephilhower is right with the given limit.

Remember we are not in a multitasking environment, so our sketches should be state-machines and callbacks should be minimal - at best only used for setting some global variables that the main loop examines.

toomasz commented 6 years ago

I created a PR with ESP.getFreeSysStack(); please review it. I'm getting a free sys stack size of around 12 KB.

callbacks should be minimal - at best only used for setting some global variables that the main loop examines.

In my case it's problematic: my library uses async-mqtt-client, which is based on ESPAsyncTCP, which in turn uses lwIP. So all callbacks from the MQTT library are probably fired on the sys stack, if I'm not wrong. I'm receiving MQTT messages that may be up to 1000 bytes in size, so I would have to queue them and then process them in the main Arduino loop. It would be easy to overflow that queue. Also, wouldn't that require some kind of locking?

TD-er commented 6 years ago

Why would those queues be allocated on the stack? You do need somewhere to keep them, but that could be a pointer to a block allocated on the heap at boot (and thus not really adding to heap fragmentation).

I am really interested in this, since I am convinced the random reboots seen in our project (ESPeasy) may be related to stack usage.
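
A minimal sketch of such a queue, with its storage allocated once, could look like the following. MessageQueue and all of its members are hypothetical names, and the 1000-byte slot size comes from the message size mentioned above. It assumes the callback (producer) and loop() (consumer) never preempt each other, in line with the "not in a multitasking environment" remark. If push() could run from an interrupt, the counters would need protection.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>

// Fixed-capacity message queue; all storage is allocated once in the constructor.
class MessageQueue {
public:
    static constexpr std::size_t kMaxMessageBytes = 1000;

    explicit MessageQueue(std::size_t slots)
        : slots_(slots),
          data_(new std::uint8_t[slots * kMaxMessageBytes]),
          lengths_(new std::size_t[slots]) {
        assert(slots > 0);
    }

    // Intended for the callback: copy the payload and return quickly.
    // Returns false (message dropped) when the queue is full or the payload is too big.
    bool push(const std::uint8_t* payload, std::size_t len) {
        if (count_ == slots_ || len > kMaxMessageBytes) return false;
        std::memcpy(&data_[tail_ * kMaxMessageBytes], payload, len);
        lengths_[tail_] = len;
        tail_ = (tail_ + 1) % slots_;
        ++count_;
        return true;
    }

    // Intended for loop(): copy the oldest message into out (at least
    // kMaxMessageBytes long) and report its length. Returns false when empty.
    bool pop(std::uint8_t* out, std::size_t& len) {
        if (count_ == 0) return false;
        len = lengths_[head_];
        std::memcpy(out, &data_[head_ * kMaxMessageBytes], len);
        head_ = (head_ + 1) % slots_;
        --count_;
        return true;
    }

    std::size_t size() const { return count_; }

private:
    std::size_t slots_;
    std::unique_ptr<std::uint8_t[]> data_;
    std::unique_ptr<std::size_t[]> lengths_;
    std::size_t head_ = 0, tail_ = 0, count_ = 0;
};
```

The queue would be created once at startup, filled with push() from the MQTT callback and drained with pop() from loop(). A full queue drops the new message instead of growing, so the slot count has to be sized for the expected burst.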

devyte commented 6 years ago

@TD-er There are two stacks: sys and cont. The sys stack is used by the SDK and is also the one in use in certain callbacks, like Ticker. The cont stack is the one used in our Arduino setup() and loop(), as well as in all functions called from them.

The cont stack is 4 KB in size and used to be allocated on the heap. However, some research found that the sys stack is very big, something like 9 or 10 KB, and that most of it is unused after boot. So we moved the cont stack on top of the sys stack, which frees up an additional 4 KB of heap. If you use WPS in your project, this optimization is automatically reverted and the cont stack goes back to the heap, because WPS seems to make heavy use of the sys stack.

If your code makes heavy use of the sys stack in callbacks such as Ticker, you could be stomping over the cont stack. You shouldn't be doing that unless you know what you're doing. The cont stack's maximum watermark can currently be retrieved with ESP.getFreeContStack(). The sys stack doesn't currently have a watermark.
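
For readers unfamiliar with the term, a stack watermark is commonly obtained by "painting" the stack area with a known pattern and later counting how much of the pattern survived. The sketch below shows that general technique on an ordinary array. It is not taken from the core's source, and the pattern value and function names are illustrative assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::uint32_t kPaint = 0xA5A5A5A5u;  // arbitrary fill pattern

// Fill the whole stack area with the pattern before the stack is used.
void paintStack(std::uint32_t* stack, std::size_t words) {
    assert(stack != nullptr);
    for (std::size_t i = 0; i < words; ++i) stack[i] = kPaint;
}

// The stack grows downward from the highest address, so words that were never
// touched sit at the low end. Count them from index 0 up to the first
// overwritten word and report the result in bytes.
std::size_t freeBytesWatermark(const std::uint32_t* stack, std::size_t words) {
    std::size_t untouched = 0;
    while (untouched < words && stack[untouched] == kPaint) ++untouched;
    return untouched * sizeof(std::uint32_t);
}
```

Because the count reflects the deepest point the stack ever reached, the value is a low-water mark of free space over the whole run, not the free space at the moment of the call.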

If you're using other SDK functionality similar to WPS, you may be running into the same problem as in the WPS case, and the optimization needs to be disabled for your project.

@d-a-v is there a way to manually disable the stack optimization for testing?

d-a-v commented 6 years ago

is there a way to manually disable the stack optimization for testing?

disable_extra4k_at_link_time(); needs to be called from somewhere in the sketch (for example, from setup()). It does nothing but force the linker to keep the user's stack in user RAM.