whitecatboard / Lua-RTOS-ESP32

Lua RTOS for ESP32

Crash in pcalled function on error. #211

Closed xopxe closed 5 years ago

xopxe commented 5 years ago

I call a Lua function using int status = lua_pcall(TL, 8, 0, 0);. If there's an error in the Lua code it is not caught, and the system reboots.

For example, if I add local x = 1+nil I get:

/home/xopxe/sources/esp/esp-idf/components/newlib/locks.c:136 (lock_acquire_generic)- assert failed!
abort() was called at PC 0x40082dcb on core 0

Backtrace: 0x40094eef:0x3ffcebe0 0x40095211:0x3ffcec00 0x40082dcb:0x3ffcec20 0x40082ee5:0x3ffcec60 0x401b8792:0x3ffcec80 0x400e7399:0x3ffceca0 0x400e7655:0x3ffcece0 0x400f883f:0x3ffced00 0x400f8a38:0x3ffcee80 0x400f8bfd:0x3ffceeb0 0x400f2cfd:0x3ffceee0 0x400f2f52:0x3ffcef10 0x400f2f79:0x3ffcef30 0x400ff8f5:0x3ffcef50 0x400f2925:0x3ffcef70 0x400f3102:0x3ffceff0 0x40100761:0x3ffcf020 0x400f896c:0x3ffcf050 0x400d58df:0x3ffcf080 0x400d5463:0x3ffcf0a0

Rebooting...

Can you confirm that pcall is working for capturing errors?

jolivepetrus commented 5 years ago

@xopxe,

It should work. For example, the following code works as expected:

thread.start(function() local x = 1+nil end)
stdin/ > :2: attempt to perform arithmetic on a nil value

Internally, the thread.start function invokes the Lua function using lua_pcall, and the error is caught.
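
As a rough mental model of what "the error is caught" means, here is a minimal sketch in plain C. It is not the Lua implementation and uses none of its API: `protected_ctx`, `protected_call`, `raise_error`, and the two callbacks are hypothetical names. It relies only on the fact that a standard C build of Lua unwinds errors with setjmp/longjmp, so a protected call can turn a raised error into a status code. An `abort()` from a failed assert, like the one in the backtrace above, is outside that mechanism.

```c
#include <assert.h>
#include <setjmp.h>
#include <string.h>

/* Hypothetical context for one protected call: a jump target plus
   room for the error message. */
typedef struct {
    jmp_buf handler;
    char message[64];
} protected_ctx;

typedef void (*protected_fn)(protected_ctx *ctx);

/* Record the message and unwind to the enclosing protected_call(). */
void raise_error(protected_ctx *ctx, const char *msg) {
    assert(ctx != NULL && msg != NULL);
    strncpy(ctx->message, msg, sizeof ctx->message - 1);
    ctx->message[sizeof ctx->message - 1] = '\0';
    longjmp(ctx->handler, 1);
}

/* Run fn; return 0 if it finished normally, 1 if it raised an error. */
int protected_call(protected_ctx *ctx, protected_fn fn) {
    assert(ctx != NULL && fn != NULL);
    ctx->message[0] = '\0';
    if (setjmp(ctx->handler) != 0) {
        return 1; /* reached through longjmp() in raise_error() */
    }
    fn(ctx);
    return 0;
}

/* Two sample callbacks: one fails, one succeeds. */
void failing_callback(protected_ctx *ctx) {
    raise_error(ctx, "attempt to perform arithmetic on a nil value");
}

void ok_callback(protected_ctx *ctx) {
    (void)ctx;
}
```

Calling `protected_call(&ctx, failing_callback)` returns a non-zero status and leaves the message in `ctx.message`, which mirrors the caller's job after lua_pcall: check the status, then read the error.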

jolivepetrus commented 5 years ago

@xopxe,

Maybe a silent stack overflow?

xopxe commented 5 years ago

You mean some problem when handling the stack on the C side? I'll check my code again.

jolivepetrus commented 5 years ago

@xopxe,

Yes. An undersized stack can cause failures like this, and stack overflows are not always reported to the user.

xopxe commented 5 years ago

First, a note: I'm building the code as C++, which might matter. I only named the file .cpp, wrapped the code in

#ifdef __cplusplus
extern "C"{
#endif
...
#ifdef __cplusplus
}
#endif

and did not touch anything else.

I'm using a timer event created with xTimerCreate (somewhat like the tmr module does).

In the callback function associated with the timer, printf does not write anything to the console (in other parts of the same program it works normally). The same happens with fprintf(stderr, ...).

I'm also not getting any output from print(...) in the Lua callback. I can do uart.write(uart.CONSOLE, ...), though.

This had me confused, because the system crashed after a Lua error raised correctly with luaL_error(TL, msg);, but without printing any error message.

Any idea or pointers?

jolivepetrus commented 5 years ago

@xopxe,

Please, review your "FreeRTOS timer stack size" setting in your sdkconfig (make menuconfig), under "Component config -> FreeRTOS" category.

Lua RTOS uses 2048 bytes for the timer stack, I remember that I had to change the default value to 2048. Start playing with 2048 bytes, and increase it by 1024 in case of problems.
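
For reference, the menuconfig setting described above ends up as a single line in sdkconfig. The fragment below is an assumption to check against your own tree, not something stated in this thread: the option is believed to be named CONFIG_TIMER_TASK_STACK_DEPTH in esp-idf 3.x and CONFIG_FREERTOS_TIMER_TASK_STACK_DEPTH from 4.0 onwards.

```
# Assumed option name for esp-idf 3.x; newer releases are believed to use
# CONFIG_FREERTOS_TIMER_TASK_STACK_DEPTH instead. The value is in bytes.
CONFIG_TIMER_TASK_STACK_DEPTH=2048
```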

xopxe commented 5 years ago

Went up to 32768 with no change in behaviour. (I edited a comment above that had a typo for printf.)

xopxe commented 5 years ago

There seems to be a problem with fprintf in xTimer callbacks and stack overflows (here and here). Also, does that mean that there are problems with lua_error, lua_writestringerror and the like in an xTimer callback?

Nevertheless, I don't fully understand what is going on, and I cannot detect any stack overflow. Also, I see that 'nano' formatting is enabled (the links above mention it as a remedy), so perhaps this is unrelated.

xopxe commented 5 years ago

I can confirm that the problem is related to using print in xTimer callbacks only. print in an xTimer callback generates no output, while in other callbacks (e.g. encoder) it works. The same happens with printf in the C part: no output. luaL_error crashes without printing the error. In the example below, t0 does not print anything, t1 writes to the serial console successfully, and t2 crashes on error. I also modified the callback to just call lua_writestringerror (which uses fprintf) instead of luaL_error, and the callback fails silently without printing anything.

/ > t0 = tmr.attach(500, function() print '!' end )
/ > t0:start()
/ > t1 = tmr.attach(500, function() uart.write(uart.CONSOLE, '!') end)
/ > t1:start()
/ > !!!!!!!!!!t1!!!:!s!t!op!()!
/ > 
/ > t2 = tmr.attach(500, function() a=1+nil end)
/ > t2:start()
/ > /home/xopxe/sources/esp/esp-idf/components/newlib/locks.c:136 (lock_acquire_generic)- assert failed!
abort() was called at PC 0x40082dcb on core 0

Backtrace: 0x40094eef:0x3ffcec40 0x40095211:0x3ffcec60 0x40082dcb:0x3ffcec80 0x40082ee5:0x3ffcecc0 0x401b88da:0x3ffcece0 0x400e739d:0x3ffced00 0x400e7659:0x3ffced40 0x400f8843:0x3ffced60 0x400f8a3c:0x3ffceee0 0x400f8c01:0x3ffcef10 0x400f2d01:0x3ffcef40 0x400f2f56:0x3ffcef70 0x400f2f7d:0x3ffcef90 0x400ff8f9:0x3ffcefb0 0x400f2929:0x3ffcefd0 0x400f3106:0x3ffcf050 0x40100765:0x3ffcf080 0x400f8970:0x3ffcf0b0 0x400d58df:0x3ffcf0e0 0x400d5463:0x3ffcf100

Rebooting...

So, is it an esp-idf problem?

jolivepetrus commented 5 years ago

@xopxe,

The problem is that the timer task is created before the standard input/output streams are created, so the timer task has no I/O. The Lua RTOS tmr module uses FreeRTOS timers, and FreeRTOS executes the Lua callback in the timer task context, where the standard input/output streams are not defined.

I'm working on that. For now, the following solves the problem:

static void callback_sw_func(TimerHandle_t xTimer) {
    tmr_userdata *tmr = (tmr_userdata *)pvTimerGetTimerID(xTimer);

    // Point the timer task's stdio streams at the global ones
    __getreent()->_stdin  = _GLOBAL_REENT->_stdin;
    __getreent()->_stdout = _GLOBAL_REENT->_stdout;
    __getreent()->_stderr = _GLOBAL_REENT->_stderr;
    ...
    ...
}
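
The effect of the workaround above can be pictured with a small host-side model. This is only a sketch in plain C, not newlib or Lua RTOS code: `task_ctx`, `global_ctx`, `console_init`, `inherit_streams`, and `ctx_puts` are hypothetical names standing in for the per-task reentrancy structure, the global one, console setup, the three assignments, and a printf-like call. It assumes that a task created before the console exists holds null stream pointers, so its writes go nowhere until the pointers are copied over.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a per-task reentrancy structure. */
typedef struct {
    FILE *in;
    FILE *out;
    FILE *err;
} task_ctx;

/* Hypothetical stand-in for the global structure; empty until
   console_init() runs. */
task_ctx global_ctx = { NULL, NULL, NULL };

/* Models the point in start-up where the console streams are created. */
void console_init(void) {
    global_ctx.in  = stdin;
    global_ctx.out = stdout;
    global_ctx.err = stderr;
}

/* Models the three assignments: copy the global streams into a task. */
void inherit_streams(task_ctx *ctx) {
    assert(ctx != NULL);
    ctx->in  = global_ctx.in;
    ctx->out = global_ctx.out;
    ctx->err = global_ctx.err;
}

/* Write a string through the task's output stream.
   Returns 0 on success, -1 if the task has no usable output stream. */
int ctx_puts(const task_ctx *ctx, const char *s) {
    assert(ctx != NULL && s != NULL);
    if (ctx->out == NULL) {
        return -1;
    }
    return fputs(s, ctx->out) == EOF ? -1 : 0;
}
```

In this model, a context zero-initialised before `console_init()` makes `ctx_puts` fail, and the same call succeeds once `inherit_streams` has run. Copying on every callback entry, as the workaround does, keeps the change local to the callback instead of touching how the timer task is created.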

jolivepetrus commented 5 years ago

@xopxe,

Solved in https://github.com/whitecatboard/Lua-RTOS-ESP32/commit/bfff9cf46037ed7e847e5a36f0188bdfd1f36fef. You may have to adjust your timer stack size in KConfig.

xopxe commented 5 years ago

Thanks! I'm playing with it right now. Prints do work, but errors in the Lua callback are silently ignored. Perhaps change luaS_callback_call() in components/lua/common/sys.c to:

void luaS_callback_call(lua_callback_t *callback, int args) {
    int status = lua_pcall(callback->TL, args, 0, 0);

    if (status != LUA_OK) {
        const char *msg = lua_tostring(callback->TL, -1);
        lua_writestringerror("error in callback %s\n", msg);
        lua_pop(callback->TL, 1);
    }

    // Copy callback to thread
    lua_pushvalue(callback->TL, 1);
}

jolivepetrus commented 5 years ago

@xopxe,

Yes, you are right. For now, errors are ignored by design. We still have to decide what should happen when an error is raised, and make that compatible with The Whitecat IDE's block-based programming environment.

xopxe commented 5 years ago

OK, good to know. Perhaps I'll keep the modified luaS_callback_call in my branch and then merge when you decide on a design.