nodemcu / nodemcu-firmware

Lua based interactive firmware for ESP8266, ESP8285 and ESP32
https://nodemcu.readthedocs.io
MIT License
7.64k stars 3.12k forks source link

Progress on testing Lua 5.3 #2890

Closed TerryE closed 4 years ago

TerryE commented 5 years ago

Apologies for my bad discipline but this issue has morphed into a status update on my testing and to avoid confusion, I have decided to rename it accordingly.


Now that some developers are putting a lot of functions in LFS we are starting to see a scaling issue which I feel that we need to address in Lua5.3. See the relevant Handling LFS-based functions section in my migration working paper for more details.

The tl;dr is that I will be adding a ROTable index to LFS, in a slightly botched way as doing the Right Way (as Johny would describe it) involves too much work at this stage, IMO. This will be transparent at a LFS.module level with a few tweaks to _init, but module resolution will be faster and scale a bit better.

If the committers are happy then I'll close this.

HHHartmann commented 5 years ago

Maybe some metatable magic could also help to convert the UData to a function on the fly. From Lua 5.2 on there is also pairs and ipairs.

TerryE commented 5 years ago

Maybe some metatable magic could also help to convert the UData to a function on the fly.

I think that we are saying the same thing. :smile: The __index, __call and __pairs meta-methods as described in the Lua RM are relevant, but the complication here is address resolution. In a C module you statically declare a metatable or a function and the GCC compiler / linker will correctly resolve these references these in another ROTable, but this doesn't work between ROTables in LFS images and the firmware because the luac.cross compiler generated image isn't linked to a specific image, and so the meta wrapping needs to be resolved at runtime.

We can do this in Lua using a table as we do currently with our _init LFS implementation -- which is what I was talking about. We could also implement this in C using an empty Udata which adds a little more work but would be more runtime efficient.

The advantage of adding a lightweight Lua function as a proper function subtype is that none of this would be necessary as the TValue in the ROTable could directly reference this Proto, so the table would just work. But adding this subtype support involves a lot more work than tweaking _init which is why I am suggesting going for this KISS approach for now.

PS. I will be finishing the first pass on updating the lua53 code base today. Following that:

HHHartmann commented 5 years ago

Ok I think I got it. Couldn't you then create a new table in node.flashindex which accesses the ROTable in its metamethods? That way it would be a snap in solution and not everyone had to update their version of _init.lua

TerryE commented 5 years ago

IMO, the two things do the same job so we shouldn't implement two LFS variant.

I also feel that it is acceptable to have a compatibility break by replacing the current node.flashindex(). We encourage developers to use require search paths and the LFS table abstraction to encapsulate this, and the updated _init.lua will hide the break. I feel that this is the sort of "you need to change these 3 lines to migrate your app" that developers should anticipate when doing a Lua version upgrade.

TerryE commented 5 years ago

Following that:

  • Get the LUA=51 and LUA=53 builds compiling clean.

  • Run some quick regressions test to make sure the 51 versions of the firmware and the luac cross compiler are still working.

  • Debug out the Lua53 versions of the luac cross compiler and get some -e scripts running to check that the new Lua 5.3 execution environment is solid.

  • Run the luac cross compiler execution environment against the full Lua test suite and debug.

  • Debug out the firmware execution environment on ESP.

By way of an update, I have now done all bar the last of these steps. Getting the luac.cross build of the LUA=53 was straightforward as this is a POSIX / newlib compatible environment. Getting the Lua tests clean was quite a detective challenge hunting down the funnies, e.g. tests were using dynamic loading, trying to modify standard library tables such as adding `table.maxn, as well as some genuine bugs that the testing exposed. Still, these are now clean.

Getting the firmware build of the LUA=53 was a PITA as the source is only newlib compatible, but the ESP8266 RTLs are not newlib complete, so hunting down API calls that the SDK and other libraries don't support (e.g. abort(), modf() was fun. Still both LUA=51 and LUA=53 link cleanly now, with everything else shared.

Onto on-ESP8266 firmware debugging next :)

TerryE commented 5 years ago

Still plugging away with the Lua53 alpha firmware. This make pulls in a couple of extra libgcc.a modules (which get linked into IRAM0 by default) and when I also have the gdbstub loaded, this blows the 32Kb limit for IRAM0. My workaround is to disable GPIO which also drops the HW timer code and this brings the size back under the threshold to load. I've done an analysis of which libgcc.a modules are used by the SDK libraries and libphy.a functions which use libgcc.a functions that are not already implemented in the Boot ROM. In the longer term, I will need to add extra linker directives to force the ones not used by the phy_* functions into IROM0.

The challenge here is that any code that must be callable from within an ISR or must work through SPIFFS I/O can't be cached and can't be in IROM0. We are operating very close to the 32Kb boundary for IRAM0.

I've put in a lot of long and tiring hours to get to this point. The concept of the LUA=51 / LUA=53 make option with everything else common is working well. I hope to get a useable, albeit perhaps buggy, Lua53 working in the next week or so.

TerryE commented 5 years ago
NodeMCU 3.0.0.0 build unspecified powered by Lua 5.3.5 on SDK 3.0.1-dev(fce080e)
cannot open init.lua: 
> print(node.heap())
43368
> function enum(t) for k,v in pairs(t) do print(k,v) end end
> enum(_G)
enum    function: 0x3fff10b8
__index table: 0x3ffee574
ROM table: 0x3ffee574
_VERSION    Lua 5.3
_G  table: 0x3fff0350
> enum(ROM)
string  table: 0x40282964
table   table: 0x40283954
debug   table: 0x40283580
coroutine   table: 0x40283c0c
math    table: 0x40282180
utf8    table: 0x40282f78
ROM table: 0x3ffee574
assert  function: 0x402511d0
collectgarbage  function: 0x40250cbc
dofile  function: 0x40251174
error   function: 0x40250a80
getmetatable    function: 0x40250ae0
ipairs  function: 0x40250f20
loadfile    function: 0x40250fb0
load    function: 0x402510a8
loadstring  function: 0x402510a8
next    function: 0x40250e4c
pairs   function: 0x40250e98
pcall   function: 0x4025135c
print   function: 0x4025075c
rawequal    function: 0x40250b9c
rawlen  function: 0x40250bd8
rawget  function: 0x40250c34
rawset  function: 0x40250c6c
select  function: 0x40251230
setmetatable    function: 0x40250b2c
tonumber    function: 0x40250988
tostring    function: 0x40251410
type    function: 0x40250d90
unpack  function: 0x40251444
xpcall  function: 0x402513b0
gdbstub table: 0x40284580
uart    table: 0x40284778
i2c table: 0x40284990
spi table: 0x40284c30
node    table: 0x40285208
file    table: 0x40285768
pipe    table: 0x402859a0
bit table: 0x40285c48
tmr table: 0x40285e38
net table: 0x402864e0
wifi    table: 0x402875b8
dht table: 0x40287a98
> enum(net)
createServer    function: 0x40261484
createConnection    function: 0x402614fc
createUDPSocket function: 0x4026145c
multicastJoin   function: 0x4026117c
multicastLeave  function: 0x40261168
dns table: 0x40286550
TCP 1
UDP 2
> 

It's looking good, but still Alpha, IMO. As soon as @nwf or one of the other committers merges #2886 (No squash for this one), then I can start raising the PRs.

TerryE commented 5 years ago

As soon as we've merged #2886, I want to merge #2859 into dev and rebaseline all of this work against this new dev. This PR is a BIG change, so I don't see us rushing it into dev. What I will do is to cherry-pick this into separate commits. This is for review purposes only as the changes are actually interlocked because of the now common module API for both Lua 51 and Lua 53. The commits are:

  1. Changes to app/lua needed to support the unified API.
  2. Changes to all other app directories less modules and lua53 needed to support the unified API.
  3. Changes to app/modules needed to support the unified API. Note that these three could be picked to do a Lua (5.1) build.
  4. Addition of app/lua53. This is just a reorganisation of the standard Lua5.3 distro files splitting out the host-only files used to compile luac.cross but with no content changes. This isn't a working commit, but more to allow sensible review of the following.
  5. NodeMCU changes to app/lua53. With these two extra commits both the LUA=51 (default) and LUA=53 makes work. This lua53 version doesn't as yet include LFS.
  6. Addition of LFS support to Lua53.
  7. Addition of Lua 5.3 documentation.

I will extremely appreciate any reviewers going through these changes though perhaps the critical one is (5) together with any modules that interest you in (3).

One slight complication is that I am going to Greece for ~6 weeks, but I will be taking my PC and continue working on this a few days a week.

PS. This has been a heavy few weeks for me, but I want to take a break for a few more days to brood about how we wrap LFS into Lua 5.3

jmattsson commented 5 years ago

Sounds great Terry! As I'm about to head overseas myself this coming week I won't be able to do any reviews for bit. I'll try to catch up once I'm back (~3 weeks)!

Looking at this list I reckon we should land at least up to and including (3) before I tackle the next step of the 8266/32 branch merge, or we'll be stepping on each other's toes needlessly (and I already did that enough to you with the c_ prefix cleanup!).

TerryE commented 5 years ago

I reckon we should land at least up to and including (3)

To be clear my current working branch is at (5) with Lua 5.3 working with and on the module libraries that I've tested. Given that you need to use the make target LUA=53to generate a Lua 5.3 firmware build, IMO we should land all 5. I haven't disabled any Lua (5.1) functionality.

During testing I've found a few "palm slap on the forehead" bugs / omissions and no doubt others will have slipped through which is why I describe this as alpha code.

As to the c_prefix cleanup, this was a major improvement in my view so no complaints or regrets here.

TerryE commented 5 years ago

OK, I've just got the LUA=53 32-bit Integer + FP build working and testing clean with the Lua test suite. It seems to work fine on the ESP8266 as well. TValues are 8-bytes same as for the 5.1 integer build but you've got FP and full math library support, The run-times vary but are usually around 60-70% of the 5.1 equivalents though a lot more benchmarking is needed.

I really don't see why would need double precision FP on an IoT device, but I will listen to reasoned arguments. All looking good.

TerryE commented 5 years ago

Another small change implemented in the ld/nodemcu.ld file is that some of the SDK ISR code that loads into the IRAM0 segment calls some libgcc routines. Most of this are provided through the ROM, but a few aren't in the ROM, so the ld mapped libgcc routines in IRAM0. My analysis shows that in fact they only use __ashrdi3, _divsf3, __fixsfsi and __modsi3, so I've updated the ld so that only these routines get mapped into IRAM0, and all of the others are mapped into Flash. This frees up a scarce 600 bytes or so of IRAM0.

TerryE commented 4 years ago

@jmattsson @nwf @HHHartmann and other committers. This is a quick progress checkpoint triggered by Johny's return from holiday.

So far the work has gone really well:

The one material outstanding issue IMO is #2917. This isn't a technical show stopper. It's just that we have a choice of options, each of which has different strengths and weaknesses. From a development and support perspective we should try to focus on the minimum essential subset. I desperately need constructive dialogue / input from at least one of the committer to agree the right path here.

nwf commented 4 years ago

AIUI the usual reason for 32-bit int and 64-bit float builds for scripting languages is that a 64-bit float can represent all 32-bit integers (and then some) exactly, so it doesn't much matter, to a typical user, what the automatic type conversion behavior is.

jmattsson commented 4 years ago

Sounds great Terry, but I'm with @nwf on keeping double as the default on at least 5.1. The bit module (for one) will not behave like people might assume otherwise, and would likely cause regressions for our users. On 5.3 with its separate ints and floats, 32/32 would sound more reasonable, and might play nicely into the esp32's hardware float support, though in all honesty I don't feel I'm fully across the implications right now. Here concludes my $(float)0x3E4CCCCC of input.

TerryE commented 4 years ago

Lua 5.1 only has one number type: 64-bit IEEE double, so I am not intending to (and in practice can't) change this. As far as 5.3 goes, now that Ive sorted the library issues, we are talking about toggling a couple of defines in luaconf.h to switch from 32/32 to 32/64, so this can be a make option.

However, the TValue is the core quantum of storage in Lua and dropping the size of this from 12 to 8 bytes will give you a 25-30% effective increase in available RAM. IMO, the Lua 5.3 int/float build will have all of the size and runtime benefits of the old 5.1 Integer builds, but with the ability to drop into 6 decimal place floating point if needed. I really can't understand any need for 16 d.p. on an IoT device where most DAC and ADCs are typically 12bit and sometimes 16bit. So I am suggesting that the default be 8-byte TValues, though we should document for advanced used how to do an int+double 12-byte TValue build if they really want to.

In terms of bit etc, it is still available, but given that Lua 5.3 has the bit operators in the language, there is little point in calling this if you are writing in native 5.3.

jmattsson commented 4 years ago

Cool, sounds like we're on the same page after all - sorry about the noise :)

flip111 commented 4 years ago

Hi @TerryE thank you for all the great work on the firmware with lua5.3

It has come to my attention that lua 5.4 just arrived in beta http://lua-users.org/lists/lua-l/2019-10/msg00003.html The thought of skipping lua53 as has been done with lua52 and go directly to lua54 comes to mind.

Of course the changes from lua53 to lua54 are big enough for the lua developers to warrant a new small version number. However from the point of view of the intergration with the SDK of the MCU the change from lua53 to lua54 might either be small or big.

Perhaps by the time that the lua53 version of the firmware is really ready and stable lua 5.4 will also be stable.

I don't know if you have already peeked at the new lua5.4 to say anything about it. I couldn't find when lua54 is actually supposed to be finalized. But coming out of alpha and going into beta says already a lot about it's stability imo.

Anyway possibly you want don't want to focus on lua54 now and just finish lua53 as is. But i just wanted to bring it to your attention.

TerryE commented 4 years ago

@flip111, when I last looked Lua 5.4 was still in Alpha, but on checking I see that it moved into a Beta RC1 a couple of days ago. As the current version is based on 5.1 which was replaced by 5.2 in 2012 if I recall correctly. Internally Lua 5.3 is a pretty major rework of the Lua VM, and it has some major improvements that give us a compelling reason to upgrade from 5.1, and in particular the integer+float numeric subtypes mean that the best build for IoT use is a 32bit int / 32bit floating point which uses 8 byte TValues just like the old 5.1 integer builds.

The Lua 5.4 core has some minor language tweaks; it's going to be at least 12 months until we have a truly stable 5.4.2 or whatever, so my current intent is not to wait but to stay with the tried and tested Lua 5.3.5 baseline.

TerryE commented 4 years ago

Addressed in #3705