nodemcu / nodemcu-firmware

Lua based interactive firmware for ESP8266, ESP8285 and ESP32
https://nodemcu.readthedocs.io
MIT License

Execution of Lua out of flash memory #2292

Closed TerryE closed 6 years ago

TerryE commented 6 years ago

New Feature

This issue supersedes #2068, which contains all of the historic discussion. My devLFS branch contains a working version of the functionality described in detail in this LFS whitepaper.

This is simply an Alpha cut so I have committed an as-is snapshot of my system (that is with development options still enabled) and so it is not ready for incorporation into dev, hence this isn't a PR at this stage. However, the top level highlights are as follows:

I'll do some more tidy up and improve the whitepaper in the next few days, but the reason for this new issue is to enable a dialogue with the other committers and developers based on the actual code rather than a concept, so that I can sort some basic feedback stuff before raising the PR. Enjoy.

TerryE commented 6 years ago

BTW, guys and gals -- once you've loaded a set of files into flash, you can do:

node.flash.index()('init')()

to run the init.lua file in the LFS, and this can do stuff like set up a flash table/metatable with an __index entry pointing to a 3-line routine that returns the corresponding LFS closure, so that you can call flash.myfunct(x,g,r). This assumes that LFS has been enabled; otherwise node.flash is nil, and node.flash.index returns nil if the LFS isn't loaded. If you want a safe version then there are various ways to ring the changes, but

pcall(function()node.flash.index()('init')()end)

is probably the simplest.
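
A slightly more defensive variant of the one-liner above, as a sketch only: it checks that node.flash exists before calling it and returns the error instead of letting it propagate. The helper name run_lfs_init and the desktop stand-in for the node table are assumptions made for illustration; they are not part of the firmware.

```lua
-- Stand-in so the sketch also runs on a desktop Lua; on NodeMCU the real
-- `node` table is used. The stub module contents are hypothetical.
local node = node or { flash = { index = function()
  return function(name)
    if name == "init" then return function() return "LFS init ran" end end
  end
end } }

-- Hypothetical helper: run a module out of LFS, catching Lua errors.
local function run_lfs_init(name)
  if not (node.flash and node.flash.index) then
    return false, "LFS not enabled in this build"
  end
  local index = node.flash.index()
  if not index then
    return false, "no LFS image loaded"
  end
  local fn = index(name)
  if type(fn) ~= "function" then
    return false, "module '" .. name .. "' not found in LFS"
  end
  return pcall(fn)   -- catches any error thrown by the module itself
end

local ok, result = run_lfs_init("init")
print(ok, result)
```

The explicit checks only add readable messages; a single pcall around the whole expression, as in the one-liner, catches the same Lua errors.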

Also add an entry into package.loaders. Entry [3] is a good one, since this is the C loader, which isn't used on NodeMCU. The function must return either a Lua function or an error message. Using [3] means that SPIFFS is ahead of LFS in the search path.
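
As a rough sketch of the two ideas above (the flash table with an __index metamethod, and a package.loaders[3] entry), something along these lines should work. The module name lfs_demo and the desktop stand-in for node are hypothetical and exist only so the example runs off-device; the fallback to package.searchers is likewise only there for newer desktop Lua versions.

```lua
-- Desktop stand-in for the firmware's `node` table (hypothetical contents).
local node = node or { flash = { index = function()
  local mods = { lfs_demo = function(x) return "demo:" .. tostring(x) end }
  return function(name) return mods[name] end
end } }

local index = node.flash and node.flash.index and node.flash.index()

-- flash.name resolves to the LFS closure called `name` (nil if absent).
local flash = setmetatable({}, {
  __index = function(_, name) return index and index(name) end
})

-- Slot [3] is the C loader, unused on NodeMCU. Lua 5.1 calls the table
-- package.loaders; newer desktop Luas call it package.searchers.
local loaders = package.loaders or package.searchers
loaders[3] = function(name)
  local fn = index and index(name)
  -- A loader returns either a function or an error-message string.
  return fn or ("\n\tno module '" .. name .. "' in LFS")
end

print(flash.lfs_demo(42))
```

With this in place, require "lfs_demo" falls through the preload and SPIFFS/file loaders first and only then reaches the LFS entry, which matches the search order described above.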

@georeb, you hassled me for long enough to release this. It does everything that you want. I am surprised that you haven't started testing this.

All, if I get no feedback in the next few days, then I'll do a PR to dev with LFS disabled by default. If there are no objections after a week, then I'll merge it into dev myself.

georeb commented 6 years ago

I am sorry if you felt hassled! I have just been excited to try it out! I have been out of the country for the last 2 weeks and so haven't had a chance to try it. I shall report back as soon as possible! This is great! Thank you @TerryE :)

pjsg commented 6 years ago

I built the image, but node.flash.index() returns nothing. I had one lua file (init.lua) in the right directory to get the build to work.

... time passes

I uploaded the flash.img file and did node.flash.reload('flash.img'). It appeared to flash it and then reboot:

NodeMCU 2.1.0 build unspecified powered by Lua 5.1.4 on SDK 2.1.0(116b762)
lua: cannot open init.lua
> Heap size::43000.
=node.flash.index()
function: 3fff05f8
>=node.flash.index()('init')()

 ets Jan  8 2013,rst cause:2, boot mode:(3,7)

load 0x40100000, len 30652, room 16 

It appears to restart whenever I try and find any module..... I wonder if I uploaded a corrupt module or something.

TerryE commented 6 years ago

@pjsg, Philip, can you either link a gist to your test application or email it to me, and I will run it through the debugger to see what is going on.

Whatever, you shouldn't do a bare node.flash.index()('init')() if you are getting a reboot, as a reboot is the correct behaviour if the Lua RTS throws an uncaught error. That's why you need the pcall wrapper to catch any exceptions thrown by Lua.

I have just put a one-line init.lua in local/lua (along with some other lua files), set the spiffs base and size to 0x100000, 0x10000, and done a make to generate a new fs image and spiffs image, followed by:

esptool --port /dev/ttyUSB0  write_flash -fm dio 0x100000 $BIN/0x100000-0x10000.img
screen -L -l /dev/ttyUSB0 115200
NodeMCU 2.1.0 build unspecified powered by Lua 5.1.4 on SDK 2.1.0(116b762)
lua: cannot open init.lua
> =file.list()['flash.img']
10320
> node.flash.reload'flash.img'

 ets Jan  8 2013,rst cause:2, boot mode:(3,6)
...
NodeMCU 2.1.0 build unspecified powered by Lua 5.1.4 on SDK 2.1.0(116b762)
lua: cannot open init.lua
 > =node.flash.index
lightfunction: 0x40246200
> =node.flash.index()
function: 0x3fff0768
> =node.flash.index()()
1520899799      _dummy  goodbye he1     hello   init    preload telnet  test
> =node.flash.index()'init'()
You have just entered the LFS init
> =#debug.getstrings'ROM'
138
> =node.heap()
42744

So I have about 300 lines of Lua comprising 8 modules and 138 strings in my LFS (the code is a bit denser than RAM, because we don't have all of the malloc housekeeping overhead) and I still have 42Kb RAM free.
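
The reload step in the session above can be wrapped in a small guard that first checks the image is actually present in SPIFFS. This is only a sketch: the helper name reload_lfs and the desktop stand-ins for file and node are assumptions for illustration, while file.list() and node.flash.reload are used exactly as in the session.

```lua
-- Desktop stand-ins (hypothetical) so the sketch runs off-device; on the
-- ESP the firmware's own `file` and `node` tables are used instead.
local file = file or { list = function() return { ["flash.img"] = 10320 } end }
local node = node or { flash = { reload = function(name)
  return "would reload " .. name
end } }

-- Hypothetical helper: only attempt a reload if the image exists in SPIFFS.
local function reload_lfs(image)
  local size = file.list()[image]
  if not size then
    return nil, "image '" .. image .. "' not found in SPIFFS"
  end
  print(("reloading LFS from %s (%d bytes)"):format(image, size))
  -- In the session above the ESP rebooted after this call.
  return node.flash.reload(image)
end

reload_lfs("flash.img")
```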

So send me the example and I'll have a look. :smile:

Alternatively, set -O0 -ggdb in the app/lua make; see lua.c:292 to enable the bootstrap debug pin, and see user_config.h:22, which sets a debug hook on lua_assert fail if you enable DEVELOPMENT_TOOLS for the lua subdir. Also, once enabled, any exceptions will throw you into the debugger (instead of restarting), where you can examine / continue ... I've also started to do a load of helpful macros in the root .gdbinit, but again note that you'll need to use a telnet server to do interactive work once you've initialised the debugger.

PS: IIRC, I've modded the spiffs code to be LFS aware and load a base spiffs image above the LFS region, but not your spiffsimg algo, which still uses _flash_used_end unless you explicitly set SPIFFS_MAX_FILESYSTEM_SIZE and SPIFFS_FIXED_LOCATION. So the simplest thing to do is to set these to 64K and 1Mb resp. so your spiffs doesn't stomp over the LFS or v.v. (See the tools makefile).

As I said, this is still Alpha code and I haven't dotted all the i's.

pjsg commented 6 years ago

The 'test' app was just this in init.lua:

print("Hello world!")

I'll do some more digging in.

TerryE commented 6 years ago

@pjsg, Philip, this is a big patch, bigger than the original elua mods to the Lua core, and it needs better documentation and some decent examples. Also, in order to get it working I had to sort out quite a lot of other stuff: bringing luac.cross into the core build and supporting LFS generation; sorting out correct spiffsimg sizing and location; getting remote gdb truly usable; making libs like sqlite3 only be built if enabled so my VM would make, ....

However, maybe a good way to sync up would be for us to chat offline on Skype or WhatsApp when convenient :)

TerryE commented 6 years ago

Health warning: Integer builds work with LFS disabled, but not with it enabled. The 64bit->32bit packing routines need some extra code variants to handle integer TValue sizes, etc. Not a big job -- but one that I haven't got around to yet :scream:

The latest push fixes this.

TerryE commented 6 years ago

I've just done a PR as per my comment above 5 days ago. Philip had some fun getting this going -- until I twigged that he uses Integer builds by default, and I hadn't put the extra variant code in the cross-compiler to repack for 32-bit TValues. Once I fixed this, we both got this working, but it is not fully stress tested.

Using this really merits its own FAQ plus Lua examples for the LFS based init code to add autoloading, etc., but given that this patch is as large as the original core patch for eLua, it is better to get this version into the hands of developers for evaluation.

TerryE commented 6 years ago

The tl;dr from the review comments is that we've had 3 of our committers (Johny, Philip and myself) go through this patch and we haven't picked up any show stoppers. There are some fine grain tweaks that I need to pick up with another incremental push, but the main concern voiced by Johny relates to the overall API.

Thinking about his comments, I do think that we could easily make life simpler for Lua developers:

So

I also suggest that the two functions rebuildflash and getflash are also present in non-LFS builds, but return the boolean false on these.

Lastly Marcel has requested that we delay any commit to dev until after the SDK2.2 commit and snapshot to master.

Comments?

georeb commented 6 years ago

Lastly Marcel has requested that we delay any commit to dev until after the SDK2.2 commit and snapshot to master.

All sounds very promising! When is the expected 2.2 commit date?

So if I am understanding correctly, we execute node.getflash("mqtt") for example and then run the mqtt module as we normally would, but transparently, through the flash rather than RAM?

TerryE commented 6 years ago

@georeb, the C libraries like mqtt always run from Flash anyway. However, if you have a Lua module, then you can put it in flash and the normal require "mqtt" will pick it up (so long as you don't have a copy in SPIFFS, because SPIFFS will be searched first). As an alternative

local mq = node.getflash("mqtt")("mqtt")

does the same thing directly, without using the loader search list.
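
To make the difference from require concrete, here is a minimal sketch of a helper that fetches a Lua module straight from LFS but still caches it in package.loaded the way require does. It assumes the proposed node.getflash behaves as described in this thread (returning the module's function, or false on a non-LFS build); the helper name lfs_require, the module name mymqtt and the desktop stand-in are hypothetical.

```lua
-- Desktop stand-in for the proposed node.getflash (hypothetical contents);
-- on an LFS-enabled NodeMCU build the real `node` table is used instead.
local node = node or { getflash = function(name)
  if name == "mymqtt" then
    return function(modname) return { name = modname } end
  end
  return false
end }

-- Hypothetical helper: load a module from LFS, bypassing the loader search
-- list (so a SPIFFS copy is never consulted), and cache it like require.
local function lfs_require(name)
  local cached = package.loaded[name]
  if cached ~= nil then return cached end
  local loader = node.getflash and node.getflash(name)
  if type(loader) ~= "function" then
    error("module '" .. name .. "' not found in LFS", 2)
  end
  local mod = loader(name)
  if mod == nil then mod = true end   -- same convention as require
  package.loaded[name] = mod
  return mod
end

local mq = lfs_require("mymqtt")
print(mq.name)
```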

georeb commented 6 years ago

So when you say Lua module, that could be a standalone function that uses the mqtt library, saved within the SPIFFS (or in FLASH at build) as mymqttfunc.lua. My main program then requires that Lua file, but instead of dumping it into RAM, it runs it directly from FLASH?

I may be misunderstanding here, sorry - I don't fully understand the underlying code. A couple of complete examples would be much appreciated...

TerryE commented 6 years ago

@georeb, Why don't you just read the working paper that I wrote, which explains all of this, instead of asking Qs about stuff that I've already documented? You can use luac.cross to build an image which can be loaded into flash. This includes all of the code and all of the constants that these Lua functions need to run. The only things that need to be put in RAM are the genuine R/W data variables. How much Lua code depends on your style / code density, but maybe 3-5,000 lines of Lua is doable. That's a tad more than you can run at the moment. Of course you can still run out of RAM if you don't free up data after you've finished with it, but integer builds will help here.

georeb commented 6 years ago

@TerryE I have had a chance now to look at this in greater depth.

From what I am reading and seeing, it appears to be very reliable! However, I've not been able to successfully build the flash img using luac.cross. I have run the luac.c file in the luac_cross folder and the command prompt does not give me the option -f, as you can see below...

luac: unrecognized option '-f'
usage: luac [options] [filenames].
Available options are:
  -        process stdin
  -l       list
  -o name  output to file 'name' (default is "luac.out")
  -p       parse only
  -s       strip debug information
  -v       show version information
  --       stop handling options

I have read all the related documentation, twice, and I am still struggling with this.

As far as I can tell, I need to supply my .lua files for the luac.cross compiler to create an img file that I can then feed into LFS.

What am I doing wrong?!

Please can you perhaps provide a 'luac.cross for dummies' step by step instruction, so that I can use your LFS branch? Really struggling here...

The end goal is to include this img file, incorporated into the LFS, all bundled into the binary.

TerryE commented 6 years ago

@jmattsson @pjsg @devsaurus @marcelstoer @djphoenix, I have had to add a couple of options to user_config.h, but I am also very unhappy with the understandability of this file. So I am proposing a major reordering of this file and an expansion of the commenting to make it properly self documented so that any developers that need to configure this can read it and understand what needs to be changed.

I haven't changed any of the underlying defaults, but rather just added decent documentary commenting and reordered into a logical order. I can't be bothered to raise a separate PR for this, but just wanted to give you all an early heads-up just in case any of you have any fundamental concerns with this restructuring.

See a draft of this new layout at this gist user_config.h.

marcelstoer commented 6 years ago

Looks great Terry!

I can't be bothered to raise a separate PR for this

Smaller PRs are much easier to reason about and stand a much better chance of getting reviewed and merged quickly. I see no reason why this couldn't be fixed independently from the LFS feature (before or after).

georeb commented 6 years ago

Thank you @TerryE

Can someone point me in the right direction to keep the ball rolling at least until the documentation is completed? How can I build these required image files in the interim?

TerryE commented 6 years ago

I see no reason why this couldn't be fixed independently from the LFS feature (before or after).

Because of my lack of competence with git features, mainly, to be honest. I stashed some of the dev-LFS work so I could do that last dev fix, then decided that I wanted to review the stashed changes, so did a show to a /tmp/ patch file and went through this applying them manually. When I was happy, I then dropped the stashes -- only to discover a quirk of the stash drop command -- it scrolls all of the files back to the state that they were at before the stash was taken -- thereby undoing hours of editing. Bugger, bugger, ... One day, I'll get on top of git, but in the meantime it regularly seems to kick me in the nuts.

TerryE commented 6 years ago

Can someone point me in the right direction to keep the ball rolling at least until the documentation is completed? How can I build these required image files in the interim?

@georeb, the source is king, including the make files. luac.cross is made by the app/lua/luac_cross make. This dumps luac.cross in the root directory. If your luac.cross -h doesn't include the -e and -f options, then you are picking up an old version from somewhere, or you haven't got LFS enabled in your user_config.h.

georeb commented 6 years ago

Sorry @TerryE, I'm still not managing to get anywhere :( If you have the time, a line by line list of commands for terminal would be fantastic. I'm so frustrated with my own lack of understanding, I can't imagine what you must think?! I've tried everything you've mentioned but still no luck. Thanks in advance.

TerryE commented 6 years ago

@georeb, I've just pushed a new update. Have a look at the lfs subdir in lua_examples. I will also update my whitepaper. In due course, a version of this will be added to our documentation FAQ hierarchy.

georeb commented 6 years ago

Thanks @TerryE for that. Just to clarify, I need to be using a flash.img file and not a luac.out file when executing node.flash.reload(imagefile)?

TerryE commented 6 years ago

@georeb, it is going to be easier to do this 1-1, so email me and I can Skype / WhatsApp to you. I still have a mailbox with my username at apache org. It's a pity that github doesn't support pms.

Also read my lua_examples/lfs example.

georeb commented 6 years ago

Thank you @TerryE - I have emailed you...

TerryE commented 6 years ago

We now have 4 users (me, @pjsg, @devsaurus, @georeb) who have tried this out to varying degrees, but enough to know that this works and delivers huge benefits. I've got one more push to tweak a couple of parameters, but I feel we are now at a point where we can sensibly commit this to dev. Comment?

georeb commented 6 years ago

@TerryE - Thank-you for your support with this. What you have achieved with this patch is nothing short of ingenious! It has opened up this device to be capable of so much more! Thank-you for your hard work. I have tested it (probably not to its full extent) over the last few days and haven't come across any glaring issues. IMO, committing to DEV seems to be a sensible move...

drawkula commented 6 years ago

My 1st thought was: Put FTPd and TELNETd in there and nobody will need ESPLORER again. Edit stuff transparently via FTP (lftp has an edit command and some file managers can do FTP, so pure luxury is near!) and test the stuff via TELNET (I prefer rlwrap nc ...).

Maybe even an editor in there might be doable...

Testing it is on a high place of my todo list.

TerryE commented 6 years ago

My 1st thought was...

LFS allows the developer to load whatever is required into the store. This is entirely up to you.

drawkula commented 6 years ago

Sure everyone knowing how to handle this can do whatever she wants.

More space means more ways to help newcomers. If some helpers for access and development would be the default in distributed images, newcomers would have an easier start without looking for 3rd party tools like ESPlorer. That was the direction I was thinking of.

If it's too off-topic here, such thoughts may make more sense in a discussion place of their own.

TerryE commented 6 years ago

@drawkula, I partially agree with you, in that it makes sense to create a set of LFS-complementary lua examples, and even possibly some standard bundles to help the Lua developer fast-start. However, this is all secondary to getting LFS into dev with cloud builder support.

When I first got involved with NodeMCU, you had 15Kb of RAM at boot, and you could only run toy applications. Johny added his patch to my ctext into flash, and I added LCD, which made a huge difference to the size of apps that you can run -- if you were willing to use overlay techniques to break your app into small compilation units.

The next big mode shift for me (and when I stopped using ESPlorer) was when I moved to a LuaOTA provisioning model, with all large file compiling done on the host, and this allowed me to develop practical serious Lua applications. I see LFS as just the next step in this progression. It becomes practical to have Lua applications that are 1,000s of lines long, or have 40 Kb RAM available for data.

TerryE commented 6 years ago

@cwrseck @devsaurus @dnc40085 @dtran123 @georeb @jmattsson @marcelstoer @NicolSpies @nwf @pjsg A quick update for those that have commented on this LFS patch so far, and also for anyone else interested. This reflects the feedback in PR #2301 and a side email exchange between the committers.

My aim is to add extra commits to cover off (2) - (4) above so that we can do the merge immediately after the next master drop. I think that the last item is going to have to wait until after the merge.

NicolSpies commented 6 years ago

@TerryE , I am working my way through the LFS Whitepaper and will provide feedback. My comment so far is that the changes clarify a lot and provide context from a Lua developer's perspective to provide understanding how it works and how to use it.

Referring to the last bullet item: Do I understand correctly that the long term plan related to the cloud build service could be to create the facility for the developer to upload his .lua application files and specify the LFS and SPIFFS size, and the cloud build service will then cross compile and generate the LFS image that can be uploaded to the ESP?

I also like your preferred approach as indicated to keep the module in test in SPIFFS and move it into LFS once stable.

TerryE commented 6 years ago

@NicolSpies, we intend to extend the cloud build service to build standard LFS firmware images with a similar degree of flexibility offered by the current build service for non-LFS firmware. Using luac.cross to build your LFS remains problematic for WinX developers, and #2315 discussed this in depth.

My challenge here is that I haven't used WinX for over a decade, so I can't do this work myself. So I need a Windows developer to do this. :disappointed:

If we go the docker route then I can do the docker side, but the Lua developer would still need to install Docker for Windows, which requires Win10Pro, though a number of other providers also supply standard minimal docker servers which run either in Hyper-V, VirtualBox or VMware.

NicolSpies commented 6 years ago

@TerryE, I have followed the Windows 10 with WSL procedure detailed in #2315 by @joysfera with a small correction to save the bytecode output. The process is not too difficult but as a Windows lua application developer I feel as if I am back in DOS command line days. :smile: I also assume to parse a number of lua modules, a script will make it easier but still very "manual".

@marcelstoer, A lua application development environment that would work for me as a crazy, lazy Windows lua application developer spoiled by the current cloud build service would be the following:

TerryE commented 6 years ago

The process is not too difficult but as a Windows lua application developer I feel as if I am back in DOS command line days. I also assume to parse a number of lua modules, a script will make it easier but still very "manual".

*nix programmers tend to spend time in terminal windows because we find it more productive. Dunno about it being manual. make is about as complex as I get. I leave you to work out how to run WSL commands as make targets.

But remember that if you have all of the lua files for the LFS in one directory then you can easily build the entire LFS with a single luac.cross command as per my previous example.

Your suggested approach of developing in ESPlorer/SPIFFS and then using some cloud service to blow the files to an LFS image doesn't scale well, as you are constrained by the RAM during development. I would suggest that you will want to move tested code incrementally into LFS and only use SPIFFS for the one or two modules currently under test.

I personally find ESPlorer really slow and clunky, and I hate the editor. It only takes a couple of seconds to remake a 64Kb LFS image in -a mode and blow it down to the ESP at 460,800 baud, and doing this you just work directly on the source on the host. Ditto with my apps that use the LFS version of luaOTA. If I need to change any source files then I just issue a reprovision command and a few seconds later the updated ESP has rebooted and is running the updated application.

NicolSpies commented 6 years ago

Ok, I will stop acting like Garfield the cat resisting a bath. :stuck_out_tongue_winking_eye: My inexperience and ignorance of what *nix programmers have mastered shows. I accept the challenge to become more rooted in make and your examples. My development machine is Windows 7 and I do not plan to move to Windows 10 soon, so I will most probably revert to running luac.cross in a *nix environment in VirtualBox; as my courage grows I might even venture further. This path might be too much for the pure WinX developer, so the original challenge of how to proceed might not be solved.

TerryE commented 6 years ago

Nicol, In many ways you are an excellent example of the typical Lua developer -- except that you also engage constructively with us core collaborators.

I've been working in IT for over 40 years; it's just over 50 since I wrote my first Fortran programs, and in that time I've worked on mainframes, minis such as PDPs and VAXes, embedded systems, all sorts of micros and the entire evolution of the PC from the first IBM PC. I used to be considered an MS guru and I've been to the Redmond campus a few times, but I've also worked with *nix for over 20 years and pretty much exclusively with Linux systems over the last decade.

My strengths are also what makes me useless as a test case for the typical ESP developers who have been brought up on WinX and Windows GUIs, but now want to build their own IoT apps. They know WinX and want to learn IoT, and they've probably chosen Lua because this way they don't need to get into the bowels of low level C. It's one thing to ask them to familiarize themselves with PuTTY or OpenSSH for Windows to be able to type in a few commands in the CMD window, but they shouldn't have to spend ages learning a Linux ecosystem to do this.

NicolSpies commented 6 years ago

Thanks Terry, IT for me was always on the fringes as a design tool, a means to an end, since I started as a young electronic engineer 32 years ago when I graduated. My dream was to one day get closer to embedded development, which took place just as the ESP appeared on the scene, so still a lot to learn and so little time. Your description of the WinX/Lua/IoT developer is spot on. :ok_hand:

joysfera commented 6 years ago

That reminds me: with LFS and IMG I'd need to develop a new way of remote upgrading of my ESP devices. Currently I use my own code that downloads new *.lc files using HTTP GET and that has two major drawbacks: first, it does not allow me to update the NodeMCU firmware itself, which is a PITA because every new NodeMCU release fixes some often major issues and I'd love to be able to replace the firmware as well. Second issue, since I have many small lua/lc files (to fit in RAM) the upgrading process needs to download many files and often crashes during that (no idea why, it never happens in my testing, only in the field). With LFS and IMG we should be able to do two great things, I hope: first, we could deliver the upgrade as either IMG (just new lua code) or NodeMCU+IMG (if there was a NodeMCU release in the meantime) and second, we could switch to the SDK supported way of upgrading firmware where the flash is divided into two halves and "bootloader" decides which one to boot from and which one to upgrade. NodeMCU Lua could even support such new upgrading ("provisioning"?) process so that we could all use one common method, similar to say this: http://esp8266.github.io/Arduino/versions/2.0.0/doc/ota_updates/ota_updates.html

TerryE commented 6 years ago

@joysfera Petr, IMO, the main constraint that drives you towards small files is the RAM limitation during the compilation on ESP, simply because the act of compilation itself takes a lot of memory, and this is also related to the size of the module being compiled, so taking the compilation off the ESP and onto the host makes a big difference, as I discovered with my provisioning system.

The second impact is that even though the Lua loader can load large LC files, they still take up memory when the functions are bound to an active Lua variable as a closure.

LFS removes both of these constraints, in that the compilation must be done on the host and the compiled code gets directly loaded into Flash memory, and my LFS enabled luaOTA package wraps up this provisioning nicely; however, it does require Lua to be installed on the host, which is a PITA for some developers. If there's any evidence of any other developers using luaOTA, then I will consider a future PR to enable this package to be run through the luac.cross -e option, so that you don't need the additional native Lua installation on your dev host.

OTA upgrade of the firmware itself is another issue and would involve another development sub-project, though IIRC @jmattsson's company DiUS have done some work on this in the past.

In the meantime I am intending to release robust and LFS compatible versions of Telnet, FTP and luaOTA soon.

PS. Lua OTA uses a bare TCP exchange with its own host-based server, and this is stable, so I suspect that your instabilities are from bugs in the HTTP module.

joysfera commented 6 years ago

@TerryE Terry, please note that my upgrade system downloads *.lc files (the compiled ones) already. Perhaps with the latest NodeMCU the crashes are gone because I discovered and you fixed the recursive issue corrupting stack back in winter that could be causing it. But I don't know for sure as I don't have any latest NodeMCU in the field yet simply because there's no way of remote updating the NodeMCU firmware, contrary to the ArduinoOTA, for example.

Looking forward to your robust LFS release. I may look into the ESP8266 OTA firmware upgrade in the meantime. I implemented ESP32 OTA a year ago in one of my SDK based projects and it worked so well that I'd love to have it for NodeMCU projects as well.

TerryE commented 6 years ago

FYI the two relevant issues for OTA are #806 and #816. No progress in over 2 years.

joysfera commented 6 years ago

Requires 2 MB flash. I've got 1 MB only so I'm stuck with non-OTA firmware upgrades, it seems :)

HHHartmann commented 6 years ago

@TerryE small typo here. It is item #816 and #806.

@joysfera the developers of ESPEasy have a 2 step OTA mechanism which has an intermediate firmware which is small enough to fit in a fraction of 1 MB and can only load a new firmware which then can be larger than 512 KB again. Maybe that would be a way to go.

TerryE commented 6 years ago

Fixed the typo.

The toggling memory map approach is simple and robust but doesn't easily facilitate multiple memory mapped partitions which is what we have with the firmware + SPIFFS. More research is needed and a job for another day. Definitely outside the scope of LFS.

TerryE commented 6 years ago

This issue is getting a bit long, and the LFS patch has been merged. I am therefore closing this and we will use #2413 to track any post-merge discussions.