nodemcu / nodemcu-firmware

Lua based interactive firmware for ESP8266, ESP8285 and ESP32
https://nodemcu.readthedocs.io
MIT License

weirdly disappearing files in NodeMCU 3.0 #2948

Closed joysfera closed 3 years ago

joysfera commented 4 years ago

I am using nodemcu-uploader.py with --verify sha1 -c to upload Lua source code safely and compile it on the ESP8266 itself.

Expected behavior

When a file is uploaded it gets compiled and stays in SPIFFS.

Actual behavior

Random. Sometimes the upload fails with a wrong SHA1 hash. Sometimes the file gets uploaded and compiled without any errors, but then when I want to run it, it's missing.

I've seen so many weird issues in five minutes of testing the 3.0 release that I went back to the proven 20181207 release immediately.

Sorry, I don't have time now to dig into that and try to find a pattern or a repeatable test case. It seemed completely random and when files started disappearing from SPIFFS I just gave up.

So this is one of my worse bug reports. Feel free to close it, I just wanted to give you some heads-up even if I cannot provide perfectly reproducible step-by-step documentation.

Hardware

ESP8266-07

TerryE commented 4 years ago

@joysfera Petr, as far as I am concerned, you have enough karma with this project for your issues to be taken seriously, and it sounds like an issue that needs fixing. I am up to my eyes in doing the ESP test cases at the moment, so I don't want to break off right now.

If you do find time in the next week or so to focus down on the behaviour or even give me a simple test case that sometimes fails then this would be a good starting point for me to track this down and I would be grateful. Thanks Terry

Might be related to #2943

TerryE commented 4 years ago

@joysfera, see the referenced nodemcu-uploader.py issue. Having considered this, it might be an issue relating to the synchronisation of the Python script and the SDK 3.0 runtime. I am interested in the maintainer's views here.

I introduced piping of stdin and stdout in SDK 3.0 basically because in SDK 1.x and 2.x it was just so easy to overrun the UART input FIFO and the blocking output FIFO could easily trigger watchdog timeouts.
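As a host-side illustration of why overruns happen, here is a minimal sketch of paced sending: the payload is cut into chunks no larger than the receive FIFO, and the sender waits for an acknowledgement before sending the next one. This is not how nodemcu-uploader.py or the firmware is known to work; the 128-byte FIFO size and the `write` / `wait_for_ack` callables are assumptions made for the example.

```python
from typing import Callable, Iterator

FIFO_BYTES = 128  # assumption: size of the ESP8266 UART hardware FIFO


def chunked(payload: bytes, size: int = FIFO_BYTES) -> Iterator[bytes]:
    """Yield consecutive slices of payload, each at most `size` bytes long."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(payload), size):
        yield payload[start:start + size]


def send_paced(payload: bytes,
               write: Callable[[bytes], None],
               wait_for_ack: Callable[[], bool],
               size: int = FIFO_BYTES) -> int:
    """Send payload chunk by chunk, waiting for an ack after each chunk.

    `write` and `wait_for_ack` are hypothetical transport hooks, e.g. thin
    wrappers around a serial port. Returns the number of chunks sent and
    raises IOError if an acknowledgement is missing.
    """
    sent = 0
    for chunk in chunked(payload, size):
        write(chunk)
        if not wait_for_ack():
            raise IOError(f"no acknowledgement after chunk {sent}")
        sent += 1
    return sent
```

Never having more than one FIFO's worth of data in flight means the receiver cannot be overrun, at the cost of one round trip per chunk.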

Of course you are free to carry on using the pre-piped stdin/stdout with the SDK 2.x releases, but another workaround might be to consider using LFS and putting FTPserver in your LFS image. I just find spinning up an FTP server and using drag and drop an easier way of bulk provisioning of files, though I rarely bother to compile code on the ESP these days, as I prefer the decent error tracebacks that I get if I leave the debug info in (which isn't really an issue with LFS).

PS another way I provision short files is to paste them into a file.putcontents call: file.putcontents('somefile.lua',[==[^V]==]). This is usually solid for files up to ~100 lines or so.
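The paste trick only works if the closing long bracket does not appear in the pasted text. A small host-side sketch, assuming you generate the statement with a script instead of pasting by hand, can pick a bracket level that is safe for the content; the helper names are hypothetical, and only file.putcontents itself comes from the comment above.

```python
def lua_long_string(text: str) -> str:
    """Wrap text in a Lua long-bracket literal that the text cannot terminate."""
    level = 0
    while True:
        closer = "]" + "=" * level + "]"
        # The first occurrence of the closer must be the one we append;
        # this also catches text that ends in "]" or "]=" and would
        # otherwise merge with the closing bracket.
        if (text + closer).find(closer) == len(text):
            break
        level += 1
    opener = "[" + "=" * level + "["
    # Lua drops a newline that directly follows the opening bracket, so emit
    # one unconditionally; a leading newline in the text then survives.
    return opener + "\n" + text + closer


def putcontents_statement(filename: str, text: str) -> str:
    """Build a file.putcontents(...) statement for pasting into the Lua prompt."""
    if any(ch in filename for ch in "'\\\n"):
        raise ValueError("filename needs escaping, which this sketch does not do")
    return f"file.putcontents('{filename}', {lua_long_string(text)})"
```

Note that Lua normalises line endings inside long strings, so this is only suitable for text files.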

joysfera commented 4 years ago

That reminds me that nodemcu-uploader.py does not work on NodeMCU-ESP32 at all, so I spent several days patching it. I got it almost working but then I gave up; I can't recall the last issues. Perhaps they were the same as with this NodeMCU 3.0 for ESP8266?

How do you all upload your Lua programs to NodeMCU 3.0+ nowadays? nodemcu-uploader.py is very fast and has SHA1 checksum verification, which is crucial. I don't know of any other such tool for NodeMCU.

As for LFS: @TerryE, I've got many NodeMCU devices with older NodeMCU versions deployed in the field, and when developing firmware I need to keep "backward" compatibility with them. As I didn't figure out a way of having both LFS and non-LFS Lua source code at the same time, I stick with the old non-LFS approach, even though I suffer from severe low-memory issues.

Remember that for non-LFS you are forced to split your logic into the smallest blocks of Lua code possible that you keep loading and unloading (I use dofile() for that). It is rather slow, by the way. LFS is quite the contrary, AFAIUI - a monolithic block of code that does everything at once. I would love to switch to LFS but I don't feel like developing two separate versions of the same Lua source code - an LFS one and an old non-LFS one.

TerryE commented 4 years ago

nodemcu-tool is a Node.js app that will run happily on Linux and Windows. It's not wonderfully fast but OK. It's also under active maintenance and the guy has made some tweaks to support SDK 3.0. The SHA-1 stuff is easy to implement as a Lua fragment, though any CRC is probably good enough if all you want is data integrity.

I do remember all about JiT loading, though I tended to wrap everything inside an object and use an __index method to autoload ephemeral methods. Still, I am very glad to have left all that behind.

You can run LFS happily on 1 MB devices such as the 8285, and AFAIK everything has come with 1 MB min for at least the last 3 years.
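For the host side of such an integrity check, a minimal sketch using only the Python standard library might look like the following. The function names are hypothetical, and the digest reported by the device is assumed to arrive as a hex string; how it is produced on the ESP is outside this sketch.

```python
import hashlib
import zlib
from typing import Callable


def sha1_hex(data: bytes) -> str:
    """SHA-1 of data as a lowercase hex string."""
    return hashlib.sha1(data).hexdigest()


def crc32_hex(data: bytes) -> str:
    """CRC-32 of data as an 8-digit lowercase hex string."""
    return format(zlib.crc32(data) & 0xFFFFFFFF, "08x")


def matches(local: bytes, reported: str,
            digest: Callable[[bytes], str] = sha1_hex) -> bool:
    """Compare the digest of the local bytes with the one the device reported.

    `reported` is assumed to be hex text; whitespace and case are ignored.
    """
    return digest(local) == reported.strip().lower()
```

A CRC detects accidental corruption on the wire, which is the failure mode here; SHA-1 adds nothing for that purpose beyond a much lower collision probability.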

joysfera commented 4 years ago

@TerryE I cannot update my application to LFS because all the already deployed ESP8266 nodes scattered across the globe are running old NodeMCU versions and will not get a new LFS one (OTA update for the NodeMCU firmware is missing, apparently). I can either stop supporting them or be forced to keep two separate Lua source trees - an LFS one and an old non-LFS one. It might be simpler for me to rewrite everything in C.

Thanks for the nodemcu-tool, I'll give it a try.

TerryE commented 4 years ago

@joysfera Petr, I am thinking of closing this one because it is very difficult to fix a symptom without a clear test case, however I will wait for a few weeks to give you time to try 3.0 + nodemcu-tool and give feedback.

HHHartmann commented 4 years ago

@TerryE Terry, please see my other comment for a test case. https://github.com/nodemcu/nodemcu-firmware/pull/2912#issuecomment-550029018 Listing the files with ESPlorer works, but it seems the file cannot be found when executed. So the problem might not lie within SPIFFS.

TerryE commented 4 years ago

I will look at this also.

poorandunlucky commented 4 years ago

I've had this issue too, files appearing and disappearing from SPIFFS... I can't remember if I fixed it or not, I was trying to learn the Non-OS SDK for the past little while, but I can add my voice that the problem does exist...

matsievskiysv commented 4 years ago

@joysfera

That reminds me that the nodemcu-uploader.py does not work on NodeMCU-ESP32 at all, so I spent several days by patching it. I got it almost working but then I gave up, can't recall the last issues.

Could you share your patches?

joysfera commented 4 years ago

That reminds me that the nodemcu-uploader.py does not work on NodeMCU-ESP32 at all, so I spent several days by patching it. I got it almost working but then I gave up, can't recall the last issues.

Could you share your patches?

I could (after forking the upstream and cleaning up my patches), but I've noticed that the upstream author of nodemcu-uploader.py has had no time to update it, and at the same time Terry suggested using a different, more up-to-date tool (even though less advanced), so maybe we just need to switch to nodemcu-tool (I still haven't had time to give it a try).

matsievskiysv commented 4 years ago

I miss some functionality of nodemcu-uploader. In particular - autorenaming files in folders, needed for https://github.com/marcoskirsch/nodemcu-httpserver. It would be a shame to throw away nodemcu-uploader without trying to fix it.

poorandunlucky commented 4 years ago

I am also not fond of nodemcu-tool for some reason. I just wanted to add that I never had problems running files that were supposedly not there; the files would appear and disappear between repeated listings using nodemcu-uploader. So the problem is quite possibly limited to the tool, and the firmware might be fine (I'm not sure what the issue is exactly, but since few people have complained about it I wanted to share my experience).

joysfera commented 4 years ago

My application on SDK 3.0 failed to run dofile("something.lc") after something.lua has been uploaded and compiled on the device successfully, so it was not just nodemcu-uploader not seeing some files.

TerryE commented 4 years ago

Guys, I seem to be the only developer here who has got sufficiently to grips with gdbstub and elf-gdb to track down errors, but I do need hard test cases to work with: as @marcelstoer suggests in his issue template, a Minimal, Complete, and Verifiable test case. Once you go into the debugger, all interrupts, Lua interaction, etc. stop. At best you try to step through a Lua fragment and the API calls in its execution until you narrow down and locate the bug.

Petr, I have managed to create a fail case with an LC load and I'll track that down tomorrow.

joysfera commented 4 years ago

Terry, it's great that you've managed to reproduce it. I thought it would stay a ghost forever, mainly because I didn't provide any hard test case. Don't rush with tracking it though, don't get interrupted from your Lua53 work - that's more important and exciting, I believe.

HHHartmann commented 4 years ago

That was exactly the test case I linked to above.

TerryE commented 4 years ago

@TerryE Terry please see my other comment for a test case. https://github.com/nodemcu/nodemcu-firmware/pull/2912#issuecomment-550029018

@HHHartmann Can you double-check this reference and correct it? I don't understand the link you provided.

TerryE commented 4 years ago

My application on SDK 3.0 failed to run dofile("something.lc") after something.lua has been uploaded and compiled on the device successfully

@joysfera Petr, I've tracked this one down. My bad. I had to back out some of the dead eLua stuff as part of unifying the Lua 5.1 and 5.3 library APIs and missed an edge case which broke the binary load function. I've fixed this now and the other known issues that you, Gregor, etc. pointed out so I'll do some more pushes to the 5.3 release.

HHHartmann commented 4 years ago

@TerryE Terry please see my other comment for a test case. #2912 (comment)

@HHHartmann Can you double-check this reference and correct. I don't understand the link you provided.

@TerryE oops, No idea how I messed up that one. Links are corrected now.

joysfera commented 4 years ago

@TerryE great job! Thank you a lot. So this issue can be closed now.

matsievskiysv commented 4 years ago

@TerryE

I seem to be the only developer here who is sufficiently to grips with gdbstub and elf-gdb to track down errors

Is https://nodemcu.readthedocs.io/en/master/modules/gdbstub/ accurate? It seems that the files in bin/ have their debugging symbols stripped. I was able to run the debugger only when I issued elf-gdb app/.output/eagle/debug/image/eagle.app.v6.out.

TerryE commented 4 years ago

@seregaxvm, see the .gdbinit in the NodeMCU root directory. You will need to add an add-auto-load-safe-path config item to your home .gdbinit to autoexecute this.

You also need to do a make with DEBUG=1. Alternatively, do a straight make, trash the app/lua/.output and app/lua/luac_cross/.output hierarchies (and those of any other directories that you need full debug info for), and then remake with DEBUG=1; that way you only get -O0 -ggdb for the files that you might want to debug.

kmpm commented 4 years ago

To anyone out there: I'm the maintainer of nodemcu-uploader, and I have started to work on this issue as well as ESP32 support. Have some patience, please.

HHHartmann commented 4 years ago

@kmpm There has been an issue with uploading certain characters (\0 and \255), described in #2963. Maybe fixing that solves this problem as well.
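To check whether a given file could be affected, a quick host-side scan for those two byte values is enough. This is only a diagnostic sketch: the byte values come from the comment above, while the function name and the idea of scanning before upload are assumptions.

```python
from typing import Dict, List

# Byte values named in the comment above: \0 and \255.
PROBLEM_BYTES = (0x00, 0xFF)


def find_problem_bytes(data: bytes) -> Dict[int, List[int]]:
    """Map each problematic byte value found in data to its offsets."""
    hits: Dict[int, List[int]] = {}
    for offset, value in enumerate(data):
        if value in PROBLEM_BYTES:
            hits.setdefault(value, []).append(offset)
    return hits
```

Plain Lua source will normally produce an empty result, whereas compiled .lc files are binary and are likely to contain zero bytes.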

joysfera commented 4 years ago

@kmpm are you interested in my copy of the nodemcu-uploader where I almost got the ESP32 working?

kmpm commented 4 years ago

@joysfera, very much so. Would love to.

chilipeppr commented 4 years ago

I'd love to see it too

joysfera commented 4 years ago

@kmpm I have just spent some time searching for it but I cannot find the folder where I had that work-in-progress. Damn. I remember that the last issue I was facing was sha1 incompatibility - the sha1 in ESP32 returned a different hash than the normal (PC) sha1. Other than that I had everything updated, and there were a lot of changes for the ESP32. Your uploader is superior because it is very fast, has the verify option, and I use the download feature as well (because I don't trust the desktop luac, I always upload pure Lua files to NodeMCU, with sha1 verification that the file has transferred 100%, let it compile there and then download it back to my PC). If Terry reads this he will probably pull his hair out :) Anyway, sorry, I have either deleted it or stored it in a place so secret that I cannot find it right now. Oh well :-(

kmpm commented 4 years ago

@joysfera, "stuff" happens :-) . If you had the code then great, but I will not burst into tears. On a sadder note, I can't replicate your initial issue. Do you upload many files at once or something? If so, how many, and what is the rough total size?

joysfera commented 4 years ago

$P upload --verify sha1 -c wifi.lua ds18b20.lua wificonfig.lua temperwebserver.lua \
    getpost.lua http_req.lua http_reply.lua webs_data_xml.lua webs_data_txt.lua \
    webs_data_json.lua webs_status.lua webs_config.lua webs_system.lua main.lua \
    mqtt.lua hwconfig_$VERSION.lua:hwconfig.lua updater.lua httpDL.lua version.lua \
    hwinit.lua measure.lua relay.lua action.lua units.lua cfg.lua
if [ $? -ne 0 ]; then
    echo "Upload failed!"
    exit 1
fi
sleep 0.5
$P upload --verify sha1 s/* init.lua
if [ $? -ne 0 ]; then
    echo "Upload failed!"
    exit 1
fi
echo "Success!"

Total length of Lua files is 97 kB.

joysfera commented 4 years ago

@chilipeppr @kmpm It turns out I worked on nodemcu-uploader for ESP32 on a different computer from my usual one, so I have just found the modified source code. I will pack it and upload it somewhere for download. I'm not going to fork the repo and commit it; I don't think my changes are of high enough quality.

EDIT: here it is, in a temp folder because I believe it'll be obsolete soon as Peter updates his repository for ESP32 compatibility: http://joy.sophics.cz/tmp/nodemcu-uploader-ESP32.zip

kmpm commented 4 years ago

@joysfera Could you test the latest git version of the uploader on an 8266? I have made some changes that should help. I have not been able to reproduce your issues any more with firmware 3.0-master_20190907. I haven't got to ESP32 support yet, but thanks for the code.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.