Closed colinl closed 1 year ago
Thanks for the report. The flow itself looks reasonable. I will take a look at what's going on.
Alas, I am unable to reproduce the problem. The flow runs normally for me. Let's try to sort out what might be different.
To run your flow, I followed the instructions to run it from the command line. It looks like you are running from the MCU plug-in. That should be fine, but since we are looking for what is different, maybe you could try the command line build, if it isn't too much trouble. I used the -p esp/node_mcu option to match your build.
There seems to be enough memory. Here's what I see in xsbug instrumentation at the breakpoint you set. If you see anything significantly different, that might be a hint.
You appear to be running with Wi-Fi enabled. This particular flow doesn't actually use Wi-Fi so it can run without that. If you remove the Wi-Fi credentials, does it make a difference?
Firstly, it seems I had some sort of build problem with my test of hiding the "esp" option in manifest-core; perhaps I had not done the clean build correctly. Now that I have repeated the test, the problem is gone, and I see different code in setupwifi.js. So with that legacy code all is well. I will correct my description of the bug.
Returning to the current code, I should have said that if I don't provide Wi-Fi credentials then again there is no problem. I will add that to the description.
Another fact I have determined is that if I power up the ESP independently, then it works fine. If I then use Reconnect to xsbug in the plugin UI, the target restarts and crashes as before. Again, possibly consistent with it being a subtle timing issue.
I will have a look at running it from the command line, if I can work out how to do that. I have only used it from the plugin so far.
A bit more information. By making minor changes to the flow it is possible to make the problem appear or disappear.
However, adding a 1 sec async wait after the line
trace(`IP address ${Net.get("IP")}\n`);
appears to have completely fixed this issue. I am able to make changes to the flow without seeing any problems with this code.
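For anyone following along, the workaround amounts to something like the minimal sketch below. It is not the code from setupwifi.js: `delay` and `afterWiFiConnected` are hypothetical names, `log` stands in for trace(), `ip` stands in for Net.get("IP"), and a global setTimeout is assumed to be available (true in Node.js; on the MCU that is an assumption about the runtime).

```javascript
// Hypothetical helper: resolves after `ms` milliseconds.
// Assumes a global setTimeout; not taken from setupwifi.js.
function delay(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Hypothetical stand-in for the code that runs once Wi-Fi is connected.
// `log` replaces trace() and `ip` replaces Net.get("IP") so the sketch
// can run outside the MCU.
async function afterWiFiConnected(ip, done, log = console.log, waitMs = 1000) {
  log(`IP address ${ip}\n`);
  await delay(waitMs); // the 1 second async wait described above
  done?.();            // continue start-up only after the wait
}
```

The wait is asynchronous, so other pending work still gets a chance to run during that second.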
Thanks for the additional information and congratulations on finding a workaround.
It doesn't obviously look like a timing issue, but I can't explain your results. Unfortunately, I can't reproduce them either, so I'm kind of stuck at the moment.
I am looking at running it independently of the plugin, but I can't work out exactly how to install node-red-mcu standalone. The readme doesn't appear to describe that, or at least I can't see it.
I have worked out how to build and run it directly, rather than using the plugin, but it says Disabling unsupported node type "PID"! There is some mention of how to include contrib nodes in the node-red-mcu docs but it isn't very clear exactly what I have to do. It says to see the lower_case example, and the only reference to that I can find is in the nodes folder. I tried putting a copy of node-red-contrib-pid there, and I see that there is also a manifest file in the lower_case example, so I put in the pid folder a manifest.json containing
{
"modules": {
"*": "./PID"
},
"preload": "PID"
}
but it still says it is disabling it, so obviously I am missing something. Can you tell me what I am doing wrong please?
I have worked out how to build and run it directly, rather than using the plugin
Excellent!
but it says Disabling unsupported node type "PID"!
Good. That's what it should do when the node implementation cannot be found.
I'm going to assume that the PID node is this one, since it was written by you. Since there are no external dependencies, it looks like it should work. I'll give that a closer look later and let you know what I find.
I was able to get the PID node to register. Here's what I did:
Put the node in $(NODEREDMCU)/nodes/node-red-contrib-pid-master
Updated the include array in $(NODEREDMCU)/manifest_runtime.json to include the node:
"./nodes/websocketnodes/manifest.json",
"./nodes/node-red-contrib-pid-master/manifest.json"
],
Added a manifest.json Moddable SDK manifest to node-red-contrib-pid-master:
{
"modules": {
"*": "./pid"
},
"preload": "pid"
}
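The include-array step can also be scripted rather than edited by hand. This is only a sketch under assumptions: `addInclude` is a hypothetical helper, not part of Node-RED MCU Edition or the Moddable SDK, and the `runtime` object is a cut-down stand-in for the parsed contents of manifest_runtime.json.

```javascript
// Hypothetical helper: returns a copy of a parsed manifest with `entry`
// appended to its "include" list. If the entry is already listed, the
// original manifest object is returned unchanged.
function addInclude(manifest, entry) {
  // Normalise a missing, single-string, or array "include" value to an array.
  const include = [].concat(manifest.include ?? []);
  if (include.includes(entry)) return manifest;
  return { ...manifest, include: [...include, entry] };
}

// Cut-down stand-in for manifest_runtime.json (assumption for illustration).
const runtime = { include: ["./nodes/websocketnodes/manifest.json"] };
const updated = addInclude(
  runtime,
  "./nodes/node-red-contrib-pid-master/manifest.json"
);
console.log(JSON.stringify(updated, null, 2));
```

Returning a copy leaves the original object untouched, so the result can be compared with it before anything is written back to disk.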
I ran your test flow above. In it, the PID node isn't connected to anything so it doesn't do much. But... I can set breakpoints on its calls to RED.nodes.registerType, RED.nodes.createNode and this.on('input', function(msg) and they all work. If you can get that far, you should be able to build a flow that does something a little more interesting with the PID node. ;)
Sorry about not including the node name, you should not have had to waste your time working that out. I cannot believe that I left it out of the description. Yes, I know the flow doesn't do anything worthwhile (apart from flashing the LED so I can see if it is working). I reduced the flow down in order to make the problem easier to diagnose. I can confirm that with the flow running in this environment it does not fail, but I can also re-confirm that the same flow running in the plugin environment does still fail consistently, unless I have the 1 second delay after the Wi-Fi connects. So there must be some significant difference between the two environments. I will report back to @ralphwetzel; I don't know whether he is watching this.
No problem. At least I understand the goal now. ;)
I can try to take a look in the plug-in too. It isn't what I try first, because it isn't as convenient for my usual debugging / development work on the code of Node-RED MCU Edition.
While I have your attention, on another flow, after running for a while I get an exception
Panic /home/colinl/esp/esp8266-2.3.0/cores/esp8266/core_esp8266_main.cpp:131 loop_task
I looked at the code but it isn't obvious what that implies.
When I come across such problems what is the best way to ask for suggestions?
This repository is a fine place to ask since it is being triggered by the Node-RED MCU Edition runtime. If you suspect a more general Moddable SDK issue, the Moddable SDK repository is the place to ask.
The error you are getting in loop_task is probably a native stack overflow -- the stack guard is corrupted. It may be that your crash on start is also a native stack overflow, which would perhaps explain its intermittent behavior and hard crash.
If you are feeling bold, you can try increasing the native stack. That's a manual edit to esp8266-2.3.0/cores/esp8266/cont.h to change #define CONT_STACKSIZE 4096 to something bigger. You can't make it too big, or other parts of the system will run out of memory. Try 5 or 6 KB:
//#define CONT_STACKSIZE 4096
#define CONT_STACKSIZE 5120
//#define CONT_STACKSIZE 6144
You may need to do a clean build from there to ensure it picks up the changes. The easiest way to do that may be to manually delete $MODDABLE/build/tmp/esp
I updated the MCU Plugin and tried the flow. Of course, it works just fine. :(
I added some traces to see the native stack use. When connected to Wi-Fi and running the test flow with the PID node installed, the native stack never gets below 1260 bytes free which should be a safe margin.
I am wondering whether the issue is something to do with the way xsbug is invoked, or the way it interacts with the plugin. I am going to do some more tests later today.
I think I am on the right track. If I build and deploy it in the plugin then it goes into the start, attach to Wi-Fi, crash, restart loop. If I then close down the xsbug window then after a few seconds the target restarts and runs normally (the flow flashes the LED so I can tell that). So there is nothing wrong with the target code itself; it is to do with how it is being run.
Also, if I go to the project directory in the plugin and run the mcconfig command from the script file that the plugin generates, that is
mcconfig -d -x localhost:5004 -m -p esp/nodemcu ssid="***" password="***"
then it crashes as before, but if I run
mcconfig -d -x localhost:5004 -m -l -t deploy -p esp/nodemcu ssid="***" password="***"
so that it does not start xsbug, then it runs as it should.
I have moved this to a new issue on the plugin, since it only affects that environment. https://github.com/ralphwetzel/node-red-mcu-plugin/issues/17
Sounds good. If there's something more I can try, please let me know. Closing this one for now.
Using @ralphwetzel/node-red-mcu-plugin installed from git #cf57d3d7f0ebac26285bba41b2148d60cbb32446, building a simple flow for a Wemos D1 Mini (configured in the plugin as a NodeMCU), the target crashes after connecting to the Wi-Fi.
In node-red-mcu/setupwifi.js in the code
If I put a breakpoint on the line done?.() it breaks OK; if I step, it moves forward to the next line, and if I let it run, the flow runs correctly. However, if I instead put the breakpoint on done = undefined then it shows the IP address in the log but then immediately crashes and restarts. At the suggestion of @ralphwetzel in this forum thread I hid the esp entry in @ralphwetzel/node-red-mcu-plugin/node-red-mcu/manifest_core.json so that it used legacy code, and rebuilt, but the behaviour was the same.
This seems to happen if I have any contrib nodes installed, but without those it is OK. This simple example uses node-red-contrib-pid.