Closed Jendem closed 2 years ago
To be sure I understand: You can go back to v2.0.0 and it does not crash? I am running an old Kamstrup module (like yours), and do not have this issue.
Yes, v2.0.0 is OK; the newer ones crash at startup. Maybe related: I cannot set a static IP in v2.0.0. When it restarts, it goes into AP mode and all settings are lost. (I haven't looked into this with serial debugging.)
I successfully upgraded to 2.0.1, but now I have lost access to the device and a reset doesn't seem to get it back online. So I guess I might have the same issue with crashing. I did have a static IP configured. It doesn't look like it connects to my wifi at all, and it doesn't look like it is in AP mode.
Strange... I use static IP, and have not seen any issue with the upgrade. (I skipped 2.0.1, went from 2.0.0 to 2.0.2 but cannot see how that should impact this). @NicolaiPetri : Do you have what is needed to reflash it by cable (see user manual chapter 3)? I'm afraid that is the only option if the ESP is bricked.
Unable to reproduce this. Considering how many Kamstrup users we have, I think this must be related to configuration. Erase the flash, reflash the latest version, then configure one thing at a time and see when it breaks.
Everything is working after erasing flash first!
python esptool.py --port "COM8" write_flash --erase-all 0x0 firmware.bin
@Jendem
esptool.py --chip ESP8266 --port <e.g. COM3> erase_flash
@gskjold Maybe the flash command examples in the Wiki should be updated by adding the Erasing Flash Before Write option? It could make the reflashing process more robust.
Yes, if you run the erase operation as a separate command yours is right. But you can combine it, as your reference Erasing Flash Before Write says.
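The two approaches discussed above (a separate erase step versus a combined write) can be sketched as argument lists. This is only an illustration: the helper names `erase_then_write` and `combined_write` are hypothetical, and the port and firmware file name are placeholders taken from the commands quoted in this thread.

```python
from typing import List


def erase_then_write(port: str, firmware: str) -> List[List[str]]:
    """Two separate esptool invocations: erase the whole chip, then write."""
    base = ["esptool.py", "--chip", "esp8266", "--port", port]
    return [
        base + ["erase_flash"],
        base + ["write_flash", "0x0", firmware],
    ]


def combined_write(port: str, firmware: str) -> List[str]:
    """One invocation: write_flash with --erase-all erases before writing."""
    return ["esptool.py", "--chip", "esp8266", "--port", port,
            "write_flash", "--erase-all", "0x0", firmware]


if __name__ == "__main__":
    # Print the commands instead of running them; pass each list to
    # subprocess.run(cmd, check=True) to actually flash a device.
    for cmd in erase_then_write("COM8", "firmware.bin"):
        print(" ".join(cmd))
    print(" ".join(combined_write("COM8", "firmware.bin")))
```

Either variant should leave the chip with no stale configuration; the combined form just saves one step.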
I added a script (found here) to always inject --erase-all.
Import("env")  # PlatformIO/SCons hook: brings the build environment into scope

# Work on a copy so the original flag list is left untouched.
uploader_flags = list(env["UPLOADERFLAGS"])
# list.index() raises ValueError instead of returning -1, so test membership first.
if "write_flash" in uploader_flags:
    index_write_flash = uploader_flags.index("write_flash")
    uploader_flags.insert(index_write_flash + 1, "--erase-all")
    env.Replace(UPLOADERFLAGS=uploader_flags)
env.VerboseAction("$UPLOADCMD", "Uploading $SOURCE")
The full command from PlatformIO is:
python esptool.py --before default_reset --after hard_reset --chip esp8266 --port "COM8" --baud 115200 write_flash --erase-all 0x0 firmware.bin
Never tried OTA, only by serial.
Started from version 1.4.1
Thank you for that information. Most users will update OTA, so it was important to clarify that your problems were not linked to that.
I think I have the same issue. My module crashes now and then. It works again for some time if I remove the module from the meter and wait some time before reinserting. Do I have to connect to the serial port to find out why it crashes?
I have a Pow-K, using 2.0.2 from GitHub.
Maybe difficult to catch it while it happens, but you can activate telnet debugging in menu System/debugging:
Then open a command window on PC and use command telnet <IP address>
My issue seems to be that it drops off the network, as there is hourly data from the period while it was offline. So maybe it is not a crash but some wifi issue. Does the ESP have space for logging, or will that burn out the flash?
You can see if it has crashed and restarted by the uptime counter. I can see now that I too have a restart issue:
It is not a big problem for me, but @gskjold will surely look into this.
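The uptime check described above can be automated if uptime is logged at regular intervals: a restart shows up as the counter going backwards. A minimal sketch, assuming the log is a list of (timestamp, uptime in seconds) pairs; the function name and the sample values are made up for illustration.

```python
from typing import Iterable, List, Tuple


def find_restarts(samples: Iterable[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Return (timestamp, last uptime before restart) for every sample
    whose uptime is lower than the previous sample's uptime."""
    restarts = []
    prev_uptime = None
    for timestamp, uptime in samples:
        if prev_uptime is not None and uptime < prev_uptime:
            restarts.append((timestamp, prev_uptime))
        prev_uptime = uptime
    return restarts


if __name__ == "__main__":
    # Hypothetical log: uptime resets between t=10 and t=20.
    log = [(0, 100), (10, 110), (20, 5), (30, 15)]
    print(find_restarts(log))  # [(20, 110)]
```

A very short restart that leaves no gap in the data still shows up this way, because only the counter value is compared.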
The data points for the graphs are calculated from the whole-hour List 3 datagrams when the meter reports accumulated consumption (kWh's) and stored in flash memory. So a reboot will not cause it to lose graph data points unless it happens at that moment when the meter sends List 3.
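To illustrate why a missed whole-hour reading leaves a gap: if each graph point is the difference between two consecutive accumulated readings, one missing reading removes the two points that depend on it. This is a sketch of that idea only, not the firmware's actual code; the function name and the sample readings are hypothetical.

```python
from typing import List, Optional, Sequence


def hourly_consumption(accumulated_kwh: Sequence[Optional[float]]) -> List[Optional[float]]:
    """Difference between consecutive whole-hour accumulated readings.

    A reading of None stands for an hour where List 3 was missed; every
    interval touching it is reported as None (a gap in the graph).
    """
    points = []
    for previous, current in zip(accumulated_kwh, accumulated_kwh[1:]):
        if previous is None or current is None:
            points.append(None)
        else:
            points.append(round(current - previous, 3))
    return points


if __name__ == "__main__":
    print(hourly_consumption([1000.0, 1001.2, None, 1003.9]))  # [1.2, None, None]
```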
My unit was offline the entire night, there are big gaps in my graphs. When I disconnected and reconnected it, it came back online. As the hourly data was recorded one can assume that it was up and running in the period with missing data.
Mine has also restarted now, I believe it was up for 4 days. Looks like a very short restart, cannot see any gap in my database. (Kamstrup 10 sec interval)
And now it crashed again, at uptime = 313603 seconds. Logged data:
Edit: I'm running 8751b6325d09a5a8a204149b2688d05d78a70319
Very interesting. Are you all on Kamstrup?
Good observation! Yes, all (@Jendem, @bardahlm and myself) that have reported the issue here so far are on Kamstrup, using some version of Pow-K.
In addition to upgrading, I recently moved from one Kamstrup meter to another, in parallel with upgrading to v2.x.x. So I cannot say whether the upgrade or the move to a different Kamstrup meter is the reason for this. I never had this issue at my previous location (with earlier firmware versions). So there is a possibility that this is linked to some issue on individual meters (like Vout dropping out for a short period), causing a restart.
If this is the case, it should be visible on the Vcc reading just before the restart, as the supercap in Pow-K will hold the voltage up for a while (but dropping) even if Vout from the meter has dropped to zero. However, the above logging by @Jendem confirms that Vcc is stable during the restart - so this hypothesis seems incorrect.
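The check described above (does Vcc sag just before a restart?) can be applied to logged data. A minimal sketch, assuming Vcc is logged as a list of voltages and the index of the first sample after the restart is known; the function name, window size and threshold are arbitrary choices for illustration.

```python
from typing import Sequence


def vcc_sagged_before(vcc: Sequence[float], restart_index: int,
                      window: int = 5, min_drop: float = 0.1) -> bool:
    """True if Vcc fell by at least min_drop volts within the last
    `window` samples before restart_index."""
    start = max(0, restart_index - window)
    before = vcc[start:restart_index]
    if len(before) < 2:
        return False  # not enough data to judge a trend
    return max(before) - before[-1] >= min_drop


if __name__ == "__main__":
    stable = [3.3, 3.3, 3.3, 3.3, 3.3, 3.3]
    sagging = [3.3, 3.3, 3.2, 3.1, 3.0, 2.9]
    print(vcc_sagged_before(stable, 6))   # False
    print(vcc_sagged_before(sagging, 6))  # True
```

A stable reading right up to the restart, as in the log referred to above, would return False and speak against the Vout hypothesis.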
I really don't see any other Pow-K HW related phenomena than loss of input voltage that could explain a reboot.
Are there any users on Aidon or Kaifa that have seen this?
Ideas on where to look are welcome!
Could be newly added data parser in v2.x series firmware. Will have a look when I have time.
I am having severe rebooting issues on Kaifa, running AMS reader 220103.7.
Would like to downgrade to v. 2.0.0, to find if it stabilizes on that version. Do I have to completely erase the entire flash chip and reconfigure to avoid problems with the existing config files?
Reflashed with the same version as before (220103.7), but with complete erasing of chip. Configured with the same values, and awaiting uptime logging to see if it helps.
If it doesn't work, try 220105.2: esp32.zip esp8266.zip
It doesn't work, so I am installing 220105.2 now.
Still the same with 220105.2
Just to recap this thread:
I found one thing that may cause reboots, new firmware file below. esp8266.zip esp32.zip
Are the zip files you upload based on master or on uncommitted/unpushed stuff?
Uncommitted
Still getting reboots with version 2.0.3. I am not sure if 2.0.0 works better, as I have never tried that version.
Installed 2.0.0, and it runs without rebooting. All newer versions I have tried have random reboots.
Hardware information:
Relevant firmware information:
Can you also confirm that the problem starts with v2.0.1 ? I'm trying to narrow my search... What MQTT payload?
Sure, will install it now. I am running the JSON payload.
I was running 8751b6325d09a5a8a204149b2688d05d78a70319 until I updated to 9897ccc56390da4429b66e498f69a15af5a74287 at the vertical blue line (last 7 days shown):
I also use JSON payload
I can confirm that both 2.0.0 and 2.0.1 are running without reboots.
Are you sure about that? My reboot problems occurred after >2 days uptime at 2.0.1
@Jendem JSON payload? Any temp sensors?
Yes, JSON payload and no temp. Would it be an idea to test deactivating MQTT and instead try to poll http://ams/data.json?
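A polling test like the one suggested above could look roughly like this. It is a sketch using only the Python standard library; the URL is the one quoted above, while the function names, interval and sample count are assumptions. The structure of the JSON is not assumed: each response is simply decoded and returned.

```python
import json
import time
import urllib.request
from typing import Any, Callable, List, Optional


def fetch_data(url: str = "http://ams/data.json", timeout: float = 5.0) -> Any:
    """Fetch the URL once and decode the body as JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def poll(url: str, interval_s: float = 10.0, count: int = 3,
         fetch: Callable[[str], Any] = fetch_data,
         sleep: Callable[[float], None] = time.sleep) -> List[Optional[Any]]:
    """Poll `count` times; a failed request is recorded as None so that
    network dropouts are visible in the result."""
    results: List[Optional[Any]] = []
    for i in range(count):
        try:
            results.append(fetch(url))
        except (OSError, ValueError):  # network errors / invalid JSON
            results.append(None)
        if i + 1 < count:
            sleep(interval_s)
    return results


if __name__ == "__main__":
    for sample in poll("http://ams/data.json"):
        print(sample)
```

Recording failures as None rather than stopping makes it possible to tell a device that is rebooting from one that has merely dropped off the network for a while.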
Maybe. I am at a loss here, so any tests you can do are useful. You say 2.0.1 reboots after 2 days; does 2.0.2 reboot more often?
@Jendem Not sure, but it went from multiple reboots each hour to running 13 hours (for now) without rebooting. So it is at least more stable. I will continue on 2.0.1 for a while.
v2.0.1 is older than https://github.com/gskjold/AmsToMqttBridge/commit/8751b6325d09a5a8a204149b2688d05d78a70319
EDIT: I am running with debug level ERROR.
Mine is so far stable with 2.0.3 (Pow-K, UART0, no MQTT yet).
Question to @gskjold: Is deactivating Telnet enough, or must the debug mode also be set to "Error"? I believe you suggested somewhere doing both. I have had debug disabled for a while, but changed mode from "Debug" to "Error" on Sunday (when I changed to 220109.1). Before setting mode to "Error" I had unexpected restarts. Nothing since changing it to "Error".
Might be a coincidence, might be a clue...
Not coincidence, setting level to error makes a difference.
OK, then this could be important for those having reset issues, as many of us have run debug and deactivated it later.
Just to be sure I understand: Even if both Serial and Telnet debug are deactivated, the "Debug level" setting can make a difference?
If so, it could maybe be useful in a future update to implement the following: When both Telnet and Serial are disabled: Set "Debug level" to "Error".
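The proposed rule is small enough to state as code. This is only a sketch of the suggestion, not the firmware's implementation; the function name and the string level names are hypothetical.

```python
def effective_debug_level(telnet_enabled: bool, serial_enabled: bool,
                          configured_level: str) -> str:
    """With no debug output channel enabled, fall back to "ERROR";
    otherwise keep whatever level the user configured."""
    if not telnet_enabled and not serial_enabled:
        return "ERROR"
    return configured_level


if __name__ == "__main__":
    print(effective_debug_level(False, False, "DEBUG"))  # ERROR
    print(effective_debug_level(True, False, "DEBUG"))   # DEBUG
```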
Correct. And agreed.
I have successfully used 2.0.1 for 3 days without rebooting: the first 2 days with debug level ERROR, the last day with debug level DEBUG. Both work as expected.
Will try to upgrade to 2.0.4 now.
2.0.4 started rebooting from the beginning. Debug level is ERROR, and it reboots many times each hour.
A summary of this thread for me: v2.0.0 and v2.0.1 are OK, v2.0.2 and above are not. This means that the change must be between v2.0.1 and v2.0.2. There is not much change between those two versions that could affect only a handful of people... https://github.com/gskjold/AmsToMqttBridge/compare/v2.0.1...v2.0.2
Attaching a firmware where I have downgraded Timelib from 1.6.1 to 1.6.0. I doubt it will change anything, but it is worth a try.
I really appreciate your efforts to help me find the problem @gskjold!
I have managed to build and upload myself now.
In order not to waste your time, I will try to step up commits until I find out where the error was introduced.
At the moment I am running ff02dd4.
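Stepping through commits one at a time works; a binary search over the same range needs fewer builds. The sketch below shows that alternative (it is not the method described above). It assumes the commits are ordered oldest to newest, the first is known good and the last is known bad. The commit names and the `is_bad` check are hypothetical; since the reboots are intermittent, each check may need a long observation period before a commit can be called good.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Binary search for the first bad commit.

    Assumes commits[0] is good, commits[-1] is bad, and that every commit
    after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


if __name__ == "__main__":
    history = ["c0", "c1", "c2", "c3", "c4", "c5", "c6", "c7"]
    broken = {"c5", "c6", "c7"}  # pretend the regression landed in c5
    print(first_bad(history, lambda c: c in broken))  # c5
```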
v2.0.4 running for two days with no issues.
I have found a possible problem, attaching new firmware.
EDIT: Sorry, constantly attaching wrong file, adding new one! esp8266.zip
I've been using this software with a NodeMCU for about a week now. I've had frequent issues with 2.0.4 and 2.0.5. So far, fix 220122.2 has been running without hiccups.
Edit: Unfortunately 220122.2 crashed as well; it just took a little longer. I'm now running 2.0.0.
Crash after flashing using bin file and compiling latest source.
Relevant firmware information:
Same log for both versions: