Closed: sstidl closed this issue 8 months ago
I have exactly the same problem. The WiFi connection is very bad with the new version; every few seconds the connection drops.
Are you using MQTT? Is it more stable with the MQTT host left empty?
Oops, for me the ESP8266 is dead now. A power-on reset doesn't solve or improve the issue. No ping replies for the last few minutes after upgrading from 0.6.0 to 0.6.9. :-( It seems I'll downgrade to 0.6.0 via USB connection again and stay on this stable version for the next months. ;-)
Are you using MQTT?
Yes, MQTT is the most important feature for me. ;-)
I'm using MQTT too.
MQTT is sending data, but the web UI is not usable and many pings are lost.
But as I said, the most important question is: why does touching the antenna make a difference?
Does the ping situation improve when you set an empty MQTT host? I'm trying to find out whether the problem is related to the MQTT code. I don't know why touching the antenna makes a difference.
After deleting the MQTT host settings I didn't observe any WiFi/ping issues. The "Zeitüberschreitung" (timeout) entries are the missing replies during the restart (flash process 0.6.0 => 0.6.9).
Strangely, after configuring the MQTT part again with the same values, Ahoy is reacting as expected and I can access the UI. :-) Please keep in mind that I didn't have any issue related to touching the antenna, as my ESP8266 is in a housing. ;-)
Same here: the problem disappears when disabling MQTT, and the DTU keeps working fine after re-activating MQTT. Still monitoring; I haven't figured out 100% what's going on yet. I hope we learn more here soon, as this is a showstopper.
At this point I'm not sure why it's behaving that way, but I have seen a few reports linking it to MQTT, which is why I asked about disabling it. Your results seem to confirm that it has something to do with the MQTT code. Glancing over the commits leading up to 0.6.9, I only found something related to an MQTT subscribe action.
If I were to guess, I'd say the MQTT code gets stuck in a loop somehow and eats up most of the CPU time.
I did some further testing here as well, after already reporting WiFi issues in the #882 thread during development.
I have now moved the DTU closer to the HMs (5 m) and changed the sending power from Max to Low. TX retransmits are still very high and essentially unchanged (3985 retransmits at a TX count of 4514), so this is much worse than with 0.6.0; this was also noticed in #906.
I also moved a WiFi repeater closer to the DTU, so the DTU now has an RSSI of -43. Voilà: the DTU has now been running rock solid for 24 hours with 0.6.9, sending data via MQTT and handling limit changes via MQTT on the fly.
Based on the comments from @lumapu and @beegee3 in #882, my assumption is that the ESP is busy with retransmits on the NRF and doesn't have much time left to handle WiFi. If WiFi isn't in a 'perfect' state and requires retransmits as well, the ESP runs wild because it can no longer handle the buffered data for MQTT.
I'm still running MQTT and still running 5 inverters, so the setup hasn't changed on that side. So the underlying reason for the 'hiccups' could really be something that has changed in the NRF24 space and is clogging everything else. To be confirmed by the experts; I will let you know if anything changes here.
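To put the quoted counters in proportion, here is a minimal Python sketch that divides one counter by the other. The function name is hypothetical, and it only assumes that the two numbers are the retransmit and TX counters reported above; how the firmware counts them internally is not stated in this thread.

```python
def retransmit_ratio(retransmits: int, tx_count: int) -> float:
    """Return retransmits divided by the TX count (hypothetical helper)."""
    if tx_count <= 0:
        raise ValueError("tx_count must be positive")
    return retransmits / tx_count


# Figures quoted in the post above: 3985 retransmits at a TX count of 4514.
ratio = retransmit_ratio(3985, 4514)
print(f"retransmits / TX count = {ratio:.1%}")
```

With these figures the ratio comes out at roughly 88%, which is why the poster calls it much worse than 0.6.0.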
Two logs (minicom -D /dev/ttyUSB0) that show the unstable WiFi:
The DTU always sits in the same spot and the RSSI is between -40 and -60; sometimes the connection holds for hours, sometimes none is established at all.
I also noticed that the AP is usually brought up with the SSID AhoyDTU, but sometimes also as ESP_
Maybe this helps somehow.
I'd also like to support you with my observations, because I have the same problem. ESP32, external antenna; the Ahoy-DTU sits about 1.5 m from the Fritzbox and about 4 m from the inverter. Connection drops to MQTT, WebUI and inverter. The behavior occurred sporadically with 0.5.66; with 0.6.12 no steady communication is possible anymore. My AhoyDTU is plugged into a smart plug. It is noticeable that the normal consumption of 0.7-0.8 W is very constant once the Ahoy has established the connection. During connection drops the consumption rises to 1.3-1.4 W; I suspect it goes to full load.
The AhoyDTU project is really great, many thanks for the great work. I hope my observations help with troubleshooting.
Try turning the NRF power down to Low and set the inverter polling interval to 5; that solved the problem for me. Ahoy has been up and running for 4 days without dropouts, with MQTT etc.
Very cool tip, thank you very much.
That actually made a difference immediately.
MQTT is now connected without interruption.
It worked stably until early this morning. Without any changes, the connection problem is back. In some cases several attempts were needed before the AHOY-DTU started again. Does anyone have an idea what could be causing this?
Unfortunately, because of the instabilities I had to draw the consequence and downgrade. Since Tuesday I have been happily using 0.6.0 again, which had previously run for >4 weeks without dropouts or anomalies.
@nexulm: does 0.6.0 really run more stably? Do you also have limit control via MQTT active? 0.6.12 managed 5 days and now the problem is back. As if a buffer were filling up…
@derTillus: As I wrote, 0.6.0 ran stably for weeks (>4) for me, without dropouts or intervention, until the update to 0.6.9. That version never ran stably for a longer period (<=2 days) for me, even when following some of the tips from this ticket. Since I now use a WiFi router near the Ahoy-DTU to further improve my WLAN signal quality, I updated to 0.6.12 yesterday. Background: the WiFi router is always switched off overnight, because the Ahoy-DTU neither receives nor sends inverter data then anyway. Yesterday morning, however, the DTU on 0.6.0 did not manage a WLAN reconnect, so a power-on reset had to be performed. With 0.6.12 it worked this morning. Let's see whether in 5 days I fare similarly to you with 0.6.12!?!
AND: No, I don't have limit control via MQTT. The PV should deliver everything it can. ;-)
My observation: I flashed 0.6.9 onto several "virgin" ESP32 and ESP8266 boards, and they worked right away and stably. I updated one ESP32 from 0.6.0 to 0.6.9 and it went completely haywire. Even re-flashing via cable didn't help. After erasing it first and doing a factory reset first, 0.6.9 via cable then worked. On this one I had played around with MQTT.
Hypothesis: during the update a few bytes remain somewhere in flash, which are then used by the update, and if these had previously been written (with MQTT settings?), there are problems on update.
Hello,
first of all, I want to express my honest respect for this work. Please don't take this as a complaint but as a field report or bug report. Hardware: Wemos D1 mini and NRF24L01+ PA, later enhanced with a 1.3" OLED.
I started some weeks ago with 0.5.66 (I suppose it was the first half of April), installed via the Web-Installer. No display back then; it ran rock solid. No connection issues noticed.
A few days ago I added the OLED and therefore went to 0.6.9. That is where the issues started.
Misc/Trivia
Let me know if you need log files (and please tell me how to access them while the connection is unreliable). Otherwise I would downgrade now to 0.6.0 and observe.
Hi, I think I have the same problem. The DTU runs well during the day, but at some point it stops communicating with the inverter and only sends the status (0), the daily yield, total yield, IP, etc. via MQTT, but no power data anymore, although the system is still producing... After a reboot of the DTU everything works again.
Same problem: with MQTT enabled, no ping and no WebUI. Affects v0.6.9 and v0.7.2. Data is still delivered via MQTT, though; I can see that in openHAB.
Since my last post (https://github.com/lumapu/ahoy/issues/901#issuecomment-1549262622) I run 0.6.9 without any entries for MQTT. Much more stable, over days no problem. Nevertheless, at least once (indeed after several days of runtime) I had to perform a cold reset.
Mine runs perfectly now with 0.6.9 incl. MQTT and adjusting the limits on all 5 inverters during the day
How often do you send via MQTT? Did you change any pins? Is the NRF power on Low? And lastly, ESP8266 or ESP32? Sorry, many questions...
MQTT interval: 0 (whenever there is a change in the values). No pins changed. NRF power is Min. ESP32. Please see my comments above, where I had the same issues as you before. My game changer was to set the polling interval of the inverters to 10 s and the NRF power to Low. Since then I'm fine!
Okay, thanks, I am going to try! Currently polling is at 30 s and I am running an ESP8266...
I had the same issue with 0.6.9. Very unstable on a Wemos D1 Mini esp8266, no pings, when MQTT was enabled. It ran fine when disabling MQTT.
I flashed 0.7.3 this morning and it ran fine the whole day with MQTT enabled. So some change in 0.7.3 seems to fix the issue.
Might be in commit https://github.com/lumapu/ahoy/commit/4e54bcf2994fe3ccfccfd9dab15e935bb7337bdf (fix MqTT publishing only updated values https://github.com/lumapu/ahoy/issues/982).
I tried 0.7.6 and the problem is still there. When I disconnect the NRF from my NodeMCU it is reachable via ping/WebUI, but with the NRF attached it gets stuck.
0.6.0 runs fine; the ping time also seems to be lower than with 0.7.6.
With my esp8266 + 3 inverters, mqtt now works with 0.7.5
Sorry for not writing for so long. I tried to go to 0.7.22 via web update on my ESP8266 today and lost communication again. It's the same thing as I described in the initial post. I tried to disable MQTT by deleting the IP of the server, but it didn't change anything. I tried to downgrade to stable 0.6.9. Now it's dead. I need to flash via serial tomorrow.
Best regards, Stefan
ok, it looks good now...
what i did was:
using esptool.py (https://github.com/espressif/esptool) I erased the esp:
python3 esptool.py erase_flash
(you have to put it in bootloader mode by holding buttons on the esp board)
then flash it
python3 esptool.py write_flash 0x0 230804_ahoy_0.7.23_3a944d1_esp8266.bin
look what it does with
screen /dev/ttyUSB0 115200
(exit screen with ctrl-a k)
connect to the access point AHOY_DTU with password esp_8266
reconfigure it
now let's see if it's running stable
I can give a short heads-up: watchdog and exception reboots. I had a serial logger attached, but the logs didn't show any useful information.
Upgraded to 0.7.26 in the evening (no sun). This morning the device didn't respond to pings anymore, so I had to disconnect power.
Reboot because of the hardware watchdog at about 15:30.
So no, it's not stable on my ESP8266.
@lumapu can I get you any info? I have logged heap fragmentation and all other things mqtt gets in influxdb
To dig deeper into the problem please answer the following questions:
Do you have a capacitor right next to the NRF module? Yes, as close as possible.
Did you solder or pin the connections? Soldered.
What is your power-level setting? Min.
Which interval did you set? 30 s.
Do you use power-limit control? No.
Have you connected and configured a display? No.
Could you check 0.7.5? It's the most stable version for me on the ESP8266.
Now with 0.7.5: flashed at 23:50, worked until 6:30. Now the serial log says:
I: MQTT disconnected, reason: TCP disconnect
I: MQTT disconnected, reason: TCP disconnect
I: MQTT disconnected, reason: TCP disconnect
I: MQTT disconnected, reason: TCP disconnect
I: MQTT disconnected, reason: TCP disconnect
I: (#0) enqueued cmd failed/timeout
I: (#0) resetPayload
I: (#0) Requesting Inv SN xxxxxxxxxx
I: (#0) enqueCommand: 0x0B
I: (#0) prepareDevInformCmd 0x0B
15 pid: 80
I: TX 27B Ch3 | xx xx xx xx xx xx 14 68 33 80 0B 00 64 E3 0B BF 00 00 00 01 00 00 00 00 10 B0 C3
I: (#0) nothing received
I: MQTT disconnected, reason: TCP disconnect
I: MQTT disconnected, reason: TCP disconnect
I: MQTT disconnected, reason: TCP disconnect
I: MQTT disconnected, reason: TCP disconnect
I: MQTT disconnected, reason: TCP disconnect
I: (#0) enqueued cmd failed/timeout
I: (#0) resetPayload
I: (#0) Requesting Inv SN 114181805153
I: (#0) enqueCommand: 0x0B
I: (#0) prepareDevInformCmd 0x0B
15 pid: 80
I: TX 27B Ch23 | xx xx xx xx xx xx 14 68 33 80 0B 00 64 E3 0B DD 00 00 00 01 00 00 00 00 72 01 72
I: (#0) nothing received
I: MQTT disconnected, reason: TCP disconnect
I: MQTT disconnected, reason: TCP disconnect
I: MQTT disconnected, reason: TCP disconnect
I: MQTT disconnected, reason: TCP disconnect
I: MQTT disconnected, reason: TCP disconnect
So maybe it lost WiFi, but it didn't try to reconnect. I moved it near the router without rebooting, but still no reconnect.
The MQTT server is up and WiFi is up; I checked that.
It also doesn't get a response from the inverter anymore (if I'm reading the log right).
After pressing the reset button, everything works again. The inverter responds immediately.
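A log like the one above is easier to compare across firmware versions when the recurring messages are counted. The following is a small Python sketch under stated assumptions: it matches only the three message texts that literally appear in the log above, and the function name and counter keys are made up for illustration.

```python
from collections import Counter


def summarize_log(lines):
    """Count recurring failure messages in a captured serial log (illustrative)."""
    counts = Counter()
    for line in lines:
        if "MQTT disconnected" in line:
            counts["mqtt_disconnects"] += 1
        elif "enqueued cmd failed/timeout" in line:
            counts["cmd_timeouts"] += 1
        elif "nothing received" in line:
            counts["no_inverter_reply"] += 1
    return dict(counts)


# A few lines copied from the log above.
sample = [
    "I: MQTT disconnected, reason: TCP disconnect",
    "I: (#0) enqueued cmd failed/timeout",
    "I: (#0) nothing received",
    "I: MQTT disconnected, reason: TCP disconnect",
]
print(summarize_log(sample))
```

The same function could be fed the lines of a saved minicom or screen capture to see how often the MQTT disconnects occur relative to unanswered inverter requests.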
You're still on an ESP8266? Have you also tried the most recent development versions? How does the DTU behave if MqTT isn't configured?
Yes, still on esp8266. I resoldered some connections to the nrf24, so now communication to the inverter is back.
But: this is what the uptime looks like with version 0.7.5.
At least it doesn't get stuck like with the stable version 0.7.26.
Heap and Rssi
Disabled mqtt, now up with no issues.
Is MQTT improved in newer versions?
I have been running 0.7.40 for 5 days now. The hardware watchdog reboots it several times during the day. Sometimes the TCP stack / WiFi connection gets stuck, with no pings anymore. I ticked the "reboot at midnight" option so it comes back at midnight.
Wouldn't it be possible to reboot on lost WLAN or lost MQTT?
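The idea asked about here can be sketched as a small timeout check. This Python sketch only illustrates the logic; it is not AhoyDTU code, the class name and the 300-second grace period are assumptions, and the actual reboot call is deliberately left out.

```python
class ConnectionWatchdog:
    """Signals a reboot once the link has been down for timeout_s seconds."""

    def __init__(self, timeout_s: float):
        self.timeout_s = timeout_s
        self.down_since = None  # time at which the link was first seen down

    def update(self, connected: bool, now: float) -> bool:
        """Feed the current link state; returns True when a reboot is due."""
        if connected:
            self.down_since = None  # link is back, clear the timer
            return False
        if self.down_since is None:
            self.down_since = now  # first observation of the outage
        return (now - self.down_since) >= self.timeout_s


wd = ConnectionWatchdog(timeout_s=300)  # hypothetical grace period
print(wd.update(False, now=0))    # link just dropped
print(wd.update(False, now=299))  # still inside the grace period
print(wd.update(False, now=300))  # grace period used up
```

A grace period avoids rebooting on every short dropout, which matters here because several posters report brief WiFi losses that recover on their own.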
I am back on version 0.6.0 now, which has been stable for a week.
Is there any chance to get a stable 0.7 version for esp8266?
My ESP8266 runs stably with every version; currently I'm on 0.7.50. MqTT is enabled and one inverter is registered.
This is great for you but is not helping... My ESP8266 has now been running for a week straight without any issues on version 0.6.0. Since 0.6.9 something must have changed that's making it unstable. Maybe your WiFi or MQTT connection is more stable than mine, but that shouldn't be a problem for a DTU.
Can you check, once you have upgraded Ahoy, that the heap fragmentation is low (0-10)? For me, after upgrading with OTA the heap fragmentation is around 25. Then I need to reboot the ESP again using the reboot button in the WebUI. After that I have a heap fragmentation of around 2. I hope that helps more than my last answer. It is also worth having more stable hardware as a base; check #1083 for that. My ESP8266 is powered by an external DC-DC (5 V to 3.3 V) power supply and not the on-board regulator.
I am using the current stable now: GIT SHA: ba218ed :: 0.7.36
I checked heap free and fragmentation. Fragmentation is as low as 2 most of the time; free heap is 17,000.
It resets 2-3 times a day. This wouldn't be a problem, as it comes back up most of the time. But sometimes it stops working without a reset. Then it doesn't respond to pings or send MQTT messages anymore.
Now I tried to improve the WLAN signal. It was -73 dBm before; now it's around -60 dBm.
Maybe that gives us a hint.
So long S
Okay, dtu is offline since 16:15. No ping. Does it have to do with sunset? Tomorrow I'll try to disable night mode at sunset.
Platform
ESP8266
Assembly
I did the assembly by myself
nRF24L01+ Module
No response
Antenna
circuit board
Power Stabilization
Elko (~100uF)
Connection picture
Version
0.6.9
Github Hash
230419_ahoy_0.6.9_15ec6a0_esp8266.bin
Build & Flash Method
AhoyDTU Webinstaller
Setup
Nothing special. No display.
Debug Serial Log output
No response
Error description
Had a stable experience with 0.6.0. After updating to 0.6.9 the WiFi connection is unstable. A ping test shows 5-6 pings after reboot, then 20 lost pings, then a reboot, then some pings, and so on.
When I touch the antenna or the cover of the esp the ping rate is stable. This way I was able to downgrade to 0.6.0
Very strange behavior.
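The ping pattern described above can be summarized from a recorded sequence of ping outcomes. This is a minimal Python sketch: the function names are hypothetical, and the sample sequence is made up to resemble the description (5 replies, 20 losses, then 3 replies standing in for "some pings"), not measured data.

```python
from itertools import groupby


def ping_runs(results):
    """Collapse ping outcomes (True = reply) into (outcome, run_length) pairs."""
    return [(ok, len(list(group))) for ok, group in groupby(results)]


def loss_rate(results):
    """Fraction of pings that got no reply."""
    results = list(results)
    return results.count(False) / len(results) if results else 0.0


# Illustrative sequence modelled on the description, not a real capture.
observed = [True] * 5 + [False] * 20 + [True] * 3
print(ping_runs(observed))
print(f"loss rate: {loss_rate(observed):.0%}")
```

Looking at run lengths rather than only the overall loss rate makes the reboot cycle visible: short bursts of replies separated by long stretches of lost pings.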