Closed: Steffeng5 closed this issue 3 years ago
Same here for my ESP8266 with SDS011 and DHT22: since yesterday's firmware update there are approx. 10 readings per hour, and the web interface is mostly unreachable. Stats for the last 15 hours: measurement count: 182, WiFi error count: 78, sensor.community error count: 192 (!), SDS011 error count: 78.
The firmware from January 2020 was quite stable and ran for months without any problems.
Although I don't have any problems with my sensor I've asked a few colleagues running a similar sensor about their experiences. 2 confirmed OK (no Wifi errors, responsive UI, a few (normal) upload errors). 2 reported 'no problem/no measurement gaps'. But I got one report that looks similar to this one, almost impossible to get to the WebUI and many measurement gaps. He was able to get a snapshot of the /status page and the /values page:
Note that almost all measurements failed to upload and that 15 NTP syncs were done in a period of 42 minutes (normally there is one per hour). The number of errors for Sensor.Community is twice as high, but that is normal since data is uploaded separately per sensor.
@holgerbohni : do you also see a high number of NTP syncs, comparable to the number of Wifi errors?
The person with these stats also had this on the /values page:
WiFi signal strength 31 dBm
WiFi signal quality 0 %
I've seen this only once during my tests and that most likely was just after a single Wifi-error (the first since the upgrade). I guess 31 is reported if no Wifi-signal was registered.
The positive dBm value of 31 is also stored in the JSON data sent to Madavi.de, and although most uploads fail, some of them have got through with a signal:31 value since the upgrade. @ricki-z: this might be something to look for on Madavi. I suspect that in this case the WiFi connection is re-initialised before almost every measurement, including an NTP sync.
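The percentage figures quoted in this thread (-87 dBm at 26%, -75 to -70 dBm at 50% to 60%, and 31 dBm at 0%) are all consistent with a simple linear mapping from RSSI to quality. The following is a minimal sketch of such a mapping, not code taken from the firmware; the function name and the choice to report any non-negative or implausibly low reading as 0% are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: maps an RSSI reading in dBm to a 0..100 % quality
// figure. Assumption: -100 dBm is 0 %, -50 dBm or better is 100 %, and the
// range in between is linear.
int32_t wifi_quality_percent(int32_t rssi_dbm) {
    // Assumption: 0 dBm or above (such as the 31 dBm seen here) and anything
    // below -100 dBm is not a usable reading and is reported as 0 %.
    if (rssi_dbm >= 0 || rssi_dbm < -100) {
        return 0;
    }
    if (rssi_dbm > -50) {
        rssi_dbm = -50;  // cap very strong signals at 100 %
    }
    const int32_t quality = (rssi_dbm + 100) * 2;
    assert(quality >= 0 && quality <= 100);
    return quality;
}
```

With this mapping a reading of 31 dBm and a quality of 0% belong together, which fits the guess that 31 is what gets reported when no signal was registered.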
The person having these problems has one thing that might be worth mentioning: the sensor is connected to a Fritz!box Mesh network with a Fritz!box 7583 and a Fritz!WLAN repeater 1750E . The other reporters don't have a Mesh network.
@Steffeng5 @Phaze-III can you please check with the firmware from https://static.dmllr.de/airrohr/beta/builds-NRZ-2020-132-B1-wifirevert/ ?
(remember to turn off OTA update in the config, otherwise it will be immediately reverted)
Sorry, no NTP sync count because my device is completely offline now and it takes some time to get it back (it's a little bit tricky to reach it :). I will check Dirk's firmware later today. Thank you for your support! @Steffeng5 @Phaze-III @dirkmueller
Here we go: Dirk's firmware NRZ-2020-132-B1/DE is running.
Thanks again.
@Steffeng5 @Phaze-III @dirkmueller
Currently I have the same status as @holgerbohni
WiFi works well. Indoors no problems. It has been outdoors again for 40 minutes with a fairly weak WiFi signal (26%, -87 dBm), but that's okay: the UI is smooth and there are no errors right now.
Data points are arriving at a stable interval at the moment. I will monitor it over the next hours and come back with more results.
@dirkmueller first impression after 45 minutes is good, very responsive UI, no errors, no gaps.
FYI: overnight I have been running a build with some of the B10/11 patches reverted (removed the __noinline additions and re-inserted the yield()). That was also very stable (no errors, no gaps) and had a higher sample rate (31K/s instead of 25K/s).
Not so amazing at the moment :-/
There are also "holes" in the grafana data again.
Results after 3 hours: The slightly changed position gives noticeably better reception (+3 dBm), which seems essential: no errors at all (4 NTP syncs, BTW).
Looks like signal strength above -80 dBm is important.
Results after 6 hours: Looks good right now. No errors since the last screenshot and since moving the wifi router 5 cm closer to the ESP :-D Also my 2 "new" ESPs are working well with the beta version.
Which changes were made in the current stable version regarding the wifi connection? @dirkmueller
I've had the first 6 errors in a row after 6 hours. The last signal reading was -80 dBm, then a couple of wifi and sensor errors occurred. When it starts to rain, attenuation increases, which normally doesn't lead to any problems. At the moment it still looks a little bit wonky.
@Steffeng5 when "gain" means centimeter... :-)
> Which changes were made in the current stable version regarding the wifi connection?

It has https://github.com/esp8266/Arduino/pull/7486/files reverted, aka the "forcefully disconnect" on authmode change.
@Steffeng5 am I correct in assuming that you have multiple wifi access points in this essid setup?
@holgerbohni do you have more than one AP (e.g. a wifi range extender/mesh node etc)?
@Phaze-III so reverting the patches increases the sample rate? previously you said those patches helped increasing the sample rate.. This is weird.
so I'm still stunned how we have wifi issues given that the sdk version (which carries the wifi code) did not change between previous and current stable.
@dirkmueller nope, only a single 2,4GHz AP with a good antenna. Current status:
@dirkmueller
> so reverting the patches increases the sample rate? previously you said those patches helped increasing the sample rate.. This is weird.

The earlier reported increase in sample rate came from comparing a build of NRZ-2020-130-B9 plus only -DFP_IN_IROM against a build of NRZ-2020-130-B11 (at 03fee73302566210dfba7869073aaba8d42c5d99).
After upgrading to NRZ-2020-131 I noticed a decrease in sample rate, which I thought was weird, but backtracking showed it was caused by 8aa2f465bb574d1e8d5a4f2a54a84d6681967eb8, where one __noinline was added:
-static void fetchSensorPPD(String& s) {
+static __noinline void fetchSensorPPD(String& s) {
So I tried a build with all 4 __noinline's removed which increased the sample rate again.
Yesterday I looked at other changes between B9 and B11 and two (not GPS-related) came up:
@@ -4312,9 +4300,6 @@ void loop(void) {
}
sample_count++;
-#if defined(ESP8266)
- ESP.wdtFeed();
-#endif
if (last_micro != 0) {
unsigned long diff_micro = act_micro - last_micro;
UPDATE_MIN_MAX(min_micro, max_micro, diff_micro);
@@ -4531,7 +4517,6 @@ void loop(void) {
starttime = millis(); // store the start time
count_sends++;
}
- yield();
#if defined(ESP8266)
MDNS.update();
serialSDS.perform_work();
Putting the yield(); line back slightly decreased the sample rate again, but it was still higher than with NRZ-2020-131. I haven't tested the effect of ESP.wdtFeed();.
yield() might be a candidate given a comment I found in https://github.com/opendata-stuttgart/sensors-software/pull/28#issuecomment-270099925 .
> So I tried a build with all 4 __noinline's removed which increased the sample rate again.

This does not match my experience. There is a 50% sample counter improvement for me with the __noinline's.

> I haven't tested the effect of ESP.wdtFeed();

wdtFeed() feeds the watchdog timer, which resets the node if it isn't fed regularly (at least every 2s). The watchdog is also reset when loop() exits, so that is very unlikely to be a problem.

> Putting the yield(); line back slightly decreased the sample rate again

Right, yield() is a way to pass control to the wifi stack when loop() doesn't exit soon enough. If you check, this call is very near the end of loop(), so I think we can give it up.
From my measurements, there are a dozen other places where we spend 10-50 times as long before exiting loop() (for example in the webserver stack when it returns larger web pages), so those would have to be solved first.
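To make the watchdog argument concrete, here is a toy model of a watchdog timer in plain C++. It is only a simulation for illustration: the class and method names are invented, it is not the ESP8266 implementation, and the 2000 ms timeout in the usage note is simply the "at least every 2s" figure mentioned above.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of a watchdog timer (hypothetical, for illustration only).
class ToyWatchdog {
public:
    explicit ToyWatchdog(uint32_t timeout_ms)
        : timeout_ms_(timeout_ms), last_feed_ms_(0) {
        assert(timeout_ms > 0);
    }

    // Models any event that resets the timer: an explicit feed, or control
    // returning to the system when loop() exits.
    void feed(uint32_t now_ms) { last_feed_ms_ = now_ms; }

    // True if the timer would have fired by now, i.e. the node would reset.
    // Unsigned subtraction keeps this correct across counter wrap-around.
    bool expired(uint32_t now_ms) const {
        return static_cast<uint32_t>(now_ms - last_feed_ms_) > timeout_ms_;
    }

private:
    uint32_t timeout_ms_;
    uint32_t last_feed_ms_;
};
```

With `ToyWatchdog wd(2000)`, a loop() pass that returns after 1500 ms never trips the timer, because every return counts as a feed. Only a single pass that blocks for more than 2000 ms would, which is why an extra feed inside a loop() that already exits quickly changes little.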
@Steffeng5 please ensure that serial debug is turned off (debug level 0 or 1 in /config).
Does that help with the wifi instability?
okay, everyone, we need to move forward in isolating the issues instead of having confusing side discussions. so I made a couple of firmwares to try. I would like to invite everyone to try out each one of them and report which one works best.
Here are the options:
https://static.dmllr.de/airrohr/beta/builds-NRZ-2020-132-B1-multiwifi/ This one uses a different wifi reconnection algorithm and should help in cases where there is more than one access point and the wifi stack for some reason picked the one with the worst reception. It scans on connection loss (so not on boot, but on the first connection loss after boot) for the BSSID with the best RSSI (i.e. the best reception quality) and pins itself to that one until the next connection cutout happens. This is the one that I favor the most.
https://static.dmllr.de/airrohr/beta/builds-NRZ-2020-132-B1-new-SDK/ This is a rebuild of the unmodified firmware against the newest 2.2.x NON_OS SDK from Espressif. We did not change the SDK version before; however, if there are indeed wifi stack issues, then updating the wifi stack might magically fix something. Based on other communities' feedback, the newer version is bad especially in situations with low wifi reception quality, so most likely it won't be an improvement, but your feedback is appreciated.
https://static.dmllr.de/airrohr/beta/builds-NRZ-2020-132-B1-oldarduino/ This is a rebuild of the unmodified firmware against the previously used (in NRZ-2020-130) Arduino core. This should help us pinpoint whether the issue is related to modifications in the firmware or to the core version bump.
https://static.dmllr.de/airrohr/beta/builds-NRZ-2020-132-B1-wifirevert/ As previously mentioned, this is unmodified firmware with the wpa-downgrade-security fix reverted. I would not be able to understand why that would make any difference, but if it does, that would be a good data point to know.
In addition, other than trying out custom firmwares, these options can be tried:
If any of those drastically improves the situation, we know where to start looking. I have no ability to reproduce this. I run a sensor with SDS011, DHT11 and BME280 connected and it works fine. I'm trying with two wifi access points. I am currently not able to simulate low-reception situations, but the current weather season (dry/humid air) is certainly not in our favor for outdoor sensors: reception quality is known to be worse in humid, foggy air.
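The selection step described for the multiwifi build (scan after a connection loss, then pin to the BSSID with the best RSSI) can be sketched in plain C++ as follows. This is only an illustration of the idea, not the code in that build: the `ScanEntry` struct and the function name are hypothetical, and how the firmware obtains scan results or applies the choice is not shown in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// One entry of a hypothetical scan result list.
struct ScanEntry {
    std::string ssid;
    std::string bssid;   // MAC address of the access point
    int32_t rssi_dbm;    // closer to 0 means a stronger signal
};

// Returns the index of the strongest access point advertising the wanted
// SSID, or -1 if no such access point was seen in the scan.
int best_ap_index(const std::vector<ScanEntry>& scan,
                  const std::string& wanted_ssid) {
    assert(!wanted_ssid.empty());
    int best = -1;
    for (std::size_t i = 0; i < scan.size(); ++i) {
        if (scan[i].ssid != wanted_ssid) {
            continue;  // ignore neighbouring networks
        }
        if (best < 0 ||
            scan[i].rssi_dbm > scan[static_cast<std::size_t>(best)].rssi_dbm) {
            best = static_cast<int>(i);
        }
    }
    return best;
}
```

In a mesh where two access points share one SSID at -87 dBm and -70 dBm, the function picks the -70 dBm one, and the client would then stay pinned to that BSSID until the next connection loss.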
Okay, I think I found something. Let me know if https://static.dmllr.de/airrohr/beta/builds-NRZ-2020-132-B1-sds011rework/ works.
The issue is that the newer arduinocore has a different espsoftwareserial blocking read behavior that we didn't expect.
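One common way to avoid depending on blocking serial reads is to feed received bytes one at a time into a small frame parser. The sketch below illustrates that technique in plain C++; it is not the sds011rework patch. It assumes the usual 10-byte SDS011 measurement frame (0xAA, 0xC0, six data bytes, a checksum that is the low byte of the sum of the data bytes, 0xAB), and all names are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kSds011FrameLen = 10;

// Incremental parser: bytes are pushed in as they arrive, so the caller
// never has to wait for a complete frame.
class Sds011FrameParser {
public:
    // Feeds one byte; returns true when a complete, valid frame was just
    // assembled.
    bool push(uint8_t byte) {
        if (len_ == 0 && byte != 0xAA) {
            return false;                    // still waiting for the header
        }
        if (len_ == 1 && byte != 0xC0) {
            len_ = (byte == 0xAA) ? 1 : 0;   // not a measurement frame, resync
            return false;
        }
        assert(len_ < kSds011FrameLen);
        buf_[len_++] = byte;
        if (len_ < kSds011FrameLen) {
            return false;                    // frame not complete yet
        }
        len_ = 0;                            // ready for the next frame
        uint8_t sum = 0;
        for (std::size_t i = 2; i < 8; ++i) {
            sum = static_cast<uint8_t>(sum + buf_[i]);
        }
        if (buf_[8] != sum || buf_[9] != 0xAB) {
            return false;                    // bad checksum or tail byte
        }
        pm25_tenths_ = static_cast<uint16_t>(buf_[2] | (buf_[3] << 8));
        pm10_tenths_ = static_cast<uint16_t>(buf_[4] | (buf_[5] << 8));
        return true;
    }

    // Last valid readings in tenths of a microgram per cubic metre.
    uint16_t pm25_tenths() const { return pm25_tenths_; }
    uint16_t pm10_tenths() const { return pm10_tenths_; }

private:
    uint8_t buf_[kSds011FrameLen] = {};
    std::size_t len_ = 0;
    uint16_t pm25_tenths_ = 0;
    uint16_t pm10_tenths_ = 0;
};
```

In a sketch like this, loop() would push only the bytes that are already available and return immediately, leaving the rest of the frame for a later pass.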
OK, you're faster than me and my tests. I've been running different versions on two devices since yesterday. I will continue with your latest firmware today. Results so far: multiwifi and wifirevert are both stable without wifi errors, but with a couple of sds011 errors over 12 hours.
@dirkmueller no errors after 4 hours with „132-B1-sds011rework“ on any of my devices. Looks very good. Update: 9h without errors! 👍 Update: 13h, 2 devices, 0 errors :-)
@dirkmueller : NRZ-2020-132-B1-sds011rework has been running very stable for more than 6 hours on the sensor of the person with the Fritz!Box Mesh network, only 1 WiFi-error. In his setup NRZ-2020-131 and the other 4 trial-builds didn't work. So you definitely found something 👍
> of the person with the Fritz!Box Mesh network, only 1 WiFi-error.

so the wifi disconnect was WIFI_DISCONNECT_REASON_ASSOC_LEAVE, aka the wifi mesh reconnected the client to a different endpoint. so that's not really an "error", just normal behavior.
There is however an issue visible in this screenshot: the "SDS011" version column is empty, so it failed to read the information from the node on boot. Also there are two SDS011 errors, which isn't a lot, but it is more than I'd like.
None of the changes that I did should however affect that.
still interested in more feedback from others potentially.
> There is however an issue visible in this screenshot, the "SDS011" version column is empty.

A restart fixed that:
Other screenshots I received from him also showed the version string so this was most likely a glitch.
I will try the Version tomorrow and give you feedback!
The changes landed in B2, which is now online in the beta channel. You can also do a 'use beta' OTA update instead.
Please report if there are issues remaining, otherwise I assume it's fixed.
@dirkmueller thank you so much! BTW: no errors over 48h.
@dirkmueller Thanks! That makes it much easier for me, that I do not have to uninstall the airrohr again to flash it :-) Just updated to beta channel.
Just visualized the sample rate calculated down per second (group by 15m / 15 / 60) and labeled it by version:
Now waiting some hours for the current beta to get results
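The "group by 15m / 15 / 60" expression reads as: take a 15-minute bucket, divide by 15 to get a per-minute value, then by 60 to get a per-second value. A minimal sketch of that arithmetic, assuming the bucket holds a total sample count (the thread does not say exactly what is aggregated, and the function name is made up):

```cpp
#include <cassert>

// Hypothetical helper: converts a sample count accumulated over a bucket of
// the given length (in minutes) into samples per second.
double samples_per_second(double samples_in_bucket, double bucket_minutes) {
    assert(bucket_minutes > 0);
    return samples_in_bucket / bucket_minutes / 60.0;
}
```

Under that assumption, a 15-minute bucket holding 31,500,000 samples corresponds to 35,000 samples per second, i.e. the 35K/s figure that appears elsewhere in this thread.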
@dirkmueller
> also there are two SDS011 errors. which isn't a lot, but it is more than I'd like.

For the record: after the reboot the sensor attached to the Fritz!box Mesh has been running fine for two days. A snapshot of the status-page after 21 hours:
Only one WiFi error and 1 SDS011 error.
@dirkmueller on my devices your „sds011rework“ firmware performed 600 measurements without an error.
@holgerbohni : I can confirm the lower sample rate of 132-B2 (OTA) compared to the sds011rework build. Same numbers on my own sensor. However I didn't see any Wifi-errors after the first time I did an OTA upgrade to 132-B2 (test duration ~ 10 hours).
This evening I did a second OTA upgrade and after that one I got a relatively unresponsive UI and indeed a few Wifi-errors (4 in 1 hour). After a soft reset the UI was responsive again and up until now no Wifi-errors. You might want to try a reset.
@Phaze-III after soft reset the error rate increases. After 30min
Wifi: 1/-85/200 Sds011: 1
After power cycling almost same situation: wifi and sds011 errors are back.
Update after 3 h (screenshot)
The results of a comparison of Dirk's sds011rework and the OTA build of NRZ-2020-132-B2 on the sensor attached to the Fritz!Box Mesh also show a big difference. The OTA build performs as badly as the NRZ-2020-131 release build, while the sds011rework build has been running happily for at least three days:
builds-NRZ-2020-132-B1-sds011rework/latest_nl.bin Sample rate 35K/s
OTA NRZ-2020-132-B2/NL Sample rate 25 K/s
So the question is what was different in Dirk's sds011rework build?
I compiled two tables with the test results and parameters that appear to have some influence for my own sensor and the Fritz!Box Mesh sensor.
Included in the table are the results of tests to see whether the sds011rework patch would have helped in the bad performance of NRZ-2020-130-B9/B10 for my sensor (see #789). It appears that that wouldn't have made a difference :-( In that case the addition of -DFP_IN_IROM in B11 made the difference for my sensor.
The next set of tests was to build NRZ-2020-131 and NRZ-2020-132-B2 without -DFP_IN_IROM and have them tested on the problematic Fritz!Box Mesh sensor. That resulted in good performance for NRZ-2020-132-B2 (no errors and a responsive UI) and a slight improvement for NRZ-2020-131 (no Wifi errors but an unresponsive UI). So for NRZ-2020-131 and NRZ-2020-132-B2, moving the FP routines back to IRAM seems to help.
Another set of tests was to see how firmware built with the Arduino IDE would perform (with settings as close to the PlatformIO settings as possible, see https://github.com/opendata-stuttgart/sensors-software/issues/789#issuecomment-703146144). On both my sensor and the Fritz!Box Mesh sensor that appears to give good/better performance for versions that perform badly when built with PlatformIO. Now this is just an observation, I really don't want to start a discussion about which IDE to use. But a cursory look at the build logs indicates that there are differences in how the various object (.o) and archive (.a) files are built and linked together. It might be an idea to do some research on what those differences are and to check if there are options to tune the PlatformIO builds.
As said, I just want to report my observations hoping that they may help in finding either the cause or a workaround for the instability problems.
Columns used:
Build: Name/Version of the build
Source: online (fetched from firmware.sensor.community), OTA (installed via OTA), dmllr.de (test build from Dirk) or local (built by me)
IDE: IDE used, PlatformIO or Arduino IDE
FP_IN_IROM: whether defined or not in platformio.ini
Duration: duration of the test period reported in the table
UI: responsiveness of the UI, subjective indication where normal means occasional delays and good means almost always immediate response
Sample rate: given in K/s
Git Hash: ref to the 'HEAD' of the checkout the build was made with
Results for my own sensor:

Build | Source | IDE | FP_IN_IROM | Duration | UI | Sample rate (K/s) | Wifi Errors | SDS011 Errors | Remarks | Git Hash |
---|---|---|---|---|---|---|---|---|---|---|
NRZ-2020-129 | online | pio | no | various | normal | 30 | 0 | a few during server problem periods | | 8a936e1 |
NRZ-2020-130-B9/DE | OTA | pio | no | 1 hour | unusable | 15 | 5 | 0 | | f62c962 |
NRZ-2020-130-B10/DE | OTA | pio | no | 6 hours | bad | 19 | 10 | 0 | | 04d54e3 |
builds/NRZ-2020-130-B10-fp-in-rom/latest_de.bin | dmllr.de | pio | yes | 10 hours | good | 35 | 0 | 0 | | |
NRZ-2020-130-B11/DE (14 Oct 2020) | OTA | pio | yes | 12 hours | good | 35 | 0 | 0 | 14 Oct build | 418c866 |
NRZ-2020-131/DE | OTA | pio | yes | 12 hours | normal | 25 | 0 | 0 | One additional no_inline | b78aa0a |
builds-NRZ-2020-132-B1-sds011rework/latest_nl.bin | dmllr.de | pio | ??? | 14 hours | good | 35 | 1 | a few during server problem periods | | |
NRZ-2020-132-B2/NL | OTA | pio | yes | 10 hours | normal | 25 | 0 | a few during server problem periods | | 5adb530 |
FP_IN_IROM test | | | | | | | | | | |
NRZ-2020-132-B2-no-fp in irom-8c9e540/NL | local | pio | no | 22 hours | good | 35 | 0 | 2 during server problem periods | | Phaze-III/sensors-software@8c9e540 |
NRZ-2020-131-no-fp_in_irom-8c28e90/NL | local | pio | no | 6+ hours | normal | 35 | 0 | 0 | | Phaze-III/sensors-software@8c28e90 |
sds011rework backport | | | | | | | | | | |
NRZ-2020-130-B9-sds011rework_nl.bin | local | pio | no | 1 hour | unusable | 12.5 | ? | ? | | Phaze-III/sensors-software@dc8f6a3 |
NRZ-2020-130-B10-sds011rework_nl.bin | local | pio | no | 5 hours | normal | 25 | 0 | | | Phaze-III/sensors-software@dfc2310 |
Arduino IDE | | | | | | | | | | |
NRZ-2020-130-B9_de.bin | local | Arduino | - | 18 hours | normal | 18 | 0 | | | f62c962 |
NRZ-2020-131_nl.bin | local | Arduino | - | 8 hours | good | 35 | 0 | | | b78aa0a |
NRZ-2020-132-B2-5adb530.bin | local | Arduino | - | 3 hours | good | 35 | 0 | | | 5adb530 |

Results for the Fritz!Box Mesh sensor:

Build | Source | IDE | FP_IN_IROM | Duration | UI | Sample rate | Wifi Errors | SDS011 Errors | Remarks | Git Hash |
---|---|---|---|---|---|---|---|---|---|---|
NRZ-2020-131/NL | OTA | pio | yes | 42 minutes | unusable | ? | 16 | 8 | | b78aa0a |
builds-NRZ-2020-132-B1-sds011rework/latest_nl.bin | dmllr.de | pio | ??? | 3 days | good | 35 | 4 | 6 | | |
NRZ-2020-132-B2/NL | OTA | pio | yes | ~9 hours | unusable | 25 | 215 | 0 | Including power cycle | 5adb530 |
NRZ-2020-132-B2-no-fp in irom-8c9e540/NL | local | pio | no | 4 hours | good | 35 | 0 | 0 | | Phaze-III/sensors-software@8c9e540 |
NRZ-2020-131/NL | local | Arduino | - | 11 hours | good | ? | 1 | 0 | | b78aa0a |
NRZ-2020-131-no-fp_in_irom-8c28e90/NL | local | pio | no | 5+ hours | timeouts | 35 | 0 | 1 | | Phaze-III/sensors-software@8c28e90 |
@Phaze-III thanks for the exhaustive testing. I can spend some time on making sure that the builds become more reproducible, however that will take some time to upstream.
There is a different measurement that might be more telling: the max_micros one.
Now, with the B2 build, is there anything in the sensors list that, when disabled, avoids the issue? For example, disabling sds011 in the config?
Also, could you please give the 4 test-build firmwares from me a try in your mesh environment? I think that would also be very helpful.
@holgerbohni just to be sure I understand this correctly, the sds011rework firmware works after you reinstall it but the -B2 build does not?
@dirkmueller yes, that’s correct. B2 produces a lot of sds011-errors, fewer wifi-errors and after approx 20 hours it reboots with reason „hardware watchdog“. I‘ve never seen this before. „sds011rework“ had no errors at all.
@dirkmueller : I've updated the results for the problematic Fritz!Box Mesh sensor in the table below. The 4 firmwares were already tested but not included in the table (see https://github.com/opendata-stuttgart/sensors-software/issues/814#issuecomment-714561555), fixed that.
I also asked the owner to do a test of the OTA version of B2 with all sensors disabled, debug level set and saved to 0 and only one API (Madavi.de) enabled. That didn't improve things, still a non-responsive sensor with only Wifi-errors. At the end of the test period the owner could only get to the UI after a lot of F5 tries.
There's also another additional test the owner did overnight: flashing a saved copy of the Oct 14 2020 online build of NRZ-2020-130-B11 . That one ran for at least 12 hours without errors!
My conclusion at the moment, without trying to pinpoint a root cause, is that there is some weird interaction between even the slightest code change (except for perhaps string constants like version and language strings) and the build process. So the minimal code difference (just one line and whitespace) between a stable B11 (Oct 14 2020 build) and NRZ-2020-131 resulted in a binary that doesn't work on 'problematic' sensors.
My suggestion would then be that a point patch on NRZ-2020-131 to get the same code as 130-B11 would give you working firmware in the stable channel for the problematic sensors. The owner of the problematic Fritz!box Mesh sensor is now running a build with the patch below, until now very stable.
That doesn't help in finding the cause but might help in getting 'problematic' sensors back online.
diff --git a/airrohr-firmware/airrohr-firmware.ino b/airrohr-firmware/airrohr-firmware.ino
index f3061ff..07f9fa5 100644
--- a/airrohr-firmware/airrohr-firmware.ino
+++ b/airrohr-firmware/airrohr-firmware.ino
@@ -60,7 +60,7 @@
#include <pgmspace.h>
// increment on change
-#define SOFTWARE_VERSION_STR "NRZ-2020-131"
+#define SOFTWARE_VERSION_STR "NRZ-2020-131-P1"
String SOFTWARE_VERSION(SOFTWARE_VERSION_STR);
/*****************************************************************
@@ -3097,7 +3097,7 @@ static void fetchSensorNPM(String& s) {
/*****************************************************************
* read PPD42NS sensor values *
*****************************************************************/
-static __noinline void fetchSensorPPD(String& s) {
+static void fetchSensorPPD(String& s) {
debug_outln_verbose(FPSTR(DBG_TXT_START_READING), FPSTR(SENSORS_PPD42NS));
if (msSince(starttime) <= SAMPLETIME_MS) {
@@ -3250,7 +3250,6 @@ static void fetchSensorDNMS(String& s) {
debug_outln_info(FPSTR(DBG_TXT_SEP));
debug_outln_verbose(FPSTR(DBG_TXT_END_READING), FPSTR(SENSORS_DNMS));
}
-
/*****************************************************************
* read GPS sensor values *
*****************************************************************/
firmware.sensor.community builds | Source | IDE | FP_IN_IROM | Duration | UI | Sample rate | Wifi Errors | SDS011 Errors | Remarks | Git Hash |
---|---|---|---|---|---|---|---|---|---|---|
NRZ-2020-130-B11/DE (14 Oct 2020) | online | pio | yes | 12 hours | good | 35 | 0 | 0 | | 418c866 |
NRZ-2020-131/NL | OTA | pio | yes | 42 minutes | unusable | ? | 16 | 8 | | b78aa0a |
NRZ-2020-132-B2/NL | OTA | pio | yes | ~9 hours | unusable | 25 | 215 | 0 | Including power cycle | 5adb530 |
NRZ-2020-132-B2/NL (no sensors, 1 API, lvl=0) | OTA | pio | yes | 2 hours | unusable | 25 | 49/-75/8 | - | Including power cycle | 5adb530 |
dmllr builds | | | | | | | | | | |
builds-NRZ-2020-132-B1-wifirevert/latest_nl.bin | dmllr.de | pio | ??? | 35 minutes | unusable | ? | 13 | 5 | | |
builds-NRZ-2020-132-B1-multiwifi/latest_nl.bin | dmllr.de | pio | ??? | 3 hours | unusable | ? | - | - | | |
builds-NRZ-2020-132-B1-new-SDK/latest_nl.bin | dmllr.de | pio | ??? | 1 hour | unusable | ? | - | - | | |
builds-NRZ-2020-132-B1-oldarduino/latest_nl.bin | dmllr.de | pio | ??? | 1 hour | unusable | ? | - | - | | |
builds-NRZ-2020-132-B1-sds011rework/latest_nl.bin | dmllr.de | pio | ??? | 3 days | normal/good | 35 | 4 | 6 | | |
local builds | | | | | | | | | | |
NRZ-2020-131/NL | local | Arduino | - | 11 hours | good | ? | 1 | 0 | | b78aa0a |
NRZ-2020-131-no-fp_in_irom-8c28e90/NL | local | pio | no | 5+ hours | timeouts | 35 | 0 | 1 | | Phaze-III/sensors-software@8c28e90 |
NRZ-2020-132-B2-no-fp in irom-8c9e540/NL | local | pio | no | 4 hours | good | 35 | 0 | 0 | | Phaze-III/sensors-software@8c9e540 |
@Phaze-III Can you check the firmware versions installed on your Fritz!Box mesh and which WPA encryption is active? There were updates for most AVM devices in October which activated WPA3 support.
@ricki-z All tests were done with:
Fritz!Box 7583 running FritzOS 7.15
Fritz!WLAN 1750E repeater running FritzOS 7.20
On the Fritz!box only WPA2(CCMP) is enabled (no WPA, no WPA3 available in the settings)
As of 17:45 today the Fritz!Box 7583 went from 7.15 to 7.21, WPA3 available in the settings but disabled.
Is WPA3 also disabled on the repeater? And which AP should the sensors connect to?
In the mesh the repeater is just a clone of the 'mesh master', and all settings from the master (SSID, channel, encryption etc.) are identical and not configurable on the repeater. The sensor is located in a shed outside, approx. 8 meters from the mesh master AP. The repeater is located in the attic with a few concrete walls and floors between repeater and sensor, so it is practically out of range. The sensor is therefore always connecting to the mesh master. Signal strength is stable, usually between -75 and -70 dBm (50% to 60%).
The owner of the 'problematic' Fritz!box Mesh sensor has been running the 'point patched' version of NRZ-2020-131 now for more than 3 days without any problems. Only 4 wifi-errors in three days, two of them while the Fritz!Box AP was being upgraded. WebUI always very responsive. Screenshot below.
Note that that build was made with a clean checkout of the master branch with only the point patch applied and using an unmodified platformio 5.0.1 build environment (same results with 5.0.2 BTW).
Using that environment I get exactly the same binaries as those on firmware.sensor.community. Size and MD5 checksum are identical when I set the system date to the date of the binary on firmware.sensor.community before building.
@dirkmueller
> There is a different measurement that might be more telling: the max_micros one.

I see the occasional large spikes there on my sensor but I can't yet correlate them to specific problems.
Would it be possible to put a graph of max_micros on the Madavi/Sensor.community api-rrd grafana dashboard?
I can check the max_micros on my own sensor in my local InfluxDB, but the owner of the Fritz!box sensor has no local API, only the Madavi/Sensor.Community ones.
The max_micro/min_micro values are now shown on the page with the wifi signal quality at api-rrd.madavi.de.
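For readers unfamiliar with these two figures: the diff excerpt earlier in this thread shows loop() computing the time since its previous pass (diff_micro = act_micro - last_micro) and folding it into min_micro and max_micro. A minimal stand-alone sketch of that bookkeeping is shown below. The struct and its has_last flag are illustrative assumptions and not the firmware's code, which uses an UPDATE_MIN_MAX macro and a last_micro != 0 check.

```cpp
#include <cassert>
#include <cstdint>

// Tracks the shortest and longest gap between successive loop() passes.
struct LoopTiming {
    uint32_t last_micro = 0;
    uint32_t min_micro = UINT32_MAX;
    uint32_t max_micro = 0;
    bool has_last = false;  // assumption: explicit flag instead of a zero check

    // Call once per loop() pass with the current microsecond counter.
    void on_loop_pass(uint32_t act_micro) {
        if (has_last) {
            // Unsigned subtraction stays correct when the counter wraps.
            const uint32_t diff_micro = act_micro - last_micro;
            if (diff_micro < min_micro) { min_micro = diff_micro; }
            if (diff_micro > max_micro) { max_micro = diff_micro; }
            assert(min_micro <= max_micro);
        }
        last_micro = act_micro;
        has_last = true;
    }
};
```

A large max_micro then means that at least one pass through loop() took that long, which is the kind of spike discussed above.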
Update: the OTA version of NRZ-2020-132-B3 has been running very stable for a few days now on my sensor with a sample rate of ~35K, no wifi-errors and no max_micro spikes.
I've also asked the owner of the 'problematic' Fritz!box Mesh sensor to do the OTA upgrade to NRZ-2020-132-B3 and he also reports a stable sensor, no gaps in the measurements, responsive UI.
So this particular codebase and build appears to produce a stable firmware-binary that might give good results in other difficult environments.
On a side note, it looks like the sensor works when the Fritz!box Mesh has WPA3 enabled. We did a short test with a stable locally patched version of NRZ-2020-132 and WPA2+WPA3 enabled and the sensor still worked, also after a reset (connecting with WPA2). So given a stable version of the firmware enabling both WPA2 and WPA3 on the network should work.
Status after 2 days on the Fritz!box Mesh sensor is still looking good. A few Wifi errors which is to be expected given the rather low signal quality but overall very stable.
Same issue here. I just tried 25 times to set up a new device. The device does not connect to the router. After a reboot the device needs to be configured again and again and again... really frustrating. It is my first air sensor. Is there any chance to get an older firmware? And is it possible to set up a static IP?
I wanted to set up some new sensors yesterday and saw that, after configuring my SSID/password via AP mode, the Airrohr always rebooted into AP mode and did not connect to my WiFi.
Looking at another "older" sensor, I saw that a new firmware had been published and that I was no longer able to connect to 1/2 of my Airrohr ESPs. The one I cannot connect to is still sending data, but since the update less frequently than before, and it is not accessible via HTTP.