nettigo / namf

Nettigo Air Monitor Firmware
GNU General Public License v3.0

SDS011 reply checksum failed and freezing #69

Open ggocsei opened 9 months ago

ggocsei commented 9 months ago

After a year of flawless operation, I've started to see "SDS011 reply checksum failed" in the log. I don't know if there is any correlation, but lately the device has started to freeze after one or two days. After a power cycle it works as usual (still with the checksum failures) and then freezes again within one or two days. There are no other errors in the log. The "Send to API Sensor Community" and "Send diagnostic data to Nettigo" settings are enabled. "Auto update firmware, using channel" is also enabled, set to the stable channel. How can I make it really stable again?

danielskowronski commented 7 months ago

Same here. However, I have been getting this bug randomly over the last couple of years - I have a v2 sensor from the times before the NAMF fork existed. A few days ago I tried NAMF-2020-45, after using NRZ-2020-133 since it was released, and it hit the same issue. The power supply is not at fault here: the device can draw 300-400mA and not much more, because it doesn't negotiate any higher-current charging protocol.


From the only log I have on the current stable firmware, it seems that after a crash the device does not re-initialize the SDS011, and it keeps erroring out with "SDS011 reply checksum failed" until you power it off and on again. However, I couldn't find the proper timing for that (I'm probably way too impatient with some capacitors, or the code is buggy), and the only thing that worked for me was to enter AP mode and re-enter the Wi-Fi password, after which it just started to work.

With the device connected to a PC to gather debug data, the log shows that it restarts internally but claims it is unable to get an IP address, so it fails to do anything. On the other hand, the Wi-Fi router claims the device connects at the physical layer and has been issued an IP address from DHCP.

Here's that log; unfortunately it was captured at level=3, so there is not much detail:

2023-12-04 10:29:27: SDS011: end of cycle
2023-12-04 10:29:33: Creating data string:
2023-12-04 10:29:33: WLAN signal strength: -90 dBm
2023-12-04 10:29:33: ----
2023-12-04 10:29:46: Succeeded - 
2023-12-04 10:29:47: Succeeded - 
2023-12-04 10:29:47: ## Sending to madavi.de: 
2023-12-04 10:29:47: Succeeded - 
{d.lśž|.„lŕ|....Ś.lě.c|‡‚.ě.“s“c„.c„űooźlggśăä.c.x„Źdrlslpűoŕ....ƒ.l.Śś...c.nă|.ädŹ.‡cŚűggď.lŚŽl .˜..gg.d`...o{Ź››o..b.l`..s““n..c.d`.śc..‡..’ls“`.üƒoś.{d.dśź|.„lŕ|....Ś.lě.c|‡ƒ.ä.“{›c„.cŚűogźlggśăä.b.p„Źdslslpűoŕ....‚.l.Śś...b.oă|.äl‡.‡cŚňggď.l„Źd`.˜..nn.d`...o{Ź›Űg..b.l`..s““n..c.l`.ś#..‡..“ls“`.üƒgś.1970-01-01 00:00:00: 
NAMF ver: NAMF-2020-45/EN
1970-01-01 00:00:00: Chip ID: 1234567
1970-01-01 00:00:00: SPIFFS (kB): 888334
1970-01-01 00:00:00: Free sketch space (kB): 1416
1970-01-01 00:00:00: CPU freq (MHz): 160
1970-01-01 00:00:00: Set defaults
1970-01-01 00:00:00: mounting FS: OK
1970-01-01 00:00:00: SDS011: start
1970-01-01 00:00:00:  [HECA ERROR] Cannot start periodic mode
1970-01-01 00:00:00: Trying BME280 sensor on 
1970-01-01 00:00:00: 76 ... found
1970-01-01 00:00:01: BME280: start
1970-01-01 00:00:01: Config parsed
1970-01-01 00:00:01: PowerOnTest
1970-01-01 00:00:01: Starting Webserver... (IP unset)
1970-01-01 00:00:01: output debug text to displays...
1970-01-01 00:00:01: SSID: 'XXXXXXXXXX'
1970-01-01 00:00:01: 6
1970-01-01 00:00:01: ........................................
1970-01-01 00:00:22: output debug text to displays...
1970-01-01 00:00:22: Failed to connect to WiFi. Trying to connect to fallback WiFi
1970-01-01 00:00:22: XXXXXXXXXX_IoT
1970-01-01 00:00:22: ..........SDS011 reply checksum failed 
1970-01-01 00:00:27: ..............................output debug text to displays...
1970-01-01 00:00:42: Starting WiFiManager
1970-01-01 00:00:42: AP ID: NAM-1234567
1970-01-01 00:00:42: Password: 
1970-01-01 00:00:42: scan for wifi networks...
1970-01-01 00:00:44: Starting AP with default password
1970-01-01 00:01:17: Connecting to XXXXXXXXXX
1970-01-01 00:01:17: ---- Result Webconfig ----
1970-01-01 00:01:17: WLANSSID: XXXXXXXXXX
1970-01-01 00:01:17: ----
Reading ...
1970-01-01 00:01:17: SDS: 0
1970-01-01 00:01:17: PMS: 0
1970-01-01 00:01:17: DHT: 0
1970-01-01 00:01:17: DS18B20: 0
1970-01-01 00:01:17: ----
Send to ...
1970-01-01 00:01:17: Dusti: 1
1970-01-01 00:01:17: Madavi: 1
1970-01-01 00:01:17: CSV: 0
1970-01-01 00:01:17: ----
1970-01-01 00:01:17: Autoupdate: 0
1970-01-01 00:01:17: Display: 0
1970-01-01 00:01:17: LCD 1602: 0
1970-01-01 00:01:17: Debug: 3
1970-01-01 00:01:17: ....SDS011 reply checksum failed 
1970-01-01 00:01:20: ................
1970-01-01 00:01:27: WiFi connected
IP address: (IP unset)
1970-01-01 00:01:27: Setting time using SNTP
1970-01-01 00:01:27: Thu Jan  1 00:01:27 1970

1970-01-01 00:01:27: 0.europe.pool.ntp.org
1970-01-01 00:01:28: ....................
router/gateway:
1970-01-01 00:01:38: ....................NTP time not received
1970-01-01 00:01:47: Send to :
1970-01-01 00:01:47: luftdaten.info
1970-01-01 00:01:47: Madavi.de
1970-01-01 00:01:47: 
1970-01-01 00:01:47: 
mDNS failure!
1970-01-01 00:04:07: SDS011: end of cycle
1970-01-01 00:04:12: Creating data string:
1970-01-01 00:04:12: WLAN signal strength: -88 dBm
1970-01-01 00:04:12: ----
1970-01-01 00:04:12: Connection lost, reconnecting ....................
1970-01-01 00:04:22: Still no WiFi, turn off...
1970-01-01 00:04:24: WiFi, reconnecting
1970-01-01 00:04:25: ....................Not succeeded. HTTP status code: -1
1970-01-01 00:04:35: Not succeeded. HTTP status code: -1
1970-01-01 00:04:35: ## Sending to madavi.de: 
1970-01-01 00:04:35: Not succeeded. HTTP status code: -1
1970-01-01 00:04:35: Time for sending data (ms): 0
1970-01-01 00:04:45: SDS011 reply checksum failed 
1970-01-01 00:04:50: SDS011 reply checksum failed 
1970-01-01 00:05:20: SDS011 reply checksum failed 
1970-01-01 00:05:29: SDS011 reply checksum failed 
1970-01-01 00:06:12: SDS011 reply checksum failed 
1970-01-01 00:06:30: SDS011 reply checksum failed 
1970-01-01 00:06:34: SDS011 reply checksum failed 
1970-01-01 00:06:55: SDS011: end of cycle
1970-01-01 00:07:00: Creating data string:
1970-01-01 00:07:00: WLAN signal strength: -86 dBm
1970-01-01 00:07:00: ----
1970-01-01 00:07:00: Connection lost, reconnecting ....................
1970-01-01 00:07:10: Still no WiFi, turn off...
1970-01-01 00:07:12: WiFi, reconnecting
1970-01-01 00:07:13: ....................Not succeeded. HTTP status code: -1
1970-01-01 00:07:23: Not succeeded. HTTP status code: -1
1970-01-01 00:07:23: ## Sending to madavi.de: 
1970-01-01 00:07:23: Not succeeded. HTTP status code: -1
1970-01-01 00:07:23: Time for sending data (ms): 0
1970-01-01 00:07:58: SDS011 reply checksum failed 
1970-01-01 00:08:54: SDS011 reply checksum failed 
1970-01-01 00:09:27: SDS011 reply checksum failed 
1970-01-01 00:09:43: SDS011: end of cycle
1970-01-01 00:09:48: Creating data string:
1970-01-01 00:09:48: WLAN signal strength: -90 dBm
1970-01-01 00:09:48: ----
1970-01-01 00:09:48: Connection lost, reconnecting ....................
1970-01-01 00:09:58: Still no WiFi, turn off...
1970-01-01 00:10:00: WiFi, reconnecting
1970-01-01 00:10:01: ....................Not succeeded. HTTP status code: -1
1970-01-01 00:10:10: Not succeeded. HTTP status code: -1
1970-01-01 00:10:10: ## Sending to madavi.de: 
1970-01-01 00:10:10: Not succeeded. HTTP status code: -1
1970-01-01 00:10:10: Time for sending data (ms): 0
1970-01-01 00:10:34: SDS011 reply checksum failed 
1970-01-01 00:10:47: SDS011 reply checksum failed 
1970-01-01 00:11:16: SDS011 reply checksum failed 
1970-01-01 00:11:26: SDS011 reply checksum failed 
1970-01-01 00:11:42: SDS011 reply checksum failed 
1970-01-01 00:12:01: SDS011 reply checksum failed 
1970-01-01 00:12:06: SDS011 reply checksum failed 
1970-01-01 00:12:30: SDS011: end of cycle
1970-01-01 00:12:35: Creating data string:
1970-01-01 00:12:35: WLAN signal strength: -90 dBm
1970-01-01 00:12:35: ----
1970-01-01 00:12:35: Connection lost, reconnecting ....................
1970-01-01 00:12:45: Still no WiFi, turn off...
1970-01-01 00:12:47: WiFi, reconnecting
1970-01-01 00:12:48: ....................Not succeeded. HTTP status code: -1
1970-01-01 00:12:57: Not succeeded. HTTP status code: -1
1970-01-01 00:12:58: ## Sending to madavi.de: 
1970-01-01 00:12:58: Not succeeded. HTTP status code: -1
1970-01-01 00:12:58: Time for sending data (ms): 0
1970-01-01 00:14:02: SDS011 reply checksum failed 
1970-01-01 00:15:17: SDS011: end of cycle
1970-01-01 00:15:23: Creating data string:
1970-01-01 00:15:23: WLAN signal strength: -89 dBm
1970-01-01 00:15:23: ----
1970-01-01 00:15:23: Connection lost, reconnecting ....................
1970-01-01 00:15:33: Still no WiFi, turn off...
1970-01-01 00:15:35: WiFi, reconnecting
1970-01-01 00:15:35: ....................Not succeeded. HTTP status code: -1
1970-01-01 00:15:45: Not succeeded. HTTP status code: -1
1970-01-01 00:15:45: ## Sending to madavi.de: 
1970-01-01 00:15:45: Not succeeded. HTTP status code: -1
1970-01-01 00:15:45: Time for sending data (ms): 0
1970-01-01 00:16:24: SDS011 reply checksum failed 
1970-01-01 00:17:09: SDS011 reply checksum failed 
1970-01-01 00:18:05: SDS011: end of cycle
1970-01-01 00:18:10: Creating data string:
1970-01-01 00:18:10: WLAN signal strength: -91 dBm
1970-01-01 00:18:10: ----
1970-01-01 00:18:10: Connection lost, reconnecting ....

Currently, I'm testing the alpha firmware NAMF-2020-46rc2 with a manual override for the network type - WiFi Phy Mode (1=B / 2=G / 3=N) set to 2, which is funny on an 802.11ax router, but seems to make it stable so far.

https://github.com/nettigo/namf/commit/b5df55dc25a9cae38fc0ee2fd0c79dbc104a1319 may solve the issue if it is just a buffer problem. I have recorded at least two similar crashes, but the sensor rebooted with an IP address, so it looks promising.

Here's an example of a level=5 log from the alpha firmware where it crashes (although not right after the SDS issue) and reboots with Wi-Fi working. It was alive for roughly 45 minutes.

-----> Received 174 Bytes:
2023-12-04 13:18:31 2023-12-04 13:18:31: ****************** Upload data to APIs*****************************
2023-12-04 13:18:31: Call sensorDHT22
2023-12-04 13:18:31: Start reading DHT11/22

-----> Received 288 Bytes:
2023-12-04 13:18:31 2023-12-04 13:18:31: Temperature: 14.00°C
2023-12-04 13:18:31: Humidity: 39.20%
2023-12-04 13:18:31: ----
2023-12-04 13:18:31: End reading DHT11/22
2023-12-04 13:18:31: Creating data string:
2023-12-04 13:18:31: WLAN signal strength: -88 dBm
2023-12-04 13:18:31: ----
2023-12-04 1

-----> Received 396 Bytes:
2023-12-04 13:18:32 3:18:31: ## Sending to Sensor Community (DHT): 
2023-12-04 13:18:31: Start connecting to api.sensor.community
2023-12-04 13:18:31: api.sensor.community
2023-12-04 13:18:31: 443
2023-12-04 13:18:31: /v1/push-sensor-data/
2023-12-04 13:18:31: {"software_version": "NAMF-2020-46rc2", "sensordatavalues":[{"value_type":"temperature","value":"14.00"},{"value_type":"humidity","value":"39.20"}]}

-----> Received 692 Bytes:
2023-12-04 13:18:36 2023-12-04 13:18:35: Succeeded - 
2023-12-04 13:18:35: Request result: 201
2023-12-04 13:18:35: Details:
2023-12-04 13:18:35: {"sensor":21085,"timestamp":"2023-12-04T12:18:35.917091","sensordatavalues":[{"sensordata":18233008274},{"sensordata":18233008274}]}
2023-12-04 13:18:35: End connecting to api.sensor.community
2023-12-04 13:18:35: No data sent...
2023-12-04 13:18:35: Start connecting to api.sensor.community
2023-12-04 13:18:35: api.sensor.community
2023-12-04 13:18:35: 443
2023-12-04 13:18:35: /v1/push-sensor-data/
2023-12-04 13:18:35: {"software_version": "NAMF-2020-46rc2", "sensordatavalues":[{"value_type":"P1","value":"6.40"},{"value_type":"P2","value":"3.50"}]}

-----> Received 754 Bytes:
2023-12-04 13:18:37 2023-12-04 13:18:37: Succeeded - 
2023-12-04 13:18:37: Request result: 201
2023-12-04 13:18:37: Details:
2023-12-04 13:18:37: {"sensor":21084,"timestamp":"2023-12-04T12:18:37.654396","sensordatavalues":[{"sensordata":18233008735},{"sensordata":18233008735}]}
2023-12-04 13:18:37: End connecting to api.sensor.community
2023-12-04 13:18:37: No data sent...
2023-12-04 13:18:37: Start connecting to api.sensor.community
2023-12-04 13:18:37: api.sensor.community
2023-12-04 13:18:37: 443
2023-12-04 13:18:37: /v1/push-sensor-data/
2023-12-04 13:18:37: {"software_version": "NAMF-2020-46rc2", "sensordatavalues":[{"value_type":"temperature","value":"1.24"},{"value_type":"pressure","value":"99687.30"},{"value_type":"humidity","value":"100.00"}]}

-----> Received 1005 Bytes:
2023-12-04 13:18:40 2023-12-04 13:18:39: Succeeded - 
2023-12-04 13:18:39: Request result: 201
2023-12-04 13:18:40: Details:
2023-12-04 13:18:40: {"sensor":21086,"timestamp":"2023-12-04T12:18:39.237015","sensordatavalues":[{"sensordata":18233008789},{"sensordata":18233008789},{"sensordata":18233008789}]}
2023-12-04 13:18:40: End connecting to api.sensor.community
2023-12-04 13:18:40: ## Sending to madavi.de: 
2023-12-04 13:18:40: Start connecting to api-rrd.madavi.de
2023-12-04 13:18:40: api-rrd.madavi.de
2023-12-04 13:18:40: 443
2023-12-04 13:18:40: /data.php
2023-12-04 13:18:40: {"software_version": "NAMF-2020-46rc2", "sensordatavalues":[{"value_type":"temperature","value":"14.00"},{"value_type":"humidity","value":"39.20"},{"value_type":"SDS_P1","value":"6.40"},{"value_type":"SDS_P2","value":"3.50"},{"value_type":"BME280_temperature","value":"1.24"},{"value_type":"BME280_pressure","value":"99687.30"},{"value_type":"BME280_humidity","value":"100.00"},{"value_type":"samples","value":"109613"},{"value

-----> Received 119 Bytes:
2023-12-04 13:18:40 _type":"min_micro","value":"280"},{"value_type":"max_micro","value":"125157"},{"value_type":"signal","value":"-88"}]}

-----> Received 57 Bytes:
2023-12-04 13:18:42 rl dœŸ| Œl༃Œl섢|Žƒ䛛{›cŒcŒ򧧞l'nœ㬄cp䇬slsd

-----> Received 160 Bytes:
2023-12-04 13:18:42 pûoЃƒlŒœcn㼃䄏cŒûggŒŽl ˜nnd`o{›ۧclœdp󮠐{Œœœœ€Œcg㼃c„ûool`˜ggd`o{››o#Œ’`s““'c„›`œllœ㘌l`üƒ'œl

-----> Received 57 Bytes:
2023-12-04 13:18:42 {d dœŸ| „l༃Œl쌣|‡ƒ䓓s›c„cŒûogŸlggœ㤌bp„dslsl

-----> Received 160 Bytes:
2023-12-04 13:18:42 x󮠐ƒ$Œœcg㼃䄏c„ûooŒ‡l`˜ggd`n{››oclœdx󧠘s„œœœ€Œcg㼂Žc„󮮧 $`ggl`'sŽ““ncě`r’’gcŒ“`œddœ␌l`ü‚gœd

-----> Received 34 Bytes:
2023-12-04 13:18:42 1970-01-01 00:00:00: mounting FS: 

-----> Received 71 Bytes:
2023-12-04 13:18:42 OK
1970-01-01 00:00:00: 
FACTORY RESET start - press reset two times

-----> Received 259 Bytes:
2023-12-04 13:18:42 1970-01-01 00:00:00: 
NAMF ver: NAMF-2020-46rc2/EN
1970-01-01 00:00:00: Chip ID: 1234567
1970-01-01 00:00:00: SPIFFS (kB): 1907
1970-01-01 00:00:00: Free sketch space (kB): 1412
1970-01-01 00:00:00: CPU freq (MHz): 160
1970-01-01 00:00:00: Set defaults

-----> Received 106 Bytes:
2023-12-04 13:18:43 1970-01-01 00:00:00: SDS011: start
1970-01-01 00:00:00: Trying BME280 sensor on 
1970-01-01 00:00:00: 76

-----> Received 12 Bytes:
2023-12-04 13:18:43  ... found

-----> Received 306 Bytes:
2023-12-04 13:18:43 1970-01-01 00:00:01: BME280: start
1970-01-01 00:00:01: Network wtchd 'process'
1970-01-01 00:00:01: Config parsed
1970-01-01 00:00:01: PowerOnTest
1970-01-01 00:00:01: Read DHT...
1970-01-01 00:00:01: output debug text to displays...
1970-01-01 00:00:01: SSID: 'XXXXXXXXXX'
1970-01-01 00:00:01: 7

-----> Received 22 Bytes:
2023-12-04 13:18:44 1970-01-01 00:00:01: .

-----> Received 1 Byte:
2023-12-04 13:18:44 .
2023-12-04 13:18:45 .
2023-12-04 13:18:45 .
2023-12-04 13:18:46 .
2023-12-04 13:18:47 .

-----> Received 206 Bytes:
2023-12-04 13:18:47 .
1970-01-01 00:00:05: WiFi connected
IP address: 192.168.XXX.XXX
1970-01-01 00:00:05: Setting time using SNTP
1970-01-01 00:00:05: Thu Jan  1 00:00:05 1970

1970-01-01 00:00:05: 0.europe.pool.ntp.org

-----> Received 89 Bytes:
2023-12-04 13:18:48 2023-12-04 13:18:48: .Mon Dec  4 13:18:48 2023

2023-12-04 13:18:48: NTP time received

-----> Received 184 Bytes:
2023-12-04 13:18:49 2023-12-04 13:18:48: Starting Webserver... 192.168.XXX.XXX
2023-12-04 13:18:48: Send to :
2023-12-04 13:18:48: luftdaten.info
2023-12-04 13:18:48: Madavi.de
2023-12-04 13:18:48: 

-----> Received 50 Bytes:
2023-12-04 13:18:49 2023-12-04 13:18:49: Clear factory reset markers

-----> Received 52 Bytes:
2023-12-04 13:18:49 2023-12-04 13:18:49: SDS011 reply checksum failed 

The most recent tweak I'm testing is related to comments in https://github.com/nettigo/namf/issues/59 - I've set measurement time to 10s. Additionally, I've enabled "Hardware SDS restarter".

It didn't prevent the crash, but the device is still able to connect.

-----> Received 653 Bytes:
2023-12-04 13:33:56 2023-12-04 13:33:56: Succeeded - 
2023-12-04 13:33:56: Request result: 201
2023-12-04 13:33:56: Details:
2023-12-04 13:33:56: {"sensor":21086,"timestamp":"2023-12-04T12:33:56.742210","sensordatavalues":[{"sensordata":18233155625},{"sensordata":18233155625},{"sensordata":18233155625}]}
2023-12-04 13:33:56: End connecting to api.sensor.community
2023-12-04 13:33:56: ## Sending to madavi.de: 
2023-12-04 13:33:56: Start connecting to api-rrd.madavi.de
2023-12-04 13:33:56: api-rrd.madavi.de
2023-12-04 13:33:56: 443
2023-12-04 13:33:56: /data.php
2023-12-04 13:33:56: {"software_version": "NAMF-2020-46rc2", "sensordatavalues":[{"value_type":"

-----> Received 470 Bytes:
2023-12-04 13:33:56 temperature","value":"14.00"},{"value_type":"humidity","value":"37.70"},{"value_type":"SDS_P1","value":"4.00"},{"value_type":"SDS_P2","value":"2.90"},{"value_type":"BME280_temperature","value":"1.20"},{"value_type":"BME280_pressure","value":"99671.88"},{"value_type":"BME280_humidity","value":"100.00"},{"value_type":"samples","value":"108693"},{"value_type":"min_micro","value":"282"},{"value_type":"max_micro","value":"36672"},{"value_type":"signal","value":"-89"}]}

-----> Received 57 Bytes:
2023-12-04 13:33:58 sl lœŸ| Œd༃„d䌣|ƒ쒒s“cŒb„󯯟dooœ⤌cx쎬{d;l

-----> Received 160 Bytes:
2023-12-04 13:33:59 x󮠐ƒdŒœcg㼃섏c„ûooŒ‡l`˜ggd`n{››oclœdx󧠘s„œœœ€Œcg㼂Žc„󮮧 d`ggl`gsŽ““ncě`;’’gcŒ“`œddœ␌l`ü‚g܏d

-----> Received 57 Bytes:
2023-12-04 13:33:59 {d dœŸ| „l༃Œl쌣|‡ƒ䓓{›c„cŒûggŸlggœ㤄bp䏤slsl

-----> Received 160 Bytes:
2023-12-04 13:33:59 p󮠐ƒlŒœcg㼃䄏c„ûooŒ‡l`˜ggd`o{››oblœdx󧠐{„œœœ€Œcg㼃Žb„󮯧 l`ggl`nsŽ“›ocŒ›`r’’gc„“`œ$lœ␌l`ü‚gœd

-----> Received 34 Bytes:
2023-12-04 13:33:59 1970-01-01 00:00:00: mounting FS: 

-----> Received 71 Bytes:
2023-12-04 13:33:59 OK
1970-01-01 00:00:00: 
FACTORY RESET start - press reset two times

-----> Received 259 Bytes:
2023-12-04 13:33:59 1970-01-01 00:00:00: 
NAMF ver: NAMF-2020-46rc2/EN
1970-01-01 00:00:00: Chip ID: 1234567
1970-01-01 00:00:00: SPIFFS (kB): 1907
1970-01-01 00:00:00: Free sketch space (kB): 1412
1970-01-01 00:00:00: CPU freq (MHz): 160
1970-01-01 00:00:00: Set defaults

-----> Received 106 Bytes:
2023-12-04 13:34:00 1970-01-01 00:00:00: SDS011: start
1970-01-01 00:00:00: Trying BME280 sensor on 
1970-01-01 00:00:00: 76

-----> Received 12 Bytes:
2023-12-04 13:34:00  ... found

-----> Received 544 Bytes:
2023-12-04 13:34:00 1970-01-01 00:00:01: BME280: start
1970-01-01 00:00:01: Network wtchd 'process'
1970-01-01 00:00:01: Config parsed
1970-01-01 00:00:01: {"current_lang":"EN","SOFTWARE_VERSION":"NAMF-2020-46rc2","wlanssid":"XXXXXXXXXX","wlanpwd":"XXXXXXXXXXXX","fbpwd":"XXXXXXXXXXXX","fs_pwd":"","www_username":"admin","www_password":"admin","fs_ssid":"NAM-1234567","fbssid":"XXXXXXXXXX","www_basicauth_enabled":"false","dht_read":"true","pms_read":"false","ds18b20_read":"false","gps_read":"false","send2dusti":"true","ssl_dusti":"true","send2madavi":"true",

-----> Received 1267 Bytes:
2023-12-04 13:34:00 "ssl_madavi":"true","send2sensemap":"false","send2fsapp":"false","send2lora":"false","send2csv":"false","auto_update":"false","update_channel":"0","has_display":"false","has_lcd1602":"false","has_lcd1602_27":"false","has_lcd2004_27":"false","has_lcd2004_3f":"false","show_wifi_info":"false","sh_dev_inf":"false","has_ledbar_32":"false","debug":"4","send_diag":"true","sending_intervall_ms":"145000","time_for_wifi_config":"30000","outputPower":"20.50","phyMode":"2","senseboxid":"","send2custom":"false","host_custom":"192.168.234.1","url_custom":"/data.php","port_custom":"80","user_custom":"","pwd_custom":"","send2aqi":"false","token_AQI":"","host_custom":"192.168.234.1","send2influx":"false","host_influx":"influx.server","url_influx":"/write?db=luftdaten","UUID":"c30fe713-adbc-4b5a-9f3b-3a3e9a34f2ba","port_influx":"8086","user_influx":"","pwd_influx":"","sensors":{"SDS011":{"e":1,"r":"10000","dbg":"1"},"HECA":{"e":0},"BME280":{"e":1,"d":1},"SHT3x":{"e":0,"d":0},"BMPx80":{"e":0},"SPS30":{"e":0,"refresh":"10"},"NTW_WTD":{"e":1,"ip":"192.168.XXX.X"},"MHZ14A":{"e":0}}}
1970-01-01 00:00:01: PowerOnTest
1970-01-01 00:00:01: Read DHT...
1970-01-01 00:00:01: output debug text to displays...
1970-01-01 00:00:01: SSID: 'XXXXXXXXXX'
1970-01-01 00:00:01: 7

-----> Received 22 Bytes:
2023-12-04 13:34:01 1970-01-01 00:00:01: .

-----> Received 1 Byte:
2023-12-04 13:34:01 .
2023-12-04 13:34:02 .
2023-12-04 13:34:02 .
2023-12-04 13:34:03 .
2023-12-04 13:34:04 .
2023-12-04 13:34:04 .
2023-12-04 13:34:05 .

-----> Received 206 Bytes:
2023-12-04 13:34:06 .
1970-01-01 00:00:06: WiFi connected
IP address: 192.168.XXX.XXX
1970-01-01 00:00:06: Setting time using SNTP
1970-01-01 00:00:06: Thu Jan  1 00:00:06 1970

1970-01-01 00:00:06: 0.europe.pool.ntp.org

-----> Received 22 Bytes:
2023-12-04 13:34:06 1970-01-01 01:00:07: .

-----> Received 1 Byte:
2023-12-04 13:34:07 .
2023-12-04 13:34:07 .
2023-12-04 13:34:08 .
2023-12-04 13:34:08 .
2023-12-04 13:34:08 .
2023-12-04 13:34:09 .
2023-12-04 13:34:09 .
2023-12-04 13:34:10 .
2023-12-04 13:34:10 .
2023-12-04 13:34:11 .
2023-12-04 13:34:11 .
2023-12-04 13:34:12 .
2023-12-04 13:34:12 .
2023-12-04 13:34:13 .
2023-12-04 13:34:13 .
2023-12-04 13:34:14 .
2023-12-04 13:34:15 .
2023-12-04 13:34:15 .

-----> Received 19 Bytes:
2023-12-04 13:34:16 .
router/gateway:

-----> Received 22 Bytes:
2023-12-04 13:34:16 1970-01-01 00:00:17: .

-----> Received 1 Byte:
2023-12-04 13:34:16 .
2023-12-04 13:34:17 .
2023-12-04 13:34:17 .
2023-12-04 13:34:18 .
2023-12-04 13:34:18 .
2023-12-04 13:34:19 .
2023-12-04 13:34:19 .
2023-12-04 13:34:20 .
2023-12-04 13:34:20 .
2023-12-04 13:34:21 .
2023-12-04 13:34:21 .
2023-12-04 13:34:22 .
2023-12-04 13:34:22 .
2023-12-04 13:34:23 .
2023-12-04 13:34:23 .
2023-12-04 13:34:24 .
2023-12-04 13:34:25 .
2023-12-04 13:34:25 .

-----> Received 24 Bytes:
2023-12-04 13:34:26 .NTP time not received

-----> Received 160 Bytes:
2023-12-04 13:34:26 1970-01-01 00:00:27: Starting Webserver... 192.168.XXX.XXX
1970-01-01 00:00:27: Send to :
1970-01-01 00:00:27: luftdaten.info
1970-01-01 00:00:27: Madavi.de

-----> Received 74 Bytes:
2023-12-04 13:34:26 
1970-01-01 00:00:27: 
1970-01-01 00:00:27: Clear factory reset markers

-----> Received 52 Bytes:
2023-12-04 13:34:27 1970-01-01 00:00:27: SDS011 reply checksum failed 

-----> Received 174 Bytes:
2023-12-04 13:35:00 1970-01-01 00:01:01: Network wtchd 'process'
1970-01-01 00:01:01: PIIIING processor
1970-01-01 00:01:01: PING connectivity check:
1970-01-01 00:01:01: Next ping state: 1

-----> Received 181 Bytes:
2023-12-04 13:35:12 1970-01-01 00:01:13: Network wtchd 'process'
1970-01-01 00:01:13: PIIIING processor
1970-01-01 00:01:13: PING connectivity check state:1
1970-01-01 00:01:13: Next ping state: 0

-----> Received 298 Bytes:
2023-12-04 13:35:19 1970-01-01 00:01:19: output data json...
1970-01-01 00:01:19: replace with: , "age":"4294875", "measurements":"0", "uptime":"79", "sensordatavalues"
1970-01-01 00:01:19: replaced: {"software_version": "NAMF-2020-46rc2", "age":"4294875", "measurements":"0", "uptime":"79", "sensordatavalues":[]}

-----> Received 42 Bytes:
2023-12-04 13:35:38 1970-01-01 00:01:38: output root page...

-----> Received 88 Bytes:
2023-12-04 13:35:39 1970-01-01 00:01:40: output config page ...
1970-01-01 00:01:40: output config page 1

-----> Received 52 Bytes:
2023-12-04 13:36:25 1970-01-01 00:02:26: SDS011 reply checksum failed 

2023-12-04 13:36:29 1970-01-01 00:02:30: SDS011 reply checksum failed 

2023-12-04 13:36:31 1970-01-01 00:02:32: SDS011 reply checksum failed 

2023-12-04 13:36:35 1970-01-01 00:02:36: SDS011 reply checksum failed 

2023-12-04 13:36:41 1970-01-01 00:02:42: SDS011 reply checksum failed 

-----> Received 43 Bytes:
2023-12-04 13:36:46 1970-01-01 00:02:47: SDS011: end of cycle

-----> Received 359 Bytes:
2023-12-04 13:36:51 1970-01-01 00:02:52: Temperature: 14.10°C
1970-01-01 00:02:52: Humidity: 38.00%
1970-01-01 00:02:52: ----
1970-01-01 00:02:52: Creating data string:
1970-01-01 00:02:52: WLAN signal strength: -87 dBm
1970-01-01 00:02:52: ----
1970-01-01 00:02:52: ## Sending to Sensor Community (DHT): 
1970-01-01 00:02:52: Time incorrect; Disabling CA verification.

-----> Received 35 Bytes:
2023-12-04 13:36:53 1970-01-01 00:02:53: Succeeded - 

-----> Received 65 Bytes:
2023-12-04 13:36:53 1970-01-01 00:02:53: Time incorrect; Disabling CA verification.

-----> Received 100 Bytes:
2023-12-04 13:36:54 1970-01-01 00:02:54: Succeeded - 
1970-01-01 00:02:54: Time incorrect; Disabling CA verification.

-----> Received 148 Bytes:
2023-12-04 13:36:55 1970-01-01 00:02:56: Succeeded - 
1970-01-01 00:02:56: ## Sending to madavi.de: 
1970-01-01 00:02:56: Time incorrect; Disabling CA verification.

-----> Received 35 Bytes:
2023-12-04 13:36:56 1970-01-01 00:02:56: Succeeded - 

-----> Received 57 Bytes:
2023-12-04 13:36:56 sl lœŸ| Œd༃„d䄣|ƒ쒒s“cŒb„󮯟dooœ⬌cx쇬{d{l

-----> Received 160 Bytes:
2023-12-04 13:36:56 pûoЃÌlŒœbn㼃䄏cŒ򧧯 lŒŽd`ؓnnd`o{ےgclœlp󮠐{Œœœœ€Œcg㼃c„ûool`˜ggd`o{››obŒ’`s““nc„›`œllœ㘌l`üƒnœl

-----> Received 99 Bytes:
2023-12-04 13:36:57 {d lœŸ| „l༃Œl쌣|‡ƒ쓓s“c„cĻooŸlggܣ䌣p„drlslpûg؂ƒl„œco⼃쌇˜cŒ󧮧 d„d`

-----> Received 118 Bytes:
2023-12-04 13:36:57 ool`gs‡““'cdœlpûoЃrŒܜœ€„bo㼃‡cŒ򧧯 l`ood`or‡’’gcŒ“`{››ocŒ۠œllœ㐄d`üƒoœl

-----> Received 34 Bytes:
2023-12-04 13:36:57 1970-01-01 00:00:00: mounting FS: 

-----> Received 71 Bytes:
2023-12-04 13:36:57 OK
1970-01-01 00:00:00: 
FACTORY RESET start - press reset two times

-----> Received 259 Bytes:
2023-12-04 13:36:57 1970-01-01 00:00:00: 
NAMF ver: NAMF-2020-46rc2/EN
1970-01-01 00:00:00: Chip ID: 1234567
1970-01-01 00:00:00: SPIFFS (kB): 1907
1970-01-01 00:00:00: Free sketch space (kB): 1412
1970-01-01 00:00:00: CPU freq (MHz): 160
1970-01-01 00:00:00: Set defaults

-----> Received 106 Bytes:
2023-12-04 13:36:57 1970-01-01 00:00:00: SDS011: start
1970-01-01 00:00:00: Trying BME280 sensor on 
1970-01-01 00:00:00: 76

-----> Received 12 Bytes:
2023-12-04 13:36:58  ... found

-----> Received 672 Bytes:
2023-12-04 13:36:58 1970-01-01 00:00:01: BME280: start
1970-01-01 00:00:01: Network wtchd 'process'
1970-01-01 00:00:01: Config parsed
1970-01-01 00:00:01: {"current_lang":"EN","SOFTWARE_VERSION":"NAMF-2020-46rc2","wlanssid":"XXXXXXXXXX","wlanpwd":"XXXXXXXXXXXX","fbpwd":"XXXXXXXXXXXX","fs_pwd":"","www_username":"admin","www_password":"admin","fs_ssid":"NAM-1234567","fbssid":"XXXXXXXXXX","www_basicauth_enabled":"false","dht_read":"true","pms_read":"false","ds18b20_read":"false","gps_read":"false","send2dusti":"true","ssl_dusti":"true","send2madavi":"true","ssl_madavi":"true","send2sensemap":"false","send2fsapp":"false","send2lora":"false","send2csv":"false","auto_update":"false","u

-----> Received 1139 Bytes:
2023-12-04 13:36:58 pdate_channel":"0","has_display":"false","has_lcd1602":"false","has_lcd1602_27":"false","has_lcd2004_27":"false","has_lcd2004_3f":"false","show_wifi_info":"false","sh_dev_inf":"false","has_ledbar_32":"false","debug":"4","send_diag":"true","sending_intervall_ms":"145000","time_for_wifi_config":"30000","outputPower":"20.50","phyMode":"2","senseboxid":"","send2custom":"false","host_custom":"192.168.234.1","url_custom":"/data.php","port_custom":"80","user_custom":"","pwd_custom":"","send2aqi":"false","token_AQI":"","host_custom":"192.168.234.1","send2influx":"false","host_influx":"influx.server","url_influx":"/write?db=luftdaten","UUID":"c30fe713-adbc-4b5a-9f3b-3a3e9a34f2ba","port_influx":"8086","user_influx":"","pwd_influx":"","sensors":{"SDS011":{"e":1,"r":"10000","dbg":"1"},"HECA":{"e":0},"BME280":{"e":1,"d":1},"SHT3x":{"e":0,"d":0},"BMPx80":{"e":0},"SPS30":{"e":0,"refresh":"10"},"NTW_WTD":{"e":1,"ip":"192.168.XXX.X"},"MHZ14A":{"e":0}}}
1970-01-01 00:00:01: PowerOnTest
1970-01-01 00:00:01: Read DHT...
1970-01-01 00:00:01: output debug text to displays...
1970-01-01 00:00:01: SSID: 'XXXXXXXXXX'
1970-01-01 00:00:01: 7

-----> Received 22 Bytes:
2023-12-04 13:36:59 1970-01-01 00:00:01: .

-----> Received 1 Byte:
2023-12-04 13:36:59 .
2023-12-04 13:37:00 .
2023-12-04 13:37:00 .
2023-12-04 13:37:01 .
2023-12-04 13:37:02 .

-----> Received 206 Bytes:
2023-12-04 13:37:02 .
1970-01-01 00:00:05: WiFi connected
IP address: 192.168.XXX.XXX
1970-01-01 00:00:05: Setting time using SNTP
1970-01-01 00:00:05: Thu Jan  1 00:00:05 1970

1970-01-01 00:00:05: 0.europe.pool.ntp.org

-----> Received 89 Bytes:
2023-12-04 13:37:03 2023-12-04 13:37:03: .Mon Dec  4 13:37:03 2023

2023-12-04 13:37:03: NTP time received

-----> Received 64 Bytes:
2023-12-04 13:37:03 2023-12-04 13:37:03: Starting Webserver... 192.168.XXX.XXX
2023

-----> Received 170 Bytes:
2023-12-04 13:37:04 -12-04 13:37:03: Send to :
2023-12-04 13:37:03: luftdaten.info
2023-12-04 13:37:03: Madavi.de
2023-12-04 13:37:03: 
2023-12-04 13:37:04: Clear factory reset markers

-----> Received 52 Bytes:
2023-12-04 13:37:04 2023-12-04 13:37:04: SDS011 reply checksum failed 

To sum it up:

danielskowronski commented 7 months ago

It seems like alpha firmware solves freezing of Wi-Fi after crash.

Crashes are 99% hardware related - I discovered a terrible voltage drop on my USB cable (3.6V on 5V test point), USB power supply is quite good. It was probably amplified by temperature around -7 degrees Celsius, so the heater was drawing significant current, dropping voltage further.

Before that change, I had around 24 crashes in less than 10 hours. Will be monitoring overnight and for a few more days.
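
A rough way to get a crash count like this out of a captured serial log is to count boot banners. This is only a sketch: it assumes every (re)start prints a line containing `NAMF ver:`, as in the logs earlier in this thread, and the function name and sample text are made up for illustration.

```python
def count_boots(log_text: str) -> int:
    """Count firmware boot banners in a captured serial log.

    Assumption: each (re)start prints exactly one line containing
    'NAMF ver:', as seen in the logs pasted in this thread.
    """
    return sum(1 for line in log_text.splitlines() if "NAMF ver:" in line)


# Hypothetical excerpt with two restarts in it.
sample = """\
2023-12-04 13:18:40: ## Sending to madavi.de:
NAMF ver: NAMF-2020-46rc2/EN
1970-01-01 00:00:00: SDS011: start
2023-12-04 13:33:56: ## Sending to madavi.de:
NAMF ver: NAMF-2020-46rc2/EN
1970-01-01 00:00:00: SDS011: start
"""

print(count_boots(sample))  # 2
```

If the capture started before the device was powered on, the first banner is a normal boot, so the number of crashes would be one less than the count.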

Maybe we could add monitoring of 5V rail with analogue input A0 and some simple discrete voltage divider or diode? AFAIK ESP8266 should be able to take 3.3V on the built-in ADC, so diagnostic hack should be doable.
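
The arithmetic behind that idea can be sketched as follows (in Python rather than firmware code). The resistor values and function names are hypothetical, and the 3.3 V full-scale figure is simply taken from the comment above; the bare ESP8266 ADC pin is specified for a lower input range and many dev boards add their own divider in front of A0, so this needs checking against the actual board.

```python
ADC_MAX_COUNT = 1023     # 10-bit ADC reading at full scale
ADC_FULL_SCALE_V = 3.3   # assumption from the comment above; verify for your board


def divider_out(v_in: float, r_top: float, r_bottom: float) -> float:
    """Voltage at the midpoint of a two-resistor divider fed with v_in."""
    return v_in * r_bottom / (r_top + r_bottom)


def rail_from_adc(count: int, r_top: float, r_bottom: float) -> float:
    """Convert a raw ADC count back to the voltage on the monitored rail."""
    v_pin = count / ADC_MAX_COUNT * ADC_FULL_SCALE_V
    return v_pin * (r_top + r_bottom) / r_bottom


# Hypothetical divider: two equal resistors halve the rail voltage,
# so a healthy 5 V rail shows up as 2.5 V on the ADC pin.
R_TOP = 100_000
R_BOTTOM = 100_000

print(divider_out(5.0, R_TOP, R_BOTTOM))              # 2.5
print(round(rail_from_adc(558, R_TOP, R_BOTTOM), 2))  # 3.6
```

With these assumed values a sagging rail of 3.6 V would read as a count of about 558, which is far enough from a healthy reading to log as a warning.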

netmaniac commented 7 months ago

It seems like alpha firmware solves freezing of Wi-Fi after crash.

Crashes are 99% hardware related - I discovered a terrible voltage drop on my USB cable (3.6V on 5V test point), USB power supply is quite good. It was probably amplified by temperature around -7 degrees Celsius, so the heater was drawing significant current, dropping voltage further.

Before that change, I had around 24 crashes in less than 10 hours. Will be monitoring overnight and for a few more days.

Maybe we could add monitoring of 5V rail with analogue input A0 and some simple discrete voltage divider or diode? AFAIK ESP8266 should be able to take 3.3V on the built-in ADC, so diagnostic hack should be doable.

Yes, the power supply is crucial. Regarding the voltage drop - if you power the NAM from a USB power supply through a USB cable, then the cable can be the problem. The heater is a PTC; powered from 5V it takes ~700 mA when cold. The current then becomes much lower (100-200 mA AFAIK) once the heater has warmed up. Many cheap USB cables (micro USB especially) have power wires that are too thin, so there is a huge voltage drop on the cable.
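
To get a feel for the numbers, here is a rough estimate of the drop over the two power wires of a cable at the ~700 mA cold-heater current. It is a sketch under stated assumptions: plain copper wire at room temperature, the standard AWG diameter formula, and example gauges and a 1 m length chosen for illustration. Connector contact resistance is ignored, so real cables will be worse.

```python
import math

COPPER_RESISTIVITY = 1.68e-8  # ohm*m, approximate room-temperature value


def awg_diameter_m(awg: int) -> float:
    """Wire diameter in metres from the standard AWG formula."""
    return 0.127e-3 * 92 ** ((36 - awg) / 39)


def cable_drop_v(current_a: float, length_m: float, awg: int) -> float:
    """Voltage lost over both power wires (out and back) of a cable."""
    area_m2 = math.pi * (awg_diameter_m(awg) / 2) ** 2
    resistance_ohm = COPPER_RESISTIVITY * (2 * length_m) / area_m2
    return current_a * resistance_ohm


# Compare a thin, a medium and a thick power pair at the cold-heater current.
for awg in (28, 24, 20):
    print(f"AWG {awg}: {cable_drop_v(0.7, 1.0, awg):.2f} V over a 1 m cable")
```

Higher AWG numbers mean thinner wire, so the thin pair loses several times more voltage than the thick one at the same current.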

Regarding SDS011 checksum failures - they are unavoidable. With software serial on the ESP8266 we will always get some timing errors, so a failure rate of up to 10% is totally OK.

In other words - checksum errors in log are pretty common and nothing to worry about.
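
For context, this is roughly what such a check looks like. It is not NAMF's code, just a minimal Python sketch assuming the commonly documented 10-byte SDS011 measurement frame (header `0xAA 0xC0`, six data bytes, checksum, tail `0xAB`); the function names and the sample frame are made up for illustration.

```python
def sds011_checksum_ok(frame: bytes) -> bool:
    """Validate one 10-byte SDS011 measurement frame.

    Assumed layout: AA C0 | PM2.5 lo, hi | PM10 lo, hi | ID1 ID2 | checksum | AB,
    where checksum is the low byte of the sum of the six data bytes.
    """
    if len(frame) != 10:
        return False
    if frame[0] != 0xAA or frame[1] != 0xC0 or frame[9] != 0xAB:
        return False
    return (sum(frame[2:8]) & 0xFF) == frame[8]


def parse_pm(frame: bytes) -> tuple[float, float]:
    """Return (PM2.5, PM10) in ug/m3 from an already validated frame."""
    pm25 = (frame[2] | (frame[3] << 8)) / 10
    pm10 = (frame[4] | (frame[5] << 8)) / 10
    return pm25, pm10


# Hypothetical frame carrying PM2.5 = 3.5 and PM10 = 6.4.
good = bytes([0xAA, 0xC0, 0x23, 0x00, 0x40, 0x00, 0x12, 0x34, 0xA9, 0xAB])
# Same frame with one bit flipped, as a serial timing glitch might cause.
bad = bytes([0xAA, 0xC0, 0x23, 0x00, 0x41, 0x00, 0x12, 0x34, 0xA9, 0xAB])

print(sds011_checksum_ok(good), parse_pm(good))  # True (3.5, 6.4)
print(sds011_checksum_ok(bad))                   # False
```

A single mis-sampled bit anywhere in the data bytes is enough to fail the check, and the frame is simply dropped, which is why occasional failures are harmless.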

SDS011 and its lifetime - on standard settings (145 s measurement interval, 20 s warmup + measurement time) it should last 6-7 years.
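
The 6-7 year figure can be reproduced with a quick duty-cycle estimate. The 8000-hour rated lifetime below is an assumption (a datasheet-style figure, not something stated in this thread), and the sketch also assumes the sensor is fully idle between measurements.

```python
RATED_LIFE_H = 8000   # assumption: rated operating hours, not from this thread
CYCLE_S = 145         # measurement interval from the settings above
ACTIVE_S = 20         # warmup + measurement time per cycle


def lifetime_years(rated_hours: float, active_s: float, cycle_s: float) -> float:
    """Calendar lifetime if the sensor only runs active_s out of every cycle_s."""
    duty_cycle = active_s / cycle_s
    return rated_hours / duty_cycle / (24 * 365)


print(round(lifetime_years(RATED_LIFE_H, ACTIVE_S, CYCLE_S), 1))  # 6.6
```

Shortening the measurement time or lengthening the interval stretches the calendar lifetime proportionally.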

danielskowronski commented 5 months ago

TBH, I couldn't find such a cheap USB power supply in my collection - the worst one was rated 5V/1A and delivered it. I've personally handled just one so terrible that it didn't conform to USB standards, but it was more of a fire hazard ;)

After some checks, I found that the cable I originally used delivered 3.7V (1.4V drop), a better one 4.2V, and the best ones I could find (1.2m GreenCell KABGC20, 2m Baseus CAMKLF-CG1) 4.3V-4.4V (0.7V-0.8V drop). However, it turned out that the reason for my Wi-Fi instability was buggy ASUS-WRT Merlin firmware...

Out of curiosity, I did some checks to find the cut-off voltage, using a regulated power supply set to constant voltage, with the voltage measured at the screw terminals. It's relatively warm now and I couldn't find space in my freezer, so I measured at an ambient temperature of 22 C with the PTC heating to 35-40 C. The current was 0.3A the whole time.

I hope to publish more details soon, but it seems you'd need a very shitty charger (most modern ones support 2.4A) or a broken, shitty cable (with a voltage drop between 1.2V at the lowest end of the semi-acceptable USB voltage range and 1.8V at the higher end). I can see a scenario where the charger struggles at 0.5A and the cable is of very poor quality, but that should yield much more spectacular results: being unable to connect at all, absurd readings or much higher checksum error rates. Plus, it's 2024; I don't believe you wouldn't have thrown away such a charger.

As for the Wi-Fi connection issues, I have no clue what the reason could be, as I tried multiple options on both ends, and the problem only went away after rolling back to the stock Asus firmware while keeping all the settings the same. Now the device connects to the network after a few seconds and gets a reasonable ping (3-20ms, sometimes much longer, but usually when the device is exporting data). Before the change it took forever to connect, most of the time falling back to AP mode, and when it did connect it got a DHCP lease but wasn't reachable at all. Nothing physical changed - same antennas, same cables, same window and position. Wi-Fi seems to work at such a low voltage that you should be able to diagnose power issues that way.