opendata-stuttgart / sensors-software

sourcecode for reading sensor data

NRZ-2017-100-B7. Unstable #164

Closed dokape closed 6 years ago

dokape commented 6 years ago

NRZ-2017-100-B7 seems to be unstable.

Tested over 2 hours: no more than 10 measurement cycles run before the device reboots. The sensor is not touched during that time and is powered by a 2 A power supply.

DHT22 values are lost in some cycles. SDS011 values are lost in some cycles.

Sometimes after a reboot the WLAN connection seems to be lost: the device does not answer pings and HTTP is not reachable.

Several sensors are connected: SDS011, DHT22, HTU21, BME280, OLED, LCD, GPS NEO 6M.

ricki-z commented 6 years ago

Where do you send the data to? And can you save a crash dump? You should get the crash dump if you record the debug output on the serial console.

dokape commented 6 years ago

Data is sent to a Raspberry Pi on the internal network, to simple-data.PHP.

I will try to get a crash-dump.

Well. At the moment it is running through: 40 cycles without a reboot, but it keeps losing the connection to the DHT22. Maybe an unstable cable? I will check. Or does the USB port of my Surface not supply enough power?

It's not a clear and reproducible error.

ricki-z commented 6 years ago

The DHT has a power spec of 3.2 to 5 V. So in some cases it may be better to connect the DHT22 to the 5V pin. The voltage converter may not provide the needed 3.2 V.

ricki-z commented 6 years ago

Another source of crashes is the GPS. We need to use 2 SoftSerial ports. This is very CPU intensive (interrupts). And I couldn't find any information about the size and the number of GPS messages per second. Maybe the defined buffer is too small to handle the GPS messages properly.

Adorfer commented 6 years ago

In the default configuration, a GNSS receiver outputs one NMEA datagram per second. This amounts to up to 500 characters per second, which already fills a 9600 bps link to about 50%. See https://en.wikipedia.org/wiki/NMEA_0183

Some receivers support position update rates of 4 Hz or even 10 Hz. This usually requires higher bit rates.
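The 50% figure can be checked with a short calculation. This is only a sketch: it assumes 8N1 framing (10 bits on the wire per character) and that a receiver running at a higher update rate sends the same set of sentences on every update; neither assumption is stated in the thread.

```python
def link_utilization(chars_per_second: float, baud: int, bits_per_char: int = 10) -> float:
    """Fraction of a serial link used by a given character rate.

    bits_per_char defaults to 10, assuming 8N1 framing
    (1 start bit + 8 data bits + 1 stop bit).
    """
    return chars_per_second * bits_per_char / baud


# Figures from the comment: up to 500 characters per update on a 9600 bps link.
CHARS_PER_UPDATE = 500
BAUD = 9600

for update_hz in (1, 4, 10):
    load = link_utilization(CHARS_PER_UPDATE * update_hz, BAUD)
    verdict = "fits" if load <= 1.0 else "exceeds the link"
    print(f"{update_hz:>2} Hz: {load:.0%} of {BAUD} bps ({verdict})")
```

Under these assumptions a 1 Hz receiver uses roughly half of the link, while 4 Hz or 10 Hz output would not fit into 9600 bps at all unless the sentence set is reduced or the bit rate is raised.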

If we consider this kind of I/O polling load (and the fact that the CPU has to deal with all the WiFi stuff in a parallel task) as critical, we should perhaps consider forking to the ESP32 at this point for such "bigger setups". (The linked one costs just under a euro more per device, and since sensors like GPS do not come for free either, it could be the better solution for keeping the development effort less complicated.)

dokape commented 6 years ago

New test system with soldered connections and plugged-in modules (which makes it easier to remove and reconnect a module).

It seems to be more stable. So the instability and the reboots could have been caused by my old test system with plugged cables.

The DHT gets 3.27 V. I will run a test with 5 V later. Using another DHT makes no difference: sometimes there is data, sometimes not. With older beta versions this was not a problem. I will check against older versions later.

It's possible that the reboots were caused by my older test setup, so I would prefer another tester to check whether the problem exists there too.

dokape commented 6 years ago

It crashed. I was checking data.json in the browser to see whether the PM data show up on the next run.

Interesting: HTTP is not reachable and the device cannot be pinged, but the displays still show BME data, IP address and GPS data.

Dump:

End reading GPS
Start reading GPS
End reading GPS
Start reading SDS011
End reading SDS011
Start reading GPS
End reading GPS
output data json...
last data: {"software_version": "NRZ-2017-100-B7", "sensordatavalues":[{"value_type":"SDS_P1","value":"3.20"},{"value_type":"SDS_P2","value":"1.20"},{"value_type":"temperature","value":"23.30"},{"value_type":"humidity","value":"39.00"},{"value_type":"HTU21D_temperature","value":"23.11"},{"value_type":"HTU21D_humidity","value":"42.23"},{"value_type":"BME280_temperature","value":"23.98"},{"value_type":"BME280_humidity","value":"39.33"},{"value_type":"BME280_pressure","value":"95300.42"},{"value_type":"GPS_lat","value":"47.737197"},{"value_type":"GPS_lon","value":"8.974852"},{"value_type":"GPS_height","value":"47.74"},{"value_type":"GPS_date","value":"11/12/2017"},{"value_type":"GPS_time","value":"12:24:18.00"},{"value_type":"samples","value":"211383"},{"value_type":"min_micro","value":"271"},{"value_type":"max_micro","value":"1724131"},{"value_type":"signal","value":"-41"}]}
replace with: , "age":"53", "sensordatavalues"
replaced: {"software_version": "NRZ-2017-100-B7", "age":"53", "sensordatavalues":[{"value_type":"SDS_P1","value":"3.20"},{"value_type":"SDS_P2","value":"1.20"},{"value_type":"temperature","value":"23.30"},{"value_type":"humidity","value":"39.00"},{"value_type":"HTU21D_temperature","value":"23.11"},{"value_type":"HTU21D_humidity","value":"42.23"},{"value_type":"BME280_temperature","value":"23.98"},{"value_type":"BME280_humidity","value":"39.33"},{"value_type":"BME280_pressure","value":"95300.42"},{"value_type":"GPS_lat","value":"47.737197"},{"value_type":"GPS_lon","value":"8.974852"},{"value_type":"GPS_height","value":"47.74"},{"value_type":"GPS_date","value":"11/12/2017"},{"value_type":"GPS_time","value":"12:24:18.00"},{"value_type":"samples",
Soft WDT reset

ctx: cont sp: 3fff3330 end: 3fff37c0 offset: 01b0

>>>stack>>>
3fff34e0: 3fff6511 00002d98 000005b3 3fff1784
3fff34f0: 00000010 000002e7 3fff67d3 40225194
3fff3500: 4025f9d0 00000009 3fff6c95 3fff1784
3fff3510: 3ffe8488 00000000 3fff26d4 402251c5
3fff3520: 3ffe8488 00000000 3fff3550 402221e5
3fff3530: 3ffe8488 00000000 0000cfbe 4020766c
3fff3540: 3ffe8488 00000000 0000cfbe 402077fc
3fff3550: 00000000 00000000 00000000 3fff6e94
3fff3560: 0000002f 00000020 3fff64ec 0000037f
3fff3570: 00000376 0000000a 3fff35c0 40222ab7
3fff3580: 3fff26ac 000005b3 000005b3 40219dd0
3fff3590: 00000001 00000001 3fff5ef4 40224752
3fff35a0: 00000000 000005b3 3fff5ef4 40219dc6
3fff35b0: 3fff5ef4 3fff1a48 3fff5ef4 40219e02
3fff35c0: 00000000 00000000 00000000 40222c18
3fff35d0: 3fff5ef4 3fff1a48 3fff1a08 40219e95
3fff35e0: 3fff6e7c 0000000f 0000000a 3fff27a0
3fff35f0: 3fff1a48 00000000 3fff27a0 00000001
3fff3600: 00000001 40219010 0000000f 40222a68
3fff3610: 3ffeb568 00000000 3fff3694 3fff1778
3fff3620: 00000001 3fff1a2c 3fff1a08 4021a0cf
3fff3630: 3fffdad0 3ffeb568 3fff3694 40222b3a
3fff3640: 3ffeb568 3fff6fd4 3fff3670 40222a0c
3fff3650: 3ffeb568 3fff1774 3fff3670 402159bc
3fff3660: 40103713 00080000 3fffc258 4000050c
3fff3670: 00000000 00000000 00000000 4000050c
3fff3680: 3fffc278 40103410 3fffc200 00000022
3fff3690: 3fff36a0 3fff6d44 0000000f 00000000
3fff36a0: 3fff6d2c 0000000f 00000000 3fff41d4
3fff36b0: 0000000f 00000000 3fff41bc 0000000f
3fff36c0: 00000000 3fff6e2c 0000000f 00000000
3fff36d0: 3fff42dc 0000000f 00000000 3fff6e64
3fff36e0: 0000000f 00000000 3fff63a4 0000000f
3fff36f0: 00000000 3fff6f74 0000000f 00000000
3fff3700: 3fff4974 0000000f 00000000 3fff6494
3fff3710: 0000000f 00000000 3fff6404 0000000f
3fff3720: 00000000 3fff6464 0000000f 00000000
3fff3730: 3fff632c 0000000f 00000000 3fff4204
3fff3740: 0000000f 00000000 3fff6434 0000000f
3fff3750: 00000000 3fff63d4 0000000f 00000000
3fff3760: 3fff4ae4 0000000f 00000000 3fff6efc
3fff3770: 0000000f 00000000 00000000 00000000
3fff3780: 00000000 3fff3730 00000050 feefeffe
3fff3790: 00000000 00000000 00000001 3fff2790
3fff37a0: 3fffdad0 00000000 3fff2788 40223838
3fff37b0: feefeffe feefeffe 3fff27a0 40100718
<<<stack<<<

mounting FS...
mounted file system...
reading config file...
opened config file...
parsed json...
Starting Webserver... 0.0.0.0
output debug text to display... 6
Connecting to Sense ....................
WiFi connected
IP address: 192.168.200.52

ChipId: 11729146
Read SDS...
Read DHT...
Read HTU21D...
Read BME280...
Read GPS...
Send to custom API...
Show on OLED...
Trying BME280 sensor on 76 ... found
Stopping SDS011...
output data json...
replace with: , "age":"4294822", "sensordatavalues"
replaced: {"software_version": "NRZ-2017-100-B7", "age":"4294822", "sensordatavalues":[]}
Start reading GPS
End reading GPS
Start reading GPS
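To make a dump like this easier to read, the words that look like code addresses can be pulled out of the stack lines. The following is a minimal sketch, not the project's tooling. It assumes that ESP8266 code addresses lie in the 0x40...... range while the 0x3ff..... words are data pointers; the function name `code_addresses` is made up for this example.

```python
import re

# One stack line: an 8-digit hex address, a colon, then four 8-digit hex words.
STACK_LINE = re.compile(r"^([0-9a-f]{8}):((?:\s+[0-9a-f]{8}){4})\s*$")
# Assumption: words starting with 40 point into code (ROM, IRAM or flash).
CODE_WORD = re.compile(r"40[0-9a-f]{6}")


def code_addresses(dump: str) -> list[str]:
    """Return the code-like words of a stack dump, in order, without duplicates."""
    found: list[str] = []
    for line in dump.splitlines():
        match = STACK_LINE.match(line.strip())
        if not match:
            continue  # skip markers, log text and blank lines
        for word in match.group(2).split():
            if CODE_WORD.fullmatch(word) and word not in found:
                found.append(word)
    return found


# Three lines copied from the dump above.
SAMPLE = """\
3fff34f0: 00000010 000002e7 3fff67d3 40225194
3fff3500: 4025f9d0 00000009 3fff6c95 3fff1784
3fff37b0: feefeffe feefeffe 3fff27a0 40100718
"""

print(code_addresses(SAMPLE))
```

The resulting addresses only become meaningful once they are resolved against the ELF file of exactly the firmware build that crashed, for example with an addr2line tool from the toolchain or an exception decoder.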

ricki-z commented 6 years ago

I've pushed B8. In this version I removed an INPUT_PULLUP on the DHT22 pin that was added after version B4. Maybe this solves the problem of missing DHT22 data. But with all these components there seems to be too much data in the JSON file, as we have to build it 'in memory'.

dokape commented 6 years ago

Well, it is possible to connect all those components. There was no limitation in the documentation. So a test with all components has to be done. There will always be a nerd who connects all possible sensors. ;-)

If there is a limit on the number of connected sensors for some reason, then this should be documented. Better still, the software should show a warning, or selecting such sensor combinations should be impossible, e.g. by using radio buttons.

I could force a reboot of the software by calling the JSON data in the browser several times in a row. Up to that point I had about 10 measurement cycles. I will try to reproduce this.

dokape commented 6 years ago

NRZ-2017-100-B8 with SDS011, DHT22, HTU21, BME280, OLED, LCD: running for 3 hours, about 80 measurement cycles. It seems stable, even when calling the data.json file several times.

I will now test with the NEO GPS connected again :-) Perhaps the serial data/buffer is the problem.

dokape commented 6 years ago

It's possible to crash the NRZ-2017-100-B8 :-)

Setup: SDS011, DHT22, HTU21, BME280, OLED, LCD and GPS NEO 6M.

Reload the web page /values and call /data.json at the same time.

Call /data.json just at the moment when a measurement cycle ends/starts and the sensor values are read.

It seems that the GPS and the read cycles for the soft serial are not stable. The website visibly slows down, and generating /data.json is much slower with the GPS connected than without it.

Perhaps it should be discussed whether /data.json should be generated at all when a GPS is connected.

It seems the ESP8266 has reached the limit of its performance :-)

I'm not sure whether more optimizations should be done at this time. There should be a warning / documentation for a setup like this, e.g. "IMPORTANT: When using GPS, data.json should not be called, reduce the displays and sensors. It's not recommended to connect every possible sensor due to performance / stability reasons"

ricki-z commented 6 years ago

The firmware should now run more stably. I forgot to free some buffers in functions ...

ricki-z commented 6 years ago

And you are right, the firmware should work with all possible sensors connected. Even if this won't make sense ;-)

dokape commented 6 years ago

NRZ-2017-100-B9:

SDS011 DHT22 HTU21 BME280 OLED LCD

With GPS connected, reloading /data.json and /values at the same time forces the firmware to crash and reboot.

I have to try several times (up to 10 times) to reproduce the crash.

Without GPS it seems stable.

I know this is still a very special situation. I'm not sure whether this situation will occur "in the wild" in the future. It is open how many sensors will be fitted with GPS.

I will do a test run over some hours with that version and GPS, without forcing a reload. I will just load the values page now and then to check the measurement count and see how stable it is during "normal" work.


crash-dump

End reading SDS011
Start reading GPS
End reading GPS
output values to web page...
Start reading GPS
End reading GPS
output data json...
last data: {"software_version": "NRZ-2017-100-B9", "sensordatavalues":[{"value_type":"SDS_P1","value":"10.13"},{"value_type":"SDS_P2","value":"5.05"},{"value_type":"temperature","value":"21.80"},{"value_type":"humidity","value":"34.70"},{"value_type":"HTU21D_temperature","value":"21.73"},{"value_type":"HTU21D_humidity","value":"38.48"},{"value_type":"BME280_temperature","value":"22.70"},{"value_type":"BME280_humidity","value":"35.97"},{"value_type":"BME280_pressure","value":"97669.78"},{"value_type":"GPS_lat","value":""},{"value_type":"GPS_lon","value":""},{"value_type":"GPS_height","value":""},{"value_type":"GPS_date","value":"11/14/2017"},{"value_type":"GPS_time","value":"07:04:59.00"},{"value_type":"samples","value":"312436"},{"value_type":"min_micro","value":"242"},{"value_type":"max_micro","value":"1745537"},{"value_type":"signal","value":"-48"}]}
replace with: , "age":"30", "sensordatavalues"
replaced: {"software_version": "NRZ-2017-100-B9", "age":"30", "sensordatavalues":[{"value_type":"SDS_P1","value":"10.13"},{"value_type":"SDS_P2","value":"5.05"},{"value_type":"temperature","value":"21.80"},{"value_type":"humidity","value":"34.70"},{"value_type":"HTU21D_temperature","value":"21.73"},{"value_type":"HTU21D_humidity","value":"38.48"},{"value_type":"BME280_temperature","value":"22.70"},{"value_type":"BME280_humidity","value":"35.97"},{"value_type":"BME280_pressure","value":"97669.78"},{"value_type":"GPS_lat","value":""},{"value_type":"GPS_lon","value":""},{"value_type":"GPS_height","value":""},{"value_type":"GPS_date","value":"11/14/2017"},{"value_type":"GPS_time","value":"07:04:59.00"},{"value_type":"samples","value
Soft WDT reset

ctx: cont sp: 3fff3340 end: 3fff37d0 offset: 01b0

>>>stack>>>
3fff34f0: 3fff7b81 00002838 00000507 3fff1794
3fff3500: 00000010 000002d8 3fff7e34 402251b8
3fff3510: 4025f9e8 00000009 3fff6985 3fff1794
3fff3520: 3ffe8488 00000000 3fff26e4 402251e9
3fff3530: 3ffe8488 00000000 3fff3560 40222209
3fff3540: 3ffe8488 00000000 0000754e 4020766c
3fff3550: 3ffe8488 00000000 0000754e 402077fc
3fff3560: 00000000 00000000 00000000 3fff6944
3fff3570: 0000002f 00000020 3fff7b5c 0000036f
3fff3580: 00000361 0000000a 3fff35d0 40222adb
3fff3590: 3fff26bc 00000507 00000507 40219dd4
3fff35a0: 00000001 00000001 3fff5f84 40224776
3fff35b0: 00000000 0000007e 3fff5f84 40219dca
3fff35c0: 3fff5f84 3fff1a58 3fff5f84 40219e06
3fff35d0: 00000000 00000000 00000000 40222c3c
3fff35e0: 3fff5f84 3fff1a58 3fff1a18 40219e99
3fff35f0: 3fff648c 0000000f 0000000a 4021798c
3fff3600: 3fff1a58 00000000 3fff27b0 00000001
3fff3610: 00000001 40219014 0000000f 40222a8c
3fff3620: 00000000 3fff1784 3fff1a18 3fff1788
3fff3630: 00000001 3fff1a3c 3fff1a18 4021a0d3
3fff3640: 3ffe9f70 00000000 000003e8 40222b5e
3fff3650: 00000000 3fff78c4 3fff3680 40222a30
3fff3660: 3ffeb578 3fff1784 3fff3680 402159c0
3fff3670: 7fffffff 3ffec784 3ffec784 00000001
3fff3680: 00000000 00000000 00000000 4010746a
3fff3690: ffffffff 14525940 00002200 4000050c
3fff36a0: 3fffc278 3fff692c 0000000f 00000000
3fff36b0: 3fff6404 0000000f 00000000 3fff63ec
3fff36c0: 0000000f 00000000 3fff63d4 0000000f
3fff36d0: 00000000 3fff63bc 0000000f 00000000
3fff36e0: 3fff64d4 0000000f 00000000 3fff64bc
3fff36f0: 0000000f 00000000 3fff64a4 0000000f
3fff3700: 00000000 3fff791c 0000000f 00000000
3fff3710: 3fff7904 0000000f 00000000 3fff78ec
3fff3720: 0000000f 00000000 3fff60fc 0000000f
3fff3730: 00000000 3fff60e4 0000000f 00000000
3fff3740: 3fff65bc 0000000f 00000000 3fff65a4
3fff3750: 0000000f 00000000 3fff48f4 0000000f
3fff3760: 00000000 3fff6474 0000000f 00000000
3fff3770: 3fff78ac 0000000f 00000000 3fff4984
3fff3780: 0000000f 00000000 00000000 00000000
3fff3790: 00000000 3fff3740 00000050 feefeffe
3fff37a0: 00000000 00000000 00000001 3fff27a0
3fff37b0: 3fffdad0 00000000 3fff2798 4022385c
3fff37c0: feefeffe feefeffe 3fff27b0 40100718
<<<stack<<<

mounting FS...
mounted file system...
reading config file...
opened config file...
parsed json...
Starting Webserver... 0.0.0.0
output debug text to display... 6
Connecting to Sense .........
WiFi connected
IP address: 192.168.200.52

ChipId: 11729146
Read SDS...
Read DHT...
Read HTU21D...
Read BME280...
Read GPS...
Send to custom API...
Show on OLED...
Trying BME280 sensor on 76 ... found
Stopping SDS011...
Start reading GPS

ricki-z commented 6 years ago

I've found an ublox gps module with I2C support. Maybe this could solve these issues. Price is 20 Euros at https://www.ehajo.de/bauelemente/sensoren/gps-modul-ublox-pam-7q-0.html. I2C seems to run more stable.

dokape commented 6 years ago

As I understand it, the problem with the NEO and the serial interface is that the module sends data all the time. Since there is an RX/TX connection, isn't it possible to stop the module from sending data until the next measurement, then request one set of data and stop the transmission again? Then there should be no problem with timing and buffering ... Just my 2 cents...

pathmapper commented 6 years ago

@ricki-z @dokape thanks for this information!

Just ordered a NEO 6M but don't mind to order the ublox as well.

Some questions:

dokape commented 6 years ago

@pathmapper My intensive tests have these results: with SDS011, BME280 and NEO 6M it runs stably as long as you don't do the following:

The data from the GPS has to be read continuously, all the time. If this is not done, the buffer fills up, the memory runs full and the firmware crashes. If you generate data.json by calling the page, the data is generated on the fly, using memory and CPU time. Together, this seems to cause a memory problem with a crash/reboot. Not every time, but I'm able to reproduce it just by using the web GUI. If you know what you are doing and no automatic process is connecting to data.json, then it should work... But I'm not sure at the moment whether the GPS position is transferred to your own API. So private logging of the data could only work by calling the JSON data, which would crash the firmware...
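How quickly an unread serial stream becomes a problem can be estimated with simple arithmetic. This is a simplified model, not a description of the firmware: it assumes a fixed-size receive buffer that silently drops bytes once it is full. The 64-byte size is a hypothetical value, and 500 bytes per second is the upper figure quoted earlier in the thread for default NMEA output.

```python
def max_blocking_seconds(buffer_bytes: int, bytes_per_second: float) -> float:
    """Longest time the reader may stay away before the buffer is full."""
    return buffer_bytes / bytes_per_second


def bytes_dropped(buffer_bytes: int, bytes_per_second: float, blocked_seconds: float) -> float:
    """Bytes that no longer fit if nothing is read for blocked_seconds."""
    arrived = bytes_per_second * blocked_seconds
    return max(0.0, arrived - buffer_bytes)


BUFFER_BYTES = 64            # hypothetical receive buffer size
GPS_BYTES_PER_SECOND = 500   # upper figure quoted earlier in the thread

limit = max_blocking_seconds(BUFFER_BYTES, GPS_BYTES_PER_SECOND)
print(f"buffer is full after {limit * 1000:.0f} ms without a read")

for blocked in (0.1, 0.5, 2.0):
    lost = bytes_dropped(BUFFER_BYTES, GPS_BYTES_PER_SECOND, blocked)
    print(f"blocked for {blocked:.1f} s: about {lost:.0f} bytes dropped")
```

With these numbers, anything that keeps the main loop busy for more than roughly an eighth of a second (building a large JSON response, for instance) would already lose GPS data in this model.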

pathmapper commented 6 years ago

Thanks @dokape, my use case is mobile measurements with SDS011 + BME280, so I would like to have the GPS position at the time of measurement in the data.

If this is not working with an own api because of calling the json-data: Is the GPS data also transferred to madavi API? Then it would be possible to get the data from there afterwards.

Adorfer commented 6 years ago

"I2C on Neo6M unstable" has been fixed in 2016, it was some kind of "delay-loop" stuff on the nodemcu library (as far as i remember) https://github.com/nodemcu/nodemcu-firmware/issues/1586

ricki-z commented 6 years ago

"I2C unstable" refers to the NEO8M. The NEO6M doesn't have an I2C interface. This is why we use the SoftSerial connection. And that issue is about the vanilla Lua firmware, not our Arduino implementation. ublox is the manufacturer of all the mentioned modules: NEO6M, NEO-M8N, PAM-7Q (and others).

ricki-z commented 6 years ago

I was able to reduce the serial communication. It's possible to deactivate NMEA messages. Now there are only RMC and GGA messages. This should be about 150 bytes per second. Transmissions to the databases may take some seconds, so I may need to resize the buffer.
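The comment does not say how the messages were deactivated. One common route on u-blox receivers is the binary UBX protocol: a CFG-MSG frame sets the output rate of a single NMEA sentence, and a rate of 0 turns it off. The sketch below builds such frames on a PC; it is an illustration under that assumption, not the firmware's code, and the helper names are made up.

```python
import struct

UBX_SYNC = b"\xb5\x62"   # the two sync characters that start every UBX frame
CFG_MSG = (0x06, 0x01)   # UBX class and id of CFG-MSG
NMEA_CLASS = 0xF0        # message class of the standard NMEA sentences
NMEA_IDS = {"GGA": 0x00, "GLL": 0x01, "GSA": 0x02,
            "GSV": 0x03, "RMC": 0x04, "VTG": 0x05}


def ubx_checksum(body: bytes) -> bytes:
    """8-bit Fletcher checksum over class, id, length and payload."""
    ck_a = ck_b = 0
    for byte in body:
        ck_a = (ck_a + byte) & 0xFF
        ck_b = (ck_b + ck_a) & 0xFF
    return bytes((ck_a, ck_b))


def cfg_msg_rate(sentence: str, rate: int) -> bytes:
    """Build a CFG-MSG frame that sets the output rate of one NMEA sentence.

    The 3-byte payload form applies to the port the frame is received on.
    """
    payload = bytes((NMEA_CLASS, NMEA_IDS[sentence], rate))
    body = bytes(CFG_MSG) + struct.pack("<H", len(payload)) + payload
    return UBX_SYNC + body + ubx_checksum(body)


KEEP = {"RMC", "GGA"}    # the two sentences the comment says remain active

for name in NMEA_IDS:
    if name not in KEEP:
        print(name, cfg_msg_rate(name, 0).hex())
```

Each printed frame would be written once to the GPS serial port. Unless the configuration is saved on the module, it would presumably have to be sent again after every power cycle.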

dokape commented 6 years ago

NRZ-2017-100-B10

With GPS: when the JSON data and the values page are loaded at the same time, several times in a row, the firmware still crashes. I can replicate this. Without GPS it doesn't happen.

As this is a special situation, I would suggest documenting it as a known issue and not trying to solve it within this version.

ricki-z commented 6 years ago

@dokape The GPS support is now marked as experimental in the README. Can we close this issue?