letscontrolit / ESPEasy

Easy MultiSensor device based on ESP8266/ESP32
http://www.espeasy.com

Problems with the _P053_PMSx003 Dust Plugin #914

Closed micropet closed 10 months ago

micropet commented 6 years ago

Hello everybody, I have some devices with the dust sensor PMS7003 and some with the SDS021.

The devices with the SDS021 run smoothly for weeks. Those with the PMS7003 only make a few measurements and then hang. The dust levels are no longer updated.

After a reset or disconnection of the voltage, the values are updated only a couple of times, then nothing comes up.

The rest of the measurements, such as BME280 BH1750, continue.

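[image: device_4] https://user-images.githubusercontent.com/2838584/36470671-fe9aa8e0-16eb-11e8-92fe-9feddcc3d5f4.png
[image: device_page_1] https://user-images.githubusercontent.com/2838584/36470672-feb4900c-16eb-11e8-95e5-79026b48f820.png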

TD-er commented 6 years ago

I don't know the PMSx003 sensor. Do you need to write to it to get the data (GPIO => RX)? If so, then you'll probably want to try other pins for this device. Does the other setup, with the SDS0x1 sensor, also have another serial device like the CO2 sensor? Perhaps using two serial devices can cause issues?

The SDS011 I use has an uptime of 52 days and is running fine (in freezing temperatures) so that one is indeed stable :)


micropet commented 6 years ago

The device with the SDS also has a MH-Z19 CO2 sensor.

I think the PMS sends the data by itself.

I would rather not change the connections.

[Images: device_4, device_page_1]

TD-er commented 6 years ago

Have you also tried another power supply and other cable to power the node? These sensors may peak in current use and it may be just enough to let the voltage drop too much when both the CO2 and the dust sensor peak at the same time. A lot of USB cables often have quite some resistance, which will result in voltage drop at higher currents.
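The voltage-drop argument above is plain Ohm's law. As a rough sketch, with a cable resistance and current figures that are illustrative assumptions rather than measurements from this setup:

```python
# Ohm's law estimate of the voltage lost in a USB cable.
# All numbers below are illustrative assumptions, not measurements.

def voltage_at_node(supply_v: float, cable_ohms: float, current_a: float) -> float:
    """Voltage left at the board after the drop across the cable (V = I * R)."""
    return supply_v - current_a * cable_ohms

# Hypothetical cable with 0.5 ohm round-trip resistance.
idle = voltage_at_node(5.0, 0.5, 0.15)  # board idling at about 150 mA
peak = voltage_at_node(5.0, 0.5, 0.60)  # sensors and Wi-Fi peaking together

print(f"idle: {idle:.3f} V, peak: {peak:.3f} V")
```

With these assumed numbers the node loses a few hundred millivolts at peak, which is the kind of dip a different cable or supply would rule out.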

micropet commented 6 years ago

Yes, I know about the cables. There are a lot of high-resistance cables on the market. I changed the power supply and the cable this morning.

I have put a 2800 μF capacitor in parallel with the 5 V supply. This has been enough for all devices so far.

The PMS then sent its values twice and froze again.

But it must be due to the software; with Tasmota it runs perfectly.

TD-er commented 6 years ago

I've looked into the source and it looks very likely to give exactly the issues you mention. I will fix it, although I have not (yet) such a sensor to test myself.

The problem with this implementation is that the reader (ESP board) can get out of sync with the writer (sensor) and cannot get back into sync (not within a few hours, perhaps never). Apart from that, the implementation is overly complicated in order to support hardware serial, which is a great idea, but in the wrong place.

micropet commented 6 years ago

The PMS is already connected via software serial.

What does your answer mean? Can't we do anything? Then you should remove the plugin.

I will buy an SDS021, which works in my other devices. And for the PMS I will use Tasmota.

TD-er commented 6 years ago

What I meant is, I will try to fix it in ESPeasy. All plugins in the releases should work and if not, we should make them work. Simple as that. So thanks for your report, I will have a look at it.

micropet commented 6 years ago

That is a word!

uzi18 commented 6 years ago

@TD-er I have a PMS sensor, so maybe I can help here. Where do you think the problem is?

TD-er commented 6 years ago

@uzi18 You will get a version to test later this week.

Do you also experience similar issues?

uzi18 commented 6 years ago

My sensor was never used before, but now is the time ;)

uzi18 commented 6 years ago

It is PMS3003

TD-er commented 6 years ago

See Pull Request #976. @micropet and/or @uzi18, can you test it to see if it is now more stable? I don't have such a sensor yet, so I cannot test it myself.

micropet commented 6 years ago

Good morning Gijs, you are awake early!

The plugin has already failed after two measurements. Now it has been running for 15 minutes, with some errors in the log file:

504669: PMSx003: invalid framelength - 142
506220: MHZ19: Raw PPM: 905 Filtered PPM value: 1043 Temp / S / U values: 23/0 / 0.00
506224: EVENT: MH-Z19 # PPM = 1093.00
506366: ACT: NeoPixelAll, 50,70,0,0
506492: EVENT: MH-Z19 # MHZ19Temp = 22.20
506721: EVENT: MH-Z19 # U = 0.00
515541: WD: Uptime 9 ConnectFailures 0 FreeMem 13104
528697: EVENT: PMS7003 # pm1.0 = 5.00
528971: EVENT: PMS7003 # pm2.5 = 8.00
529217: EVENT: PMS7003 # pm10 = 14.00
....
564627: PMSx003: invalid framelength - 32772
567203: MHZ19: Raw PPM: 917 Filtered PPM value: 1053 Temp / S / U values: 23/0 / 0.00
567207: EVENT: MH-Z19 # PPM = 1103.00
567348: ACT: NeoPixelAll, 50,70,0,0
567444: EVENT: MH-Z19 # MHZ19Temp = 22.20
567672: EVENT: MH-Z19 # U = 0.00
575544: WD: Uptime 10 ConnectFailures 0 FreeMem 13088
589198: EVENT: PMS7003 # pm1.0 = 3.00
589444: EVENT: PMS7003 # pm2.5 = 7.00
589717: EVENT: PMS7003 # pm10 = 8.00
597648: MHZ19: Raw PPM: 920 Filtered PPM value: 1058 Temp / S / U values: 23/0 / 0.00
597652: EVENT: MH-Z19 # PPM = 1108.00
......
742633: EVENT: PMS7003 # pm1.0 = 3.00
742881: EVENT: PMS7003 # pm2.5 = 8.00
743125: EVENT: PMS7003 # pm10 = 10.00
743662: EVENT: Clock # Time = Wed, 06: 04
748652: PMSx003: invalid framelength - 142

edit: Looks good. Runs for 30 minutes.

TD-er commented 6 years ago

Those errors are not that bad, as long as it can keep sync. Or at least regain sync. And I was nerding late, it is now 8am here.
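For reference, the "invalid framelength" values in the log can be read as 16-bit fields. The sketch below assumes the usual PMSx003 layout (two start bytes `0x42 0x4D`, then a big-endian length, which would be 28 for a PMS7003); the helper name `frame_length` is made up for this illustration and is not taken from the plugin.

```python
EXPECTED_LENGTH = 28  # assumed PMS7003 value: 13 data words plus the checksum

def frame_length(buf: bytes, offset: int) -> int:
    """Read the 16-bit big-endian length field that follows the two start bytes."""
    return int.from_bytes(buf[offset + 2:offset + 4], "big")

aligned = bytes([0x42, 0x4D, 0x00, 0x1C])
print(frame_length(aligned, 0) == EXPECTED_LENGTH)

# The two rejected values from the log, shown as the byte pairs that encode them.
for value in (142, 32772):
    print(value, value.to_bytes(2, "big").hex())
```

Neither byte pair matches the expected `00 1c`, so in both cases the reader was either looking at the wrong position in the stream or received damaged bytes. Rejecting such frames and waiting for the next one is the behaviour described in the comment above.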

micropet commented 6 years ago

It has just failed; no values are displayed anymore. It ran for over three hours.

There is no entry from the dust sensor in the log file.

uzi18 commented 6 years ago

It is almost perfect now with @TD-er's fix. My suggestion is to not do anything unnecessary, since that only adds overhead.

micropet commented 6 years ago

It only runs for a few hours. That's not enough.

uzi18 commented 6 years ago

The code is better now, so it must be OK. I think you sometimes have connection problems; now it will resync automatically.

micropet commented 6 years ago

I have no connection problems. The device ran for weeks under Tasmota without failing once.

TD-er commented 6 years ago

@micropet Can you show the last log output from the sensor? Also check whether that specific ESP module is showing an accurate uptime (disable NTP for this test).

I will add the extra check on the dummy parameter. Maybe some valid value in the stream has the same byte value as the first byte of the start sequence, and then you'll never get a valid output anymore. So the plan is to test the 'dummy' value as suggested and stop reading if it does not match the start sequence.
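The failure mode described above, a data byte that happens to equal the first start byte, can be shown in a few lines. This is only a sketch: the header bytes `0x42 0x4D` are the commonly documented PMSx003 start sequence (treated here as an assumption), and the sample stream and function names are invented for the example.

```python
START = b"\x42\x4d"  # assumed PMSx003 frame header

def find_start_first_byte_only(buf: bytes) -> int:
    """Naive search: stop at the first 0x42, wherever it is."""
    return buf.find(START[:1])

def find_start_both_bytes(buf: bytes) -> int:
    """Stricter search: require 0x42 immediately followed by 0x4D."""
    return buf.find(START)

# Hypothetical stream: the tail of one frame containing a data byte 0x42,
# followed by the header and length field of the next frame.
stream = bytes([0x00, 0x42, 0x00, 0x97]) + START + bytes([0x00, 0x1C])

naive = find_start_first_byte_only(stream)
strict = find_start_both_bytes(stream)
print(naive, strict)
```

The naive search locks onto the data byte at index 1, while the two-byte check skips it and finds the real header at index 4. Checking the second byte does not rule out every false match, which is why a length or checksum test is still useful afterwards.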

micropet commented 6 years ago

Is it correct that the log file is so short? Or is there a "full" version of it?

TD-er commented 6 years ago

You can try to send it to a logserver. But in one of the releases someone reported issues with that. Still have to look into that.

micropet commented 6 years ago

OK, then I'm waiting for the version with the extra check on the dummy parameter.

TD-er commented 6 years ago

Did you check with this morning's build?

micropet commented 6 years ago

Yes, I have. I download the source as soon as I discover something new. It still does not work. It runs for a few minutes and then it stops.

I installed PlatformIO yesterday and flashed with it, but it made no difference.

I think we have to wait until you get your PMS.

Peter

TD-er commented 6 years ago

That may take some time...

Departed country of origin 2018-03-02 23:30:00 [GMT+8]

micropet commented 6 years ago

I order something from China every few days and get a small letter every 2 to 3 days. It is always a surprise to open the letter. :)

Time passes faster than you think.

m-anish commented 6 years ago

Any updates on this?

micropet commented 6 years ago

No, it still does not work. The measurement runs for a few minutes, then freezes.

TD-er commented 6 years ago

Last weekend I found an issue which affects "realtime" performance needed for processing this data stream. I hope issues like these will also be fixed when that issue is resolved. I am a little short on time today, but my hope is that I have time to fix that issue Wednesday or Thursday evening and then we can also test this one.

micropet commented 6 years ago

Yes, great Gijs,

I still have some devices with the PMS, they are currently sleeping.

m-anish commented 6 years ago

Any updates on this?

TD-er commented 6 years ago

@m-anish Please read 2 comments above.

X3sar commented 6 years ago

Hi, I'm using a PMS sensor and I'm having the same problem. I don't know if the sensor is bad or something in my code is wrong. Sometimes I get readings, but then it suddenly stops sending data.

I'm working with the library node-plantower, but I'm having the same problem.

Help, please!

micropet commented 6 years ago

The sensor is definitely fine. I have had the same problem for months.

TD-er wanted to take care of it.

uzi18 commented 6 years ago

@TD-er Please provide more info about the issue you found, thx.

TD-er commented 6 years ago

@uzi18 Last week I had very little time to work on this, so there is more delay than I hoped for. What I found was this: some parts of the code were using way too much time for themselves. This resulted in the 50/sec and 10/sec tasks getting far fewer runs than expected. Sometimes routines even needed > 1 second to finish, which results in 0 calls to those time-critical routines. This can lead to simply missing bursts of data, which is typical for this sensor.

The second issue is that the software serial appears to have some trouble handling interrupts in due time (see the discussion on the forum regarding the Nextion display). This last part should not really be an issue: just wait for the next burst of data and you're back in sync again. But if the 50/sec and 10/sec calls are not processed, it is very difficult to get in sync again. Also, the burst of this sensor is close to the used buffer size, which makes it nearly impossible to find the sync again.

I have already fixed a number of routines that were taking a lot of time. You can also run the latest build to see the time-statistics logging (every 30 seconds, only visible in the serial log), which shows what function takes how much time. There should be no function using > 20 msec per run. An occasional spike is not an issue, but a high average is bad. Currently there is no logging yet on the controllers, but my guess is that some of them take way more time than desired. So I will add those as well.

Yet to do: Have a second look at the routine to regain a sync on the data, but now taking into account there may be some bitflips due to the mentioned interrupt issues on software serial.
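The timing argument above can be put into rough numbers. The sketch below is back-of-the-envelope arithmetic only; the baud rate, the frame size, the frame interval and especially the receive buffer size are assumptions made for illustration and are not taken from the plugin or from the software-serial library.

```python
BAUD = 9600          # assumed PMSx003 serial speed
BITS_PER_BYTE = 10   # 8N1 framing: start bit + 8 data bits + stop bit
FRAME_BYTES = 32     # assumed PMS7003 frame size
BUFFER_BYTES = 64    # hypothetical software-serial receive buffer size

def frame_time_ms(frame_bytes: int = FRAME_BYTES, baud: int = BAUD) -> float:
    """Time on the wire for one frame, in milliseconds."""
    return frame_bytes * BITS_PER_BYTE * 1000 / baud

def bytes_pending(stall_ms: int, frame_interval_ms: int,
                  frame_bytes: int = FRAME_BYTES) -> int:
    """Bytes the sensor sends while the reader is not serviced for stall_ms."""
    return (stall_ms // frame_interval_ms) * frame_bytes

print(f"one frame takes {frame_time_ms():.1f} ms on the wire")

# Assumed worst case: a 1 second stall while the sensor sends a frame every 200 ms.
pending = bytes_pending(stall_ms=1000, frame_interval_ms=200)
print(pending, "bytes pending, buffer holds", BUFFER_BYTES)
```

Under these assumptions a single frame already spans more than one 20 ms tick of the 50/sec task, and a one-second stall leaves several frames queued behind a buffer that can hold only two, so the reader resumes somewhere in the middle of a frame.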

micropet commented 6 years ago

23117235 : LoopStats: shortestLoop: 46 longestLoop: 1058298 avgLoopDuration: 88.41 systemTimerDuration: 25.19 systemTimerCalls: 31 loopCounterMax: 652173 loopCounterLast: 330765 countFindPluginId: 0

On a wemos without hardware.

TD-er commented 6 years ago

Please increase your loglevel a bit. There should be more time statistics.

micropet commented 6 years ago

OK. Here is a part with a PCA9685. Clock # Time takes 22 ms and the rules 200 ms.

70712866 : PluginStats P_22_Extra IO - PCA9685 WRITE                Count: 10 Avg/min/max 441.50/363/796 usec
70712876 : PluginStats P_22_Extra IO - PCA9685 FIFTY_PER_SECOND     Count: 1491 Avg/min/max 15.30/10/43 usec
70712885 : Plugin call 50 p/s   stats: Count: 1491 Avg/min/max 587.23/508/913 usec
70712893 : Plugin call 10 p/s   stats: Count: 300 Avg/min/max 559.54/500/879 usec
70712900 : Plugin call 10 p/s U stats: Count: 300 Avg/min/max 3565.16/3297/4033 usec
70712907 : Plugin call  1 p/s   stats: Count: 30 Avg/min/max 9431.93/1659/202361 usec
70712915 : checkSensors()       stats: Count: 30 Avg/min/max 840.90/602/2236 usec
70712922 : WD   : Uptime 1178 ConnectFailures 2 FreeMem 13424
70742923 : LoopStats: shortestLoop: 46 longestLoop: 1058298 avgLoopDuration: 89.73 systemTimerDuration: 25.16 systemTimerCalls: 31 loopCounterMax: 652173 loopCounterLast: 326870 countFindPluginId: 0
70742930 : PluginStats P_22_Extra IO - PCA9685 ONCE_A_SECOND        Count: 30 Avg/min/max 16.47/14/30 usec
70742939 : PluginStats P_22_Extra IO - PCA9685 TEN_PER_SECOND       Count: 301 Avg/min/max 11.07/3/34 usec
70742948 : PluginStats P_22_Extra IO - PCA9685 FIFTY_PER_SECOND     Count: 1500 Avg/min/max 15.90/10/46 usec
70742957 : Plugin call 50 p/s   stats: Count: 1500 Avg/min/max 592.04/504/903 usec
70742965 : Plugin call 10 p/s   stats: Count: 301 Avg/min/max 524.72/500/840 usec
70742972 : Plugin call 10 p/s U stats: Count: 301 Avg/min/max 3570.82/3275/4024 usec
70742979 : Plugin call  1 p/s   stats: Count: 30 Avg/min/max 1848.23/1778/2090 usec
70742987 : checkSensors()       stats: Count: 30 Avg/min/max 667.87/645/783 usec
70742994 : WD   : Uptime 1179 ConnectFailures 2 FreeMem 13424
70751684 : EVENT: Clock#Time=Mon,07:37
70751705 : EVENT: Clock#Time=Mon,07:37 Processing time:22 milliSeconds
70768683 : EVENT: Rules#Timer=1
70768696 : ACT  : Publish ESP-218/IP,192.168.0.218
70768708 : Command: publish
70768714 : ACT  : Publish ESP-218/MAC,68:C6:3A:A5:FB:B5
70768727 : Command: publish
70768733 : ACT  : Publish ESP-218/Time,07:37:17
70768745 : Command: publish
70768751 : ACT  : Publish ESP-218/Uptime,1179
70768763 : Command: publish
70768770 : ACT  : Publish ESP-218/RSSI,-71
70768781 : Command: publish
70768787 : ACT  : Publish ESP-218/SSID,SMC
70768799 : Command: publish
70768806 : ACT  : Publish ESP-218/BSSID,78:8A:20:D1:9B:D9
70768818 : Command: publish
70768824 : ACT  : Publish ESP-218/CH,1
70768836 : Command: publish
70768842 : ACT  : Publish ESP-218/SYSHEAP,11656
70768855 : Command: publish
70768859 : ACT  : timerSet,1,60
70768871 : Command: timerset
70768884 : EVENT: Rules#Timer=1 Processing time:200 milliSeconds
70772995 : LoopStats: shortestLoop: 46 longestLoop: 1058298 avgLoopDuration: 90.23 systemTimerDuration: 25.35 systemTimerCalls: 31 loopCounterMax: 652173 loopCounterLast: 324928 countFindPluginId: 0
70773002 : PluginStats P_22_Extra IO - PCA9685 READ                 Count: 1 Avg/min/max 15.00/15/15 usec
70773011 : PluginStats P_22_Extra IO - PCA9685 ONCE_A_SECOND        Count: 30 Avg/min/max 15.87/11/27 usec
70773020 : PluginStats P_22_Extra IO - PCA9685 TEN_PER_SECOND       Count: 300 Avg/min/max 12.05/3/38 usec
70773029 : PluginStats P_22_Extra IO - PCA9685 WRITE                Count: 10 Avg/min/max 453.00/387/796 usec
70773039 : PluginStats P_22_Extra IO - PCA9685 FIFTY_PER_SECOND     Count: 1489 Avg/min/max 15.60/10/46 usec
70773048 : Plugin call 50 p/s   stats: Count: 1489 Avg/min/max 587.28/504/922 usec
70773056 : Plugin call 10 p/s   stats: Count: 300 Avg/min/max 538.97/500/857 usec
70773063 : Plugin call 10 p/s U stats: Count: 300 Avg/min/max 3551.89/3204/3978 usec
70773070 : Plugin call  1 p/s   stats: Count: 30 Avg/min/max 9547.20/1782/203298 usec
70773078 : checkSensors()       stats: Count: 30 Avg/min/max 882.50/637/2346 usec
70773085 : WD   : Uptime 1179 ConnectFailures 2 FreeMem 13424
70803086 : LoopStats: shortestLoop: 46 longestLoop: 1058298 avgLoopDuration: 89.71 systemTimerDuration: 25.16 systemTimerCalls: 31 loopCounterMax: 652173 loopCounterLast: 326935 countFindPluginId: 0
70803093 : PluginStats P_22_Extra IO - PCA9685 ONCE_A_SECOND        Count: 30 Avg/min/max 11.53/10/23 usec
70803102 : PluginStats P_22_Extra IO - PCA9685 TEN_PER_SECOND       Count: 301 Avg/min/max 14.98/3/39 usec
70803111 : PluginStats P_22_Extra IO - PCA9685 FIFTY_PER_SECOND     Count: 1500 Avg/min/max 15.33/10/39 usec
70803120 : Plugin call 50 p/s   stats: Count: 1500 Avg/min/max 586.89/504/879 usec
70803128 : Plugin call 10 p/s   stats: Count: 301 Avg/min/max 572.03/499/867 usec
70803135 : Plugin call 10 p/s U stats: Count: 301 Avg/min/max 3564.61/3261/3997 usec
70803142 : Plugin call  1 p/s   stats: Count: 30 Avg/min/max 1728.70/1655/1936 usec
70803150 : checkSensors()       stats: Count: 30 Avg/min/max 617.63/602/717 usec
70803157 : WD   : Uptime 1180 ConnectFailures 2 FreeMem 13424
70811683 : EVENT: Clock#Time=Mon,07:38
70811705 : EVENT: Clock#Time=Mon,07:38 Processing time:22 milliSeconds
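
The statistics lines in a log like this can be checked against the 20 msec per-run budget mentioned earlier in the thread with a small script. This is a sketch: the regular expression is fitted to the lines shown here and may not match other log formats, and the names `parse_stats` and `classify` are made up for the example.

```python
import re

# Fitted to lines such as:
# "70712907 : Plugin call  1 p/s   stats: Count: 30 Avg/min/max 9431.93/1659/202361 usec"
STATS_RE = re.compile(
    r"^\d+ : (?P<name>.+?)\s+(?:stats: )?Count: (?P<count>\d+) "
    r"Avg/min/max (?P<avg>[\d.]+)/(?P<min>\d+)/(?P<max>\d+) usec$"
)
LIMIT_USEC = 20_000  # 20 msec per run

def parse_stats(lines):
    """Yield (name, average usec, maximum usec) for every statistics line."""
    for line in lines:
        m = STATS_RE.match(line.strip())
        if m:
            yield m["name"], float(m["avg"]), int(m["max"])

def classify(avg: float, peak: int) -> str:
    """A high average is bad; a high maximum alone is only an occasional spike."""
    if avg > LIMIT_USEC:
        return "average too slow"
    if peak > LIMIT_USEC:
        return "occasional spike"
    return "ok"

sample = [
    "70712876 : PluginStats P_22_Extra IO - PCA9685 FIFTY_PER_SECOND     Count: 1491 Avg/min/max 15.30/10/43 usec",
    "70712907 : Plugin call  1 p/s   stats: Count: 30 Avg/min/max 9431.93/1659/202361 usec",
    "70742979 : Plugin call  1 p/s   stats: Count: 30 Avg/min/max 1848.23/1778/2090 usec",
]
for name, avg, peak in parse_stats(sample):
    print(f"{name}: {classify(avg, peak)}")
```

On these three lines only the first "Plugin call  1 p/s" entry is flagged, and only as a spike: its maximum of about 202 msec may correspond to the 200 ms rules run reported in the same log, while its average stays under the budget.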

TD-er commented 6 years ago

Maybe I should open another issue for the timing issues. Just to get things a bit organized.

micropet commented 6 years ago

Here with PMS7003 via Syslog.

device_page_1

ESP-206 with PMS7003.zip

micropet commented 5 years ago

I had hoped that the many recent changes would have had a positive effect on the PMSx003 plugin.

But it still does not work. After booting, a value is read once or twice, and then it stops working.

TD-er commented 5 years ago

@micropet Just to give you a heads-up. I have been thinking about this issue quite a bit and this evening I started to make a "braindump" of this idea I have to fix this.

See my initial commit: https://github.com/TD-er/ESPEasy/commit/5309396bd0eb493ebe0a27fdabb7b47d857355b7

It is not even compiled yet, just a dump of my ideas, so it is still pre-pre-alpha state :)

In short, this plugin will end up at less than half its current number of lines. Also, it will only call proces() in the 50/sec call, and when a packet is available, it will start processing it. Those calls are all non-blocking, and it will never lose track of the packet start, since every attempt to find the packet will look into the buffer to see if there is a full packet. The whole packet can be inspected for this, not only a peek at the next byte.

Also, it will implement a double-buffer system. One in the software serial and one in this reader.

If this will not solve the issue, then we are having a hardware issue, which means the bits are damaged between sensor and ESP.
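The idea above (buffer in the reader, non-blocking calls, inspect the whole packet before accepting it) can be sketched as follows. This is not the code from the linked commit. It is a Python illustration that assumes the commonly documented PMS7003 frame layout: header `0x42 0x4D`, a 16-bit length of 28, 13 data words, and a 16-bit checksum equal to the sum of all preceding bytes. `FrameReader`, `frame_ok` and `make_frame` are hypothetical names.

```python
START = b"\x42\x4d"        # assumed PMSx003 frame header
FRAME_LEN = 32             # assumed PMS7003 frame size; other models differ
LENGTH_FIELD = FRAME_LEN - 4


def frame_ok(frame: bytes) -> bool:
    """Validate header, length field and 16-bit additive checksum."""
    return (
        frame[:2] == START
        and int.from_bytes(frame[2:4], "big") == LENGTH_FIELD
        and sum(frame[:-2]) & 0xFFFF == int.from_bytes(frame[-2:], "big")
    )


class FrameReader:
    """Hypothetical non-blocking reader with its own buffer."""

    def __init__(self) -> None:
        self.buf = bytearray()

    def feed(self, data: bytes) -> None:
        """Append whatever the serial layer has received so far."""
        self.buf += data

    def poll(self):
        """Return one validated frame, or None if none is complete yet."""
        while True:
            start = self.buf.find(START)
            if start < 0:
                # No header: keep only a trailing 0x42 that may start one.
                keep = self.buf[-1:] == START[:1]
                self.buf = bytearray(START[:1]) if keep else bytearray()
                return None
            del self.buf[:start]      # discard bytes before the header
            if len(self.buf) < FRAME_LEN:
                return None           # incomplete; try again on the next call
            frame = bytes(self.buf[:FRAME_LEN])
            if frame_ok(frame):
                del self.buf[:FRAME_LEN]
                return frame
            del self.buf[:1]          # false header; skip it and rescan


def make_frame(words):
    """Build a test frame from 13 data words (helper for this example only)."""
    body = START + LENGTH_FIELD.to_bytes(2, "big")
    body += b"".join(w.to_bytes(2, "big") for w in words)
    return body + (sum(body) & 0xFFFF).to_bytes(2, "big")


frame = make_frame(range(13))
reader = FrameReader()
reader.feed(START + bytes(5) + frame[:10])  # false header, then a partial frame
first = reader.poll()                       # nothing complete yet
reader.feed(frame[10:])
second = reader.poll()                      # the real frame, after resync
print(first, second == frame)
```

Because `poll()` validates a complete candidate frame and drops a single byte on failure, a false header or a damaged frame costs at most one frame of data instead of a permanent loss of sync. In firmware the same loop would be called from the periodic task rather than from a script.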

micropet commented 5 years ago

@TD-er That sounds good, and complicated. :) If you have something to try, just say so. Although I am 100 km away from home, I have RDP access to my computers.

I do not believe in a hardware problem. I tried it on several devices. Every morning, if there is a new version I test it briefly.

If I flash the original examples of the PMSx003, then it runs perfectly for several weeks.

TD-er commented 5 years ago

That's a good reason why it should be considered a software issue :)

Tonight and tomorrow I don't have much time, but I will continue on that serial reader, since I am convinced it is the way to serialize (pun intended) the jobs in a non-blocking way and leave the hard part in a single set of code and make plugins much simpler.

ShardanX commented 5 years ago

If the PMSx003 plugin is being rewritten that much, how about considering #487?

A plugin that reduces the lifetime of the PMS to less than one year does not make much sense.

According to the datasheet, the PMSx003 has a lifetime of around 8000 h, so driven permanently it will run for about 333 days. The laser diode has a limited lifetime, so after this period the values will become unreliable.
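The lifetime figure above, and the gain from duty cycling, is simple arithmetic. The sketch below assumes that only powered-on time counts against the rated hours and that any warm-up time after wake-up is included in the on-time; the 30 seconds on per 5 minutes schedule is an invented example, not a recommendation from the datasheet.

```python
RATED_HOURS = 8000  # lifetime figure quoted from the datasheet in the comment above

def lifetime_days(on_seconds: float, period_seconds: float,
                  rated_hours: float = RATED_HOURS) -> float:
    """Calendar days until the rated hours are used up at the given duty cycle."""
    duty = on_seconds / period_seconds
    return rated_hours / duty / 24

continuous = lifetime_days(1, 1)      # always on
cycled = lifetime_days(30, 300)       # hypothetical: 30 s on out of every 5 min

print(f"continuous: {continuous:.0f} days, duty-cycled: {cycled:.0f} days")
```

Running the sensor for a tenth of the time stretches the same rated hours over ten times as many calendar days, which is the motivation for putting it to sleep between readings.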

I've made a crude ruleset to pull it offline and just let it run every 30 seconds. This should be implemented within the plugin. The sensor can be set to sleep by pulling the "SET" pin to low. I suggest to use the "Delay" field and process some steps:

ShardanX commented 5 years ago

Maybe it should be mentioned: some changes seem to make the plugin more unstable. I have two sensors with a PMS7003 running on v2.0-20180209.

dustsensor

The only issue I see is that they reboot every few days. Not nice, but it doesn't affect me much, as they return to work silently. Values are read continuously every 30 seconds without stopping:

dustsensor_reading

TD-er commented 5 years ago

Can you see the last reboot reason? I added a more descriptive text in the webinterface a few weeks ago.