kizniche / Mycodo

An environmental monitoring and regulation system
http://kylegabriel.com/projects/
GNU General Public License v3.0

High CPU usage on PiZero #386

Closed zsole2 closed 6 years ago

zsole2 commented 6 years ago

Specific Mycodo Version

5.5.9

Problem Description

The CPU usage is very high, as I already mentioned in another issue. pigpiod and influxd are mainly responsible. Comparing the system with my old version (v5.0.17), I removed the -s 1 parameter from pigpiod, which brought its CPU usage back down from about 20% to 8%. However, I couldn't find any difference in the influxd parameters, and it runs above 50%, compared to 20-25% on the old version.

BTW, the author of pigpiod said that

1MHz DMA sampling is really pushing the limit on the Pi and I would only use it if absolutely necessary.

See https://raspberrypi.stackexchange.com/questions/75775/how-to-set-sample-rate-for-pigpio-deamon. It certainly is not necessary for me, though of course I don't know about other people's usage.

Additional Notes

I have some other strange behavior, but need to investigate before reporting... e.g. missing PID from the Live page, PID relay times in the negative on the graphs, I suspect something screwy here...

kizniche commented 6 years ago

I'll have to look into exactly why I added the -s 1 parameter, but I recall it was to allow a certain sensor to either work at all or return accurate measurements more often.

The negative PID output is probably because you're using it to lower a condition. All raise conditions will be stored in the database as positive and all lower will be negative, for easier viewing on graphs. For example:

[Image: screenshot-192 168 0 10-2018-01-17-16-40-00]
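The sign convention described above can be sketched in a few lines of Python. This is only an illustration, not Mycodo's actual code: the function name `signed_output_duration` and the direction strings `"raise"` and `"lower"` are assumptions made for the example.

```python
def signed_output_duration(seconds_on: float, direction: str) -> float:
    """Return the value to store for a PID output.

    Hypothetical helper: raise outputs are stored as positive durations,
    lower outputs as negative durations, so both can share one graph axis.
    """
    if seconds_on < 0:
        raise ValueError("seconds_on must not be negative")
    if direction == "raise":
        return seconds_on
    if direction == "lower":
        return -seconds_on
    raise ValueError("direction must be 'raise' or 'lower'")


# A cooling-only PID (lower direction) therefore produces only negative values.
print(signed_output_duration(12.5, "raise"))
print(signed_output_duration(12.5, "lower"))
```

A PID that only ever lowers a condition will store nothing but negative values, which is why all of its columns hang below zero on the graph.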

kizniche commented 6 years ago

I found my original comment where I discovered using the -s 1 parameter (lower sample time) increased the potential software PWM output frequency.

zsole2 commented 6 years ago

Now I remember seeing that discussion. Can this parameter be changed easily somehow? I don't mind changing it manually, but then I have to remember to do it at every upgrade. I have some graphs generated from top; I'll post them later to show how startlingly different the CPU utilization is.

Your output looks really nice with the negative columns! But I have two BUTs: first, if you only lower with the PID, all columns hang from the top, and second, if you have a steady setpoint and not a tracking one, the column heights ideally tend to be equal:

[Images: selection_004, selection_005]

Oh, these are of very bad quality and look awful, but I'm on the bus now, sorry. On the other hand, it looks much better now than yesterday when the PID had just been started!

kizniche commented 6 years ago

This is how I would change it:

Edit Line 1 of Mycodo/install/mycodo.crontab from:

@reboot /usr/local/bin/pigpiod -s 1 &

to the following, which uses the default sample rate of 5 microseconds (-s accepts 1, 2, 4, 5, 8, or 10 microseconds):

@reboot /usr/local/bin/pigpiod &

Replace the cron entry with the new:

sudo mycodo-commands update-cron

Then reboot.
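For anyone who has to repeat this after every upgrade, the edit to the crontab line can be scripted. A minimal sketch, assuming the entry looks like the line quoted above; the function name `set_pigpiod_sample_rate` and the regular expression are hypothetical and not part of Mycodo:

```python
import re

# Sample rates (in microseconds) that pigpiod accepts for its -s option.
VALID_SAMPLE_RATES_US = (1, 2, 4, 5, 8, 10)

# Matches "@reboot <path>pigpiod" plus an optional existing "-s N" option.
PIGPIOD_ENTRY = re.compile(
    r"^(@reboot[ \t]+\S*pigpiod)(?:[ \t]+-s[ \t]+\d+)?", re.MULTILINE
)


def set_pigpiod_sample_rate(crontab_text, sample_rate_us=None):
    """Return crontab_text with the pigpiod -s option set or removed.

    sample_rate_us=None removes -s, so pigpiod uses its default (5 us).
    """
    if sample_rate_us is not None and sample_rate_us not in VALID_SAMPLE_RATES_US:
        raise ValueError("sample rate must be one of %s" % (VALID_SAMPLE_RATES_US,))
    option = "" if sample_rate_us is None else " -s %d" % sample_rate_us
    # Keep the "@reboot ... pigpiod" prefix and swap only the option.
    return PIGPIOD_ENTRY.sub(lambda m: m.group(1) + option, crontab_text)


original = "@reboot /usr/local/bin/pigpiod -s 1 &\n"
print(set_pigpiod_sample_rate(original), end="")
print(set_pigpiod_sample_rate(original, 2), end="")
```

The function only transforms text; writing the result back to Mycodo/install/mycodo.crontab and running the update-cron command is still a separate step.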

Alternatively, you can directly edit the cron entry after an upgrade with:

sudo crontab -e

kizniche commented 6 years ago

I have been trying to figure out a good way to make this value settable, but haven't come up with anything that's not overly-complicated yet.

zsole2 commented 6 years ago

So back to my original issue. Here is a comparison of top output over 1 minute:

[Two images]

Removing the -s 1 parameter from pigpiod brought its CPU usage back down from above 20%, but the idle time and influxd time are reversed, which I do not like very much. I admit that there are several major differences between the two Zeros. The main factors I suspect are Jessie vs. Stretch and Python 2 vs. 3, and I guess influxd was also upgraded. I also noticed that influxd uses more CPU time and less memory; see these cutouts from top, ordered by total time:

top - 08:54:14 up 1 day, 13:20,  1 user,  load average: 1,36, 1,61, 1,80
Tasks:  91 total,   2 running,  89 sleeping,   0 stopped,   0 zombie
%Cpu(s): 38,5 us, 15,6 sy,  0,0 ni, 44,9 id,  0,0 wa,  0,0 hi,  1,0 si,  0,0 st
KiB Mem:    493804 total,   446712 used,    47092 free,    35056 buffers
KiB Swap:   102396 total,    11044 used,    91352 free.   132612 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND                                                                                               
  633 influxdb  20   0 1024100 144972   7984 S 11,6 29,4 462:18.94 /usr/bin/influxd -config /etc/influxdb/influxdb.conf                                                  
  841 root      20   0  442768  47112   4732 S 10,0  9,5 278:39.53 /var/www/mycodo/env/bin/python /var/www/mycodo/mycodo/mycodo_daemon.py                                
  377 root      20   0   11432   1336   1192 S  9,0  0,3 179:36.14 /usr/local/bin/pigpiod                                                                                
  149 root      20   0       0      0      0 R  5,5  0,0 174:11.20 [w1_bus_master1]                                                                                      
 6959 mycodo    20   0  189264  71636  23360 S 11,0 14,5  13:49.56 /usr/sbin/apache2 -k start                                                                            
  112 root      20   0    8104   2716   2580 S  0,0  0,6   7:09.33 /lib/systemd/systemd-journald

top - 08:54:06 up 12:53,  1 user,  load average: 1,46, 1,57, 1,50
Tasks:  96 total,   1 running,  95 sleeping,   0 stopped,   0 zombie
%Cpu(s): 91,2 us,  8,1 sy,  0,0 ni,  0,0 id,  0,3 wa,  0,0 hi,  0,3 si,  0,0 st
KiB Mem :   444532 total,    65424 free,   194084 used,   185024 buff/cache
KiB Swap:   524284 total,   521892 free,     2392 used.   177764 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND                                                                                               
  236 influxdb  20   0  820540  69728   9120 S 80,6 15,7 369:29.26 /usr/bin/influxd -config /etc/influxdb/influxdb.conf                                                  
  739 root      20   0  418712  66928   8848 S  8,1 15,1 124:35.44 /var/mycodo-root/env/bin/python /var/mycodo-root/mycodo/mycodo_daemon.py                              
  276 root      20   0   11424   1796   1540 S  6,9  0,4  57:57.66 /usr/local/bin/pigpiod                                                                                
  162 root      20   0       0      0      0 S  0,0  0,0  11:54.80 [w1_bus_master1]                                                                                      
  649 root      20   0   70404  45016   7496 S  0,3 10,1   6:50.83 /home/pi/Mycodo/env/bin/python3.5 /var/mycodo-root/env/bin/gunicorn --workers 1 --worker-class gthre+ 
  108 root      20   0   28784   7116   6888 S  0,0  1,6   2:42.70 /lib/systemd/systemd-journald                                                                         
  489 root      20   0   21900  17304   7104 S  0,0  3,9   0:44.38 /home/pi/Mycodo/env/bin/python3.5 /var/mycodo-root/env/bin/gunicorn --workers 1 --worker-class gthre+ 
  568 www-data  20   0   47400   5616   4156 S  0,0  1,3   0:34.71 nginx: worker process                                                                                 
   68 root      20   0       0      0      0 S  0,0  0,0   0:29.58 [mmcqd/0]                                                                                             
  242 root      20   0   22848   3112   2296 S  0,0  0,7   0:29.23 /usr/sbin/rsyslogd -n                                                                                 
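Since the rows above are ordered by TIME+, the accumulated CPU time can be divided by the uptime to get each process's average share since boot, which is less noisy than the instantaneous %CPU column. A rough sketch, assuming TIME+ is in top's minutes:seconds.hundredths form and that the process has been running for nearly the whole uptime; the helper names are made up for this example. The first listing appears to be the older install (paths under /var/www/mycodo, apache2) and the second the newer one.

```python
def parse_time_plus(value):
    """Convert top's TIME+ field (minutes:seconds.hundredths) to seconds."""
    minutes, seconds = value.split(":")
    return int(minutes) * 60 + float(seconds.replace(",", "."))


def parse_top_row(line):
    """Split one process row of top output into a few named fields."""
    # Columns: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    fields = line.split(None, 11)  # COMMAND may itself contain spaces
    return {
        "pid": int(fields[0]),
        "user": fields[1],
        "res_kib": int(fields[5]),
        # This locale prints decimals with a comma, e.g. "11,6".
        "cpu_percent": float(fields[8].replace(",", ".")),
        "cpu_seconds": parse_time_plus(fields[10]),
        "command": fields[11].strip(),
    }


def average_cpu_percent(cpu_seconds, uptime_seconds):
    """Average CPU share since boot, in percent of one core."""
    return 100.0 * cpu_seconds / uptime_seconds


FIRST_ROW = "  633 influxdb  20   0 1024100 144972   7984 S 11,6 29,4 462:18.94 /usr/bin/influxd -config /etc/influxdb/influxdb.conf"
SECOND_ROW = "  236 influxdb  20   0  820540  69728   9120 S 80,6 15,7 369:29.26 /usr/bin/influxd -config /etc/influxdb/influxdb.conf"
FIRST_UPTIME_S = (24 + 13) * 3600 + 20 * 60  # "up 1 day, 13:20"
SECOND_UPTIME_S = 12 * 3600 + 53 * 60        # "up 12:53"

for row, uptime in ((FIRST_ROW, FIRST_UPTIME_S), (SECOND_ROW, SECOND_UPTIME_S)):
    proc = parse_top_row(row)
    print(proc["pid"], round(average_cpu_percent(proc["cpu_seconds"], uptime), 1))
```

With the influxd rows above, this gives roughly 21% for the first listing and 48% for the second, which is in line with the 20-25% versus about 50% seen in the live readings.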
kizniche commented 6 years ago

Perhaps the increased load of influxdb is due to an increased quantity of measurements to process?

zsole2 commented 6 years ago

I was trying to avoid this bias. The old version is on a production machine, and I did not want to tinker with it too much, but I added the missing rPi sensors, free space, etc. It has an HTU21D sensor, which is replaced on the test system with a DHT22, and it has an additional DS18B20. So that is approximately one additional sensor. All in all, the test system had 8 readings at the default 15 s interval, while the production system had 7 when I recorded the graphs above. I realized that decreasing the data frequency could help, which resulted in #393. Wherever I was able to change the period, I did, and it indeed improved the situation; I'll change the 2 other sensors to a 30 s sampling period. Right now I am reinstalling Mycodo on a new SD card on a different PiZero, since I still have freezing issues and want to investigate further. I'll test the CPU issue further on this new system.
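The effect of the sampling period on the number of measurements is simple arithmetic. A small sketch using the figures from this comment; the function name `points_per_minute` is made up for the example, and it assumes every reading is written once per period:

```python
def points_per_minute(readings, period_s):
    """Measurements written per minute for `readings` values sampled every period_s seconds."""
    if period_s <= 0:
        raise ValueError("period_s must be positive")
    return readings * 60.0 / period_s


print(points_per_minute(8, 15))  # test system, default 15 s period: 32.0
print(points_per_minute(7, 15))  # production system, 15 s period: 28.0
print(points_per_minute(8, 30))  # test system after moving to 30 s: 16.0
```

The one extra sensor adds only about 14% more points, while doubling the period halves them, so the period is the stronger lever.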

zsole2 commented 6 years ago

A little report on this. After setting the sampling periods to at least 30 s everywhere, the system is more responsive now. The 1-minute CPU load is hovering around 2. That seems OK: in some issue I've seen your system at around 0.5-0.7, and I know it is not a Zero, so a factor of 4 is realistic. I was able to confirm that, since I had to put my Mycodo SD card into a Pi3 (I was investigating connection issues and therefore needed wired access), and I was really surprised how much faster everything is. The factor of 4 slowness of the Zero seemed about right, but I did not take any proper measurements, just a quick look at the load values.
So using the Zero has advantages (lower power and, for me especially, the much smaller size), but no one should expect it to be snappy.

To summarize before closing, if later this is revisited

Additional possibilities

kizniche commented 6 years ago

You'll be happy to hear I developed an easy way to switch between different pigpiod sample rates from the UI. I added a new settings section that this fits into.

[Image: screenshot-192 168 0 10-2018-02-10-17-25-36]

kizniche commented 6 years ago

I saw you suggested the option of completely disabling pigpiod, so I added it, since the framework was already there to easily add that option. This feature has been working reliably for me: I can switch back and forth between disabled, 1 µs, and 5 µs by just selecting the option. It restarts the daemon in the process so that queries don't fail while pigpiod is being shut down and restarted with another sample rate.

zsole2 commented 6 years ago

That looks really nice and useful! It almost makes the raspi-config part of the setup unnecessary: the interface options are accessible within Mycodo and the filesystem is automatically expanded on the first start, so only the password change and localization options remain to be set. If you let users change the hostname, a check for a valid name ('a-z', '0-9', '-') could be useful in the code, if it is not there already.
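The hostname check suggested above could look something like this. It is a sketch, not Mycodo code: the name `is_valid_hostname` is hypothetical, and on top of the character set named above it assumes the usual hostname label rules (1 to 63 characters, no leading or trailing hyphen).

```python
import re

# One hostname label: a-z, 0-9 and '-', not starting or ending with '-',
# at most 63 characters (the length and hyphen rules are added assumptions).
HOSTNAME_LABEL = re.compile(r"[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?")


def is_valid_hostname(name):
    """Return True if name is acceptable as a single-label hostname."""
    return HOSTNAME_LABEL.fullmatch(name) is not None


for candidate in ("mycodo-zero", "my_codo", "-pi", ""):
    print(repr(candidate), is_valid_hostname(candidate))
```

Upper-case letters are rejected here to match the character set given above; lower-casing the input first would be an equally reasonable choice.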

zsole2 commented 6 years ago

Checking into raspi-config, the remaining options seem quite easy to implement from the mycodo setup script with simple bash commands:

dpkg-reconfigure locales
dpkg-reconfigure tzdata
sudo raspi-config nonint do_wifi_country %s
(echo \"%s\" ; echo \"%s\" ; echo \"%s\") | passwd

kizniche commented 6 years ago

Here are all the nonint options for raspi-config:

https://github.com/kizniche/Mycodo/blob/bfa9a8c4f3c5217ecd0df2665b3daa45d9c0d0d1/mycodo/mycodo_flask/utils/utils_settings.py#L271
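As an illustration of calling one of these from Python, the do_wifi_country command quoted earlier in the thread can be built as an argument list, which avoids the shell quoting seen in the passwd example. This is a sketch under assumptions: the helper name `wifi_country_command` is made up, and the two-letter check is an added guess at sensible input validation rather than anything taken from Mycodo or raspi-config.

```python
import re


def wifi_country_command(country_code):
    """Build the argv list for setting the Wi-Fi country via raspi-config.

    Hypothetical helper: only constructs the command, it does not run it.
    """
    code = country_code.strip().upper()
    # Assumed validation: a two-letter country code such as "HU" or "US".
    if not re.fullmatch(r"[A-Z]{2}", code):
        raise ValueError("expected a two-letter country code, got %r" % country_code)
    return ["sudo", "raspi-config", "nonint", "do_wifi_country", code]


print(wifi_country_command("hu"))
```

On a Pi, the returned list could be passed to subprocess.run(cmd, check=True) so that a non-zero exit status raises an exception.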

zsole2 commented 6 years ago

I saw you suggested the option of completely disabling pigpiod, so I added it, since the framework was already there to easily add that option. This feature has been working reliably for me: I can switch back and forth between disabled, 1 µs, and 5 µs by just selecting the option. It restarts the daemon in the process so that queries don't fail while pigpiod is being shut down and restarted with another sample rate.

Is it possible to save the setting for pigpiod across upgrades? I forgot to set it today, and my system immediately became less responsive.

kizniche commented 6 years ago

It should save it across upgrades.

kizniche commented 6 years ago

Actually, it may save the µs setting across upgrades, but if it's off it may set it back to 1 µs. I'll look into it. In any case, that will be an easy fix.

kizniche commented 6 years ago

You're correct that I forgot to account for the disabled option persisting after an upgrade. I'll have a fix pushed soon.

kizniche commented 6 years ago

I just added the ability to set the sample rate of Conditional, Input, Math, Output, and PID Controllers. This should give you some options to experiment with to change the CPU load caused by Mycodo. Remember to restart the daemon after saving the settings.

kizniche commented 6 years ago

Once the next release is made, you can find the new settings under Config/Pi Settings.

zsole2 commented 6 years ago

I am following the new release with great interest, but I am quite busy with another project right now, so my fridges are a little less in focus. I am still testing the DS sensors for long-term stability, and then I will rebuild one of the fridges with the new Mycodo to test all the improvements I have in mind, e.g. adding wet bulb and other stuff.