Waziup / wazigate-system

GNU General Public License v3.0

python 100% CPU #2

Open cdupont opened 5 years ago

cdupont commented 5 years ago

I often get a python process with 100% CPU.

PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
12 root      20   0  190584  28144   8652 R 100.0  0.2 158:33.47 python

The logs are:

# tail -f post-processing.log
2019-09-18T21:39:31.777914> no timezone support, time will be expressed only in local time
2019-09-18T21:39:31.778025> post status: start running
2019-09-18T21:39:31.957543> get_gps.py: removing /app/logs/gateway_gps.txt file
2019-09-18T21:39:31.957657> get_gps.py: no /app/logs/gateway_gps.txt file
2019-09-18T21:39:31.980709> post status: dynamic GPS is requested
2019-09-18T21:39:31.980822> post status get GPS: use sensors_in_raspi/get_gps.py to get GPS position
2019-09-18T21:39:31.980909> post status get GPS: no GPS coordinates
2019-09-18T21:39:31.980992> post status: show current GPS position
2019-09-18T21:39:31.981075> post status show GPS: current GPS coordinate: gw lat 43.51 long -1.36
2019-09-18T21:39:31.981157> post status: exiting
2019-09-18T21:49:32.224237> 2019-09-18 21:49:32.224053
2019-09-18T21:49:32.224297> post status: gw ON
2019-09-18T21:49:32.224336> post status: executing periodic tasks
2019-09-18T21:49:32.339429> no timezone support, time will be expressed only in local time
2019-09-18T21:49:32.339512> post status: start running
2019-09-18T21:49:32.479005> get_gps.py: removing /app/logs/gateway_gps.txt file
2019-09-18T21:49:32.479079> get_gps.py: no /app/logs/gateway_gps.txt file
2019-09-18T21:49:32.495893> post status: dynamic GPS is requested
2019-09-18T21:49:32.496001> post status get GPS: use sensors_in_raspi/get_gps.py to get GPS position
2019-09-18T21:49:32.496116> post status get GPS: no GPS coordinates
2019-09-18T21:49:32.496197> post status: show current GPS position
2019-09-18T21:49:32.496270> post status show GPS: current GPS coordinate: gw lat 43.51 long -1.36
2019-09-18T21:49:32.496341> post status: exiting

So nothing much. Any idea where it comes from?

cdupont commented 5 years ago

This happens on my laptop because the LoRa module is built for the wrong architecture. I added an environment variable called "NO_LORA" to avoid starting the LoRa module. Set "NO_LORA=true" to skip LoRa.

480c1cde355e9204480aacf2f5a63a04dcdeebac
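
For readers who want to see the idea, here is a minimal sketch of how a startup script could honor such a variable. It is not the code from the commit above: the helper names `lora_disabled` and `start_services` are hypothetical, and accepting "1" and "yes" in addition to "true" is an assumption made for illustration.

```python
import os


def lora_disabled(env=None):
    """Return True when NO_LORA asks to skip the LoRa module.

    Only "NO_LORA=true" is documented in this thread; treating "1" and
    "yes" as equivalent is an assumption of this sketch.
    """
    env = os.environ if env is None else env
    return env.get("NO_LORA", "").strip().lower() in ("1", "true", "yes")


def start_services(start_lora, env=None):
    """Call start_lora() unless NO_LORA is set; report what happened."""
    if lora_disabled(env):
        return "skipped"
    start_lora()
    return "started"
```

Passing the environment as a parameter keeps the check easy to exercise without touching the real process environment.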

j-forster commented 5 years ago

This error has now happened multiple times on the Raspberry Pi gateway with the LoRa module attached.

/app/data_acq/lora/lora_gateway --mode 1 --freq 865.2 --ndl | python /app/data_acq/lora/post_processing_gw.py | python /app/data_acq/log_gw.py
 * Serving Flask app "api" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: on
mmap: Invalid argument
bcm2835_init: bsc1 mmap failed: Invalid argument
 * Running on http://0.0.0.0:5000/ (Press CTRL+C to quit)
 * Restarting with stat
 * Debugger is active!
 * Debugger PIN: 243-344-980
post_processing_gw.py found an alert_conf section
Starting thread to perform periodic gw status/tasks
2019-10-15 06:58:41.842012
post status: gw ON
post status: executing periodic tasks

Current working directory: /app
Failed to map the physical GPIO registers into the virtual memory space.
post status: start running
post status: dynamic GPS is requested
post status get GPS: use sensors_in_raspi/get_gps.py to get GPS position
get_gps.py: removing /app/logs/gateway_gps.txt file
get_gps.py: no /app/logs/gateway_gps.txt file
get_gps.py: invalid serial port
post status get GPS: no GPS coordinates
post status: show current GPS position
post status show GPS: current GPS coordinate: gw lat 43.51 long -1.36
post status: exiting

cdupont commented 5 years ago

So this line causes the problem, right?

/app/data_acq/lora/lora_gateway --mode 1 --freq 865.2 --ndl | python /app/data_acq/lora/post_processing_gw.py | python /app/data_acq/log_gw.py

I also noticed that it can cause 100% CPU if the Python scripts cannot interpret the output of lora_gateway correctly.
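
One plausible mechanism, stated here as an assumption and not confirmed in this thread: in Python, `readline()` on a pipe returns an empty string once the writer has exited, so a `while True` read loop that never checks for it will spin at 100% CPU after lora_gateway dies. The sketch below shows a read loop that stops at EOF and skips lines it cannot interpret. The names `process_lines` and `handle` are hypothetical and do not come from post_processing_gw.py.

```python
import io
import sys


def process_lines(stream, handle):
    """Read lines until EOF; skip lines that handle() cannot interpret.

    Returns the number of lines handled successfully.
    """
    handled = 0
    while True:
        line = stream.readline()
        if line == "":
            # EOF: the writer closed the pipe. Without this check the
            # loop would keep getting "" immediately and burn CPU.
            break
        line = line.rstrip("\n")
        try:
            handle(line)
            handled += 1
        except ValueError:
            # Uninterpretable input: report it and keep reading.
            sys.stderr.write("skipping malformed line: %r\n" % line)
    return handled
```

In the real pipeline `stream` would be `sys.stdin`; an `io.StringIO` object works as a stand-in when trying the loop out.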

cdupont commented 5 years ago

The inter-process communication could probably be improved: https://www.nginx.com/blog/building-microservices-inter-process-communication/ What do you think? In particular, I think we could remove the second pipe: there is no need for IPC between two Python programs, since they can import and call each other. The first pipe could be replaced by some synchronous, point-to-point communication with a simple API.
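
A rough sketch of what that could look like, under the assumption that the logging step can be exposed as an ordinary Python callable: one Python process starts lora_gateway itself, reads its stdout, and calls the handler directly, so no shell pipe is needed between the two Python programs. The name `run_gateway` is hypothetical, and the child process in the demo is only a stand-in for lora_gateway.

```python
import subprocess
import sys


def run_gateway(cmd, on_line):
    """Start cmd, pass each stdout line to on_line, return the exit code."""
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, universal_newlines=True
    )
    # Iterating blocks until data arrives and ends cleanly at EOF,
    # so the loop cannot spin once the child has exited.
    for line in proc.stdout:
        on_line(line.rstrip("\n"))
    proc.stdout.close()
    return proc.wait()


if __name__ == "__main__":
    # Stand-in child; on the gateway the command would be something like
    # ["/app/data_acq/lora/lora_gateway", "--mode", "1", "--freq", "865.2", "--ndl"].
    demo = [sys.executable, "-c", "print('packet 1'); print('packet 2')"]
    code = run_gateway(demo, lambda line: sys.stdout.write(line + "\n"))
    sys.stdout.write("child exited with %d\n" % code)
```

Because the parent sees the child's exit code, it can decide whether to restart lora_gateway or stop, which the shell pipeline does not make easy.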

j-forster commented 5 years ago

Running htop shows that it is post_processing_gw.py:

CPU%  MEM%  Command
100.  1.9   python /app/data_acq/lora/post_processing_gw.py
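
To narrow down where inside the script the CPU is being spent, one option is the standard-library `faulthandler` module, which can dump the traceback of every thread when the process receives a signal. This is only a sketch: it assumes the script runs under Python 3 on a Unix system (the thread does not say which Python version the gateway uses), and the helper names `enable_stack_dump` and `request_dump` are hypothetical.

```python
import faulthandler
import os
import signal


def enable_stack_dump(out_file, signum=signal.SIGUSR1):
    """Dump all thread tracebacks to out_file whenever signum arrives.

    out_file may be a file object or a file descriptor and must stay
    open for as long as the handler is registered.
    """
    faulthandler.register(signum, file=out_file, all_threads=True)


def request_dump(pid, signum=signal.SIGUSR1):
    """Ask the process with the given pid to dump its tracebacks."""
    os.kill(pid, signum)
```

With `enable_stack_dump(sys.stderr)` called at startup, sending SIGUSR1 to the busy process (PID 12 in the first `top` output) a few times would show which lines keep appearing at the top of the stack.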