robotastic / trunk-recorder

Records calls from a Trunked Radio System (P25 & SmartNet)
GNU General Public License v3.0

High CPU - Parent process #207

Closed FarvaTechnology closed 4 years ago

FarvaTechnology commented 5 years ago

ISSUE
SUPER HIGH CPU consumption on the parent process.
1) 200+% on a quad core processor. In other words, 50% of total CPU.
2) Only happens on the parent process.
3) With my config, there are about 300+ child processes.
4) Issue has been tracked down to: main.cc usleep(1000 * 10);

ENV:
Ubuntu 18.04
Linux 4.15.0-43-generic #46-Ubuntu SMP Thu Dec 6 14:45:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Dell T30 Quad Core Xeon E3-1225 (default BIOS)
7 SDR radio cards.

CONFIG
7 SDR/RTL cards w/ driver: osmosdr
Signal for all frequencies is excellent.
Only 6 digital recorders per instance, so it should be 7 * 6 = 42 recorders.
Total talk group entries: about 40-70 per agency.
4 agencies (towers) total, but only 2 geo areas (AKA 2 counties).

So here is the deal: Whenever I fire up my p25... It SKY ROCKETS CPU use - Even when there is no radio traffic.

Performing an strace showed me that usleep was causing a nanosleep. So I replaced usleep with: boost::this_thread::sleep(boost::posix_time::milliseconds(100)); - But didn't see any improvement.

Stracing the processes clearly shows a TON of activity related to sleep.

All child threads show 0.0 - 2.0 CPU use (on quad core).

QUESTION Why is the parent process killing CPU?

How do we fix this?

Can this wait/sleep interval be increased? What is the purpose of this wait?

robotastic commented 5 years ago

First off - that is an impressive setup! well done!

Do you know if a recent change triggered this? I am running 16.04 with 5 rtl-sdrs and haven't had problems.

Background: The main process has a loop. Basically, it checks if there is a new trunk message that has come in. If not, it goes to sleep for a sec. There are probably lots of Sleep calls, but it shouldn't be a lot of CPU usage from it. It could show up as CPU time though.
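
For anyone following along, the loop described above can be sketched roughly like this. It is only a minimal illustration, not the actual main.cc code: MessageQueue, LoopStats and poll_messages are hypothetical names, and the real loop parses trunking messages instead of just popping them.

```cpp
#include <chrono>
#include <cstddef>
#include <queue>
#include <string>
#include <thread>

// Hypothetical stand-in for the trunking message queue (not the project's real type).
using MessageQueue = std::queue<std::string>;

struct LoopStats {
    std::size_t handled = 0;  // messages processed
    std::size_t sleeps = 0;   // passes that found the queue empty and slept
};

// Runs `iterations` passes of a poll-then-sleep loop.
// Each pass handles one message if one is waiting, otherwise sleeps for `idle`.
LoopStats poll_messages(MessageQueue& queue, std::size_t iterations,
                        std::chrono::milliseconds idle) {
    LoopStats stats;
    for (std::size_t i = 0; i < iterations; ++i) {
        if (!queue.empty()) {
            queue.pop();  // a real loop would parse the message here
            ++stats.handled;
        } else {
            std::this_thread::sleep_for(idle);
            ++stats.sleeps;
        }
    }
    return stats;
}
```

With a queue that is usually empty, almost every pass ends in a sleep call, which is why strace on the parent shows so many of them.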

To get a better sense of CPU usage, try: top -H -p PID-of-RECORDER

FarvaTechnology commented 5 years ago

During research... There may be kernel concerns and nanosleep. Something to do with high precision timers. :(

This is a NEW build... Moved from lubuntu on 2ghz laptop Celeron single core TO ubuntu 18.x 3ghz quad core Xeon.

Strange part is... It's only the parent process.

Try running htop and see what your parent pid is showing for CPU.

I am curious if I must downgrade my OS.


FarvaTechnology commented 5 years ago

All,

I ran through several different variations of sleep: usleep, nanosleep, boost sleep methods and the like.

The below sleep statement has helped me recover nearly 30% of the CPU that was being consumed. Because with other sleep/wait methods the CPU was running around 200+% on a quad core.

This helped me realize about 170-180%CPU on a quad.

  std::this_thread::sleep_for(std::chrono::nanoseconds(10));//10 milli = 10,000 micro

The above fix may NOT be helpful for single core processors - But in a multi core ENV - It has helped me.
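
One thing worth double-checking before copying the line above: as written it asks for 10 nanoseconds, while the trailing comment talks about 10 milliseconds, which is what usleep(1000 * 10) requests. The short sketch below just does the unit conversion with std::chrono; to_ns and the two constants are names made up for this illustration.

```cpp
#include <chrono>

// Converts any std::chrono duration to a whole number of nanoseconds.
template <class Rep, class Period>
constexpr long long to_ns(std::chrono::duration<Rep, Period> d) {
    return std::chrono::duration_cast<std::chrono::nanoseconds>(d).count();
}

// usleep(1000 * 10) asks for 10,000 microseconds, i.e. 10 milliseconds.
constexpr long long kUsleepRequestNs = to_ns(std::chrono::microseconds(1000 * 10));

// The sleep_for statement quoted above asks for 10 nanoseconds.
constexpr long long kQuotedRequestNs = to_ns(std::chrono::nanoseconds(10));

static_assert(kUsleepRequestNs == 10000000LL, "10 ms is 10,000,000 ns");
static_assert(kUsleepRequestNs / kQuotedRequestNs == 1000000LL,
              "the two requests differ by a factor of one million");
```

So the two variants are not equivalent sleeps, and any CPU difference between them may come from the interval rather than from the API used. std::chrono::milliseconds(10) would be the like-for-like replacement.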

I am still running around 170% CPU on the parent process. But from what I have seen, it appears that approximately 20% of the CPU happens for EACH bank of frequencies I have in my json config file (aka each SDR card).

Since I am running 7 SDR cards at once - It makes sense that 20-25% CPU for each card or bank of recorders would add up to around 180% CPU.

I am curious though - I have read that the Xeon processor I am using has some speciality stuff within the BIOS for processing encryption and stuff like that. It even has failsafe mechanisms to protect data for things like banking applications that wouldn't be found in a consumer grade computer processor.

I am VERY curious if turning off some of those settings in the BIOS will help me realize some realistic numbers for CPU consumption.

If anyone has any input on how to get this reduced - Please let me know.

CPU Type is Xeon E3-1225

FarvaTechnology commented 5 years ago

UPDATE: Spoke to a Dell engineer today - They "officially" support Ubuntu 16.04 64 bit for the E3-1225 T30 server. I broke down my old server and did a fresh install... And the excessive CPU usage still exists.

With that said - I don't believe this is a kernel issue. But I am still not sure.

I will review some of the BIOS settings - Because I believe this processor has special settings for encoding, encryption cracking and banking data verification. Where I believe any of these "special" processor features could be causing excessive CPU.

If anyone has input that may help - Please update ticket so I can explore those paths to success.

robotastic commented 5 years ago

Were you able to try running ‘ps -H -p (PID of recorder)’? That will tell you what process is using up the CPU cycles. I think strace may be misleading in this case. Could you try reducing the number of RTLSDRs? Is there a dramatic change at a certain number or does it decline linearly?


kcwebby commented 5 years ago

I run 8 SDRs in all of my recorders (five). I only have high usage on one of them, and it's the most hectic site that I have. I don't think it's related to the RTL-SDRs... I'm following this because I do fight CPU usage from time to time on that machine.


FarvaTechnology commented 5 years ago

Here is everything I have so far - If anyone has any input - Please let me know:

Overview I am seeing between 170-200% CPU use on the e3-1225 quad core processor. I am not sure if this is normal.

Issue appears to be related to main.cc->monitor_messages() and usleep(10,000 microseconds). There is a while loop there with a rapid loop with 10 millisecond waits when no message is present.

I am not sure if this wait / sleep can be increased... A constant loop with 10 millisecond sleeps seems aggressive to me... But smarter folks than myself have created this code. :)

See the bottom of this message for screen shot differences of using usleep and std::thread. For me - I realized a reduction from 227%CPU to 173% CPU by switching from usleep(10,000) to: std::this_thread::sleep_for(std::chrono::nanoseconds(10));//10 milli = 10,000 micro

BUT - 173% CPU is still high for me. Each core is rocking out 40-50% CPU for a 7 card system that is monitoring 3 rural p25 sites.

My settings My recorder config has 7 active cards. No errors, everything configured and working correctly.

No error rates from rtl_test.

Reception area is GREAT with roof 800mhz antenna - So there shouldn't be any issues with retuning failures or the like.

My config file has 7 sections for frequencies and monitors 3 p25 systems.

From what I can tell - Firing up recorder shows CPU use of about 15-20% per instance of a "section" within the config.

So... If each section aka card config setting consumes about 20% CPU. Then it's realistic that 7 cards X 20% would equal 140-200% CPU that I am seeing.

My total number of recorders per section is: 6 recorders per section 6 X 7 = 42 total recorders

My system is Quad core E3-1225 Xeon Processor 8gb of memory

Operating systems Tried 18.04 - Had issues. Currently using 16.04 since that is the "official" supported OS by Dell for this server (T30) - Same issues.

strace
strace of the spawned (child) recorders shows different activity than the parent PID. The parent PID is showing TONS of sleeps - These are coming from the main.cc usleep code located inside monitor_messages - while loop, else block.

Analog or Digital? All systems are p25. Code related to this would be: p25_parser->parse_message(msg) - Near the else statement.

What is being monitored
Police/Fire and other traffic across two counties is being processed.
Total talk groups are: 229 spread out across 3 systems.
75 TGs - System 1
77 TGs - System 2
77 TGs - System 3

Things tried: I have altered the main code - Tried using usleep, nanosleep as well as calls to boost's sleep methods. There is an improvement of about 20% reduced CPU by using the boost sleep method. boost::this_thread::sleep( boost::posix_time::milliseconds(10) );

Downgraded from ubuntu 18.x to 16.x to "officially" use the "certified" operating system that Dell recommends for the T30 server. Same performance concerns.

================== Below are details/facts of what I have

How is recorder started?
Recorder is started via systemctl with the following config:

[Unit]
Description=p25 Recorder
StartLimitIntervalSec=0

[Service]
Type=simple
Restart=always
RestartSec=3
User=CENSORED
ExecStart=/home/CENSORED/recorder/recorder_start.sh

What is inside recorder_start?

#!/bin/bash
echo "CENSORED Recorder"
# NOTE: The next line is here to resolve memory issues with RTL cards when restarting the service. It was discovered that sometimes they hang on to memory.
sync; echo 3 > /proc/sys/vm/drop_caches
sleep 1
/home/CENSORED/recorder/recorder --config=/home/CENSORED/recorder/CENSORED.json > /home/CENSORED/recorder/logfile.log 2>&1

What does systemctl say?
systemctl status p25recorder

p25recorder.service - p25 Recorder
   Loaded: loaded (/etc/systemd/system/p25recorder.service; static; vendor preset: enabled)
   Active: active (running) since Thu 2019-01-03 15:29:35 EST; 2h 32min ago
 Main PID: 11697 (recorder_start.)
    Tasks: 694
   Memory: 142.8M
      CPU: 5h 39min 11.391s
   CGroup: /system.slice/p25recorder.service
           ├─11697 /bin/bash /home/CENSORED/recorder/recorder_start.sh
           └─11701 /home/CENSORED/recorder/recorder --config=/home/CENSORED/recorder/CENSORED.json

Jan 03 15:29:35 CENSORED01 systemd[1]: Started p25 Recorder.
Jan 03 15:29:35 CENSORED01 recorder_start.sh[11697]: CENSORED Recorder

What is showing high CPU? top, htop... Here is a partial snippet TOP for the PARENT PID. We can see that server load average is over 4.0 (quad core processor) - So this is bad. Also - It is showing 693 threads for the parent recorder.

Some snippets:
ps -eo pcpu,pid,user,args | sort -k 1 -r | head -10

%CPU   PID USER  COMMAND
 222 11701 root  /home/CENSORED/recorder/recorder --config=/home/CENSORED/recorder/CENSORED.json
 2.2 12437 farva htop
 0.5  8374 farva /usr/lib/gnome-terminal/gnome-terminal-server
 0.3  8096 farva /usr/bin/gnome-shell
 0.3  7828 farva /usr/lib/xorg/Xorg vt2 -displayfd 3 -auth /run/user/1000/gdm/Xauthority -background none -noreset -keeptty -verbose 3

Data was obtained by: top -H -p [parentPID] NOTE NOTE - The below info IS ONLY FOR THE PARENT PID OF RECORDER. So all 693 threads are linked to it.

Another oddity here - It is showing a CPU use of 41.4 percent here. I am not sure if it's showing just ONE core or giving us an average of all 4 cores. In theory, a 4 core processor has 400% CPU - So I am not sure what top is doing here. If top is run alone or htop is run alone - It clearly shows recorder consuming about 200% CPU and EACH CORE consuming around 50% CPU.

top - 18:05:24 up 14:34, 3 users, load average: 6.06, 6.16, 6.10
Threads: 693 total, 13 running, 680 sleeping, 0 stopped, 0 zombie
%Cpu(s): 41.4 us, 14.5 sy, 0.0 ni, 44.0 id, 0.1 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 8038548 total, 6483448 free, 888240 used, 666860 buff/cache
KiB Swap: 8253436 total, 8253436 free, 0 used. 6527744 avail Mem

  PID USER PR NI    VIRT    RES   SHR S %CPU %MEM    TIME+ COMMAND
11701 root 20  0 8078604 145188 60344 S  7.0  1.8 11:06.98 recorder
12084 root 20  0 8078604 145188 60344 S  6.6  1.8  0:49.18 p25_frame_ass49
11718 root 20  0 8078604 145188 60344 S  5.6  1.8  9:14.31 fix_cc892
12210 root 20  0 8078604 145188 60344 R  5.6  1.8  8:47.63 fix_cc152
12105 root 20  0 8078604 145188 60344 S  5.3  1.8  8:33.51 fix_cc300
12107 root 20  0 8078604 145188 60344 S  5.0  1.8  7:13.02 fft_filter_c104
11719 root 20  0 8078604 145188 60344 S  4.6  1.8  7:09.33 fft_filter_c105
12089 root 20  0 8078604 145188 60344 S  4.6  1.8  2:23.09 fft_filter_cc45
12212 root 20  0 8078604 145188 60344 R  4.6  1.8  7:11.96 fft_filter_c107
12011 root 20  0 8078604 145188 60344 S  4.3  1.8  6:10.10 fix_cc448
12315 root 20  0 8078604 145188 60344 S  4.0  1.8  6:11.52 fix_cc4
11823 root 20  0 8078604 145188 60344 S  3.6  1.8  6:03.06 fix_cc744
11917 root 20  0 8078604 145188 60344 S  3.6  1.8  6:08.47 fix_cc596
12074 root 20  0 8078604 145188 60344 S  3.6  1.8  0:50.39 fft_filter_cc47
11723 root 20  0 8078604 145188 60344 S  2.6  1.8  3:57.70 feedforward_106
11761 root 20  0 8078604 145188 60344 S  2.6  1.8  3:50.15 copy982
11776 root 20  0 8078604 145188 60344 S  2.6  1.8  3:56.62 copy958
11806 root 20  0 8078604 145188 60344 S  2.6  1.8  4:06.90 copy910
11994 root 20  0 8078604 145188 60344 S  2.6  1.8  3:30.34 copy614
12268 root 20  0 8078604 145188 60344 R  2.6  1.8  3:42.64 copy218
11724 root 20  0 8078604 145188 60344 S  2.3  1.8  2:57.89 gardner_cost106
11731 root 20  0 8078604 145188 60344 R  2.3  1.8  3:37.83 copy1030
11746 root 20  0 8078604 145188 60344 S  2.3  1.8  3:43.37 copy1006
11791 root 20  0 8078604 145188 60344 S  2.3  1.8  4:01.51 copy934
11824 root 20  0 8078604 145188 60344 S  2.3  1.8  3:07.68 copy882
11870 root 20  0 8078604 145188 60344 S  2.3  1.8  3:15.89 copy810
11885 root 20  0 8078604 145188 60344 S  2.3  1.8  3:20.10 copy786
11918 root 20  0 8078604 145188 60344 S  2.3  1.8  3:14.75 copy734
... Truncated for sharing.

Difference between usleep and using boost or std::thread sleep Substantial difference in CPU use. But still not what I would expect.

[Screenshot: usleep_new]

[Screenshot: thread_sleep_with_chrome_nano_seconds_new]

robotastic commented 5 years ago

Thanks for running top... recorder should definitely not be chewing up that much CPU. On my quad I7, I never see the main thread consuming any noticeable CPU time. Let me do some googling... Maybe there is some weird compiler thing. It does seem like the switch to boost for sleep is worth it though, I will go test that out.

robotastic commented 5 years ago

Another thing to note is variable clock speed. With frequency scaling, my processor will idle down to 800MHz and my CPU utilization will jump up to 200% of 800%, but that is for an idled processor. Try the cpupower program to see what the processor speed is.

robotastic commented 5 years ago

PS - 10 milliseconds may be the wrong value. It could be fine to go to 100ms. I would try playing with the number. Each P25 system should be generating ~40 messages per second, and the sleep is only called when there are no messages left on the queue to be processed.
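
To put rough numbers on that, here is a small back-of-the-envelope sketch. Both helpers are hypothetical and only do the arithmetic: the first gives an upper bound on how often the loop can wake up if every pass finds the queue empty, the second gives the average gap between messages at a given rate.

```cpp
// Upper bound on loop wake-ups per second if every pass finds the queue empty
// and sleeps for `sleep_ms` milliseconds.
constexpr double max_wakeups_per_second(double sleep_ms) {
    return 1000.0 / sleep_ms;
}

// Average gap between messages, in milliseconds, at `msgs_per_second`.
constexpr double message_gap_ms(double msgs_per_second) {
    return 1000.0 / msgs_per_second;
}
```

A 10 ms sleep allows at most 100 wake-ups per second and a 100 ms sleep at most 10, while a single system at about 40 messages per second delivers one message roughly every 25 ms. A longer sleep means fewer wake-ups but lets more messages queue up between passes.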

FarvaTechnology commented 5 years ago

Luke... Sounds good. I honestly believe it may have something to do with the Xeon processor.

I3,5,7 Celeron and most other CPUs are "retail" type CPUs. The Xeon is mostly for business and it has special stuff for encryption and verifying data processed.

I will also see if there is a way to disable those features in BIOS.


robotastic commented 5 years ago

I did find this: https://stackoverflow.com/questions/16726872/cpu-high-usage-of-the-usleep-on-cent-os-6-3 I wonder if some different kernel flags get set for a Xeon. There are also different CPU freq management plans, maybe one of those might help ramp the freq up to lower utilization. Still... sleep should not be using that much CPU.

I will keep digging, just came across this too: https://serverfault.com/questions/738097/high-cpu-caused-by-pthread-cond-wait-or-nanosleep

kg6uyz commented 5 years ago

I'm running trunk-recorder 3.0.1, and the main process CPU usage for me is on average 250 percent or more. I run 5 SDR dongles: 7 digital recorders on the first SDR, 5 analog recorders on the second, 4 analog recorders on the third, 2 analog recorders on the fourth and 2 analog recorders on the fifth, so 20 total. I used to run an additional 4 analog recorders on the fourth dongle, but I don't hear that site very well, so I figured I was using resources I didn't need to. The main process CPU usage was much higher then.

[Screenshot: trunk-recorder_parent_process_]

kg6uyz commented 5 years ago

I'm running trunk-recorder on Ubuntu 16.04 on an I5-2400S.

FarvaTechnology commented 5 years ago

ALL - 1st of all - THANK YOU SO MUCH for following this ticket and sharing your configs! It really means a great deal to me that we have amazing folks supporting each other! We get faster support here than with most paid services that keep you on hold forever.

Some updates on my end:
1) I disabled ALL things BIOS related that could possibly impact things.
Disabled all virtualization.
Disabled all speed enhancement settings related to boosting, hyper, etc.
Confirmed that all security features related to CPU were disabled.

2) I tried altering the milliseconds of wait by multiples of 4. 10/40/80 - I see a little change. I am up to 80 milliseconds using std::this_thread::sleep_for(std::chrono::milliseconds(80)) and also trying the original usleep of 8000 X 10.

I will keep at this until I find the issue. If anyone has ideas, theories, etc. I am willing to try them.

Thanks again everyone! YOU ROCK!

FarvaTechnology commented 5 years ago

Here is a snippet of an strace with usleep(80,000) aka 80,000 microseconds = 80 milliseconds

I am not familiar w/ usecs per call... But based on my research... It appears that this count is rather high.

Command: strace -T -c -p [PARENT RECORD PID] - One that is running near 200%.

Using: usleep(80000);

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 99.00    0.555200       32659        17           nanosleep
  1.00    0.005600        5600         1           restart_syscall
------ ----------- ----------- --------- --------- ----------------
100.00    0.560800                    18           total

Using: std::this_thread::sleep_for(std::chrono::milliseconds(80));

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 93.80    0.544672       60519         9           nanosleep
  6.20    0.036000       36000         1           restart_syscall
  0.00    0.000000           0         2           write
------ ----------- ----------- --------- --------- ----------------
100.00    0.580672                    12           total

Using: boost::this_thread::sleep( boost::posix_time::milliseconds(80) );

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 96.30    0.580000       23200        25        12 futex
  3.70    0.022265       22265         1         1 restart_syscall
------ ----------- ----------- --------- --------- ----------------
100.00    0.602265                    26        13 total
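
For anyone else unfamiliar with the usecs/call column: in the tables above it is simply the seconds column divided by the number of calls, expressed in microseconds. A minimal sketch of that arithmetic follows. usecs_per_call is a made-up helper name, and rounding to the nearest microsecond is an assumption that happens to match the figures shown.

```cpp
#include <cmath>

// Recomputes the "usecs/call" column of an strace -c summary row from its
// "seconds" and "calls" columns, rounded to the nearest microsecond.
inline long usecs_per_call(double seconds, long calls) {
    return std::lround(seconds * 1e6 / static_cast<double>(calls));
}
```

For example, usecs_per_call(0.555200, 17) gives 32659, matching the nanosleep row of the usleep(80000) run. Note the call counts differ between runs (17, 9 and 25), so the per-call averages are not directly comparable without knowing how long each trace ran.
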
FarvaTechnology commented 5 years ago

FIX FIX FIX This reduced my CPU from 200% to about 60%!!!!

Ok guys and gals - I am not sure what other impact this will have - But I found an article where a guy did a bunch of timing tests using a 3ghz system in a multi-core CPU w/ the whole linux sleep concern. Source: http://charette.no-ip.com:81/programming/2013-08-05_Sleeping/

He recommended this:
sudo apt-get install cpufrequtils
sudo cpufreq-set -r -g performance

NOTE: The above will be reset on reboot! In order for me to have success, I whacked out the default start-up for cpufreq-set (/etc/init.d/cpufreq) and added a new service called p25cpuadjust.service to systemctl. This points to a script called p25cpuadjust.sh.

/etc/systemd/system/p25cpuadjust.service

[Unit]
Description=p25cpuadjust
StartLimitIntervalSec=0
After=p25recorder.service

[Service]
Type=simple
User=root
ExecStart=/home/CENSORED/recorder/p25cpuadjust.sh

[Install]
WantedBy=multi-user.target

SCRIPT

#!/bin/bash
echo "p25 Recorder - Adjusting CPU wait"
sleep 90
echo "p25 Recorder - Adjusting CPU"
/usr/bin/cpufreq-set -r -g performance

ENABLE SERVICE FOR STARTUP
systemctl enable p25cpuadjust.service

Since my initial p25recorder is started via systemctl (FIRST) - I simply added the AFTER tag inside the p25cpuadjust.service so it will start AFTER my p25recorder.service has fired up.

Inside the p25cpuadjust.sh script - I wait about 90 seconds to allow the original p25recorder to startup and initialize all the frequencies and talk groups. THEN - It fires off the cpuadjustment.

I found that if I tried to cpuadjust BEFORE p25recorder was started - It had odd results. Sometimes CPU would be set at 800mhz instead of 3.6ghz (OUCH!)

This is hacky... But it was the only way I could get it to work consistently.

There are other cpufreq settings like: ondemand, powersave, etc. But "performance" is the ONLY setting that fixed my issue. Setting it to powersave resulted in high cpu again.

If you want to test this... You can have your p25recorder up and running and issue the "/usr/bin/cpufreq-set -r -g performance" command and it will adjust your CPU "on the fly" so you can see if it works in your instance.

If you want to confirm your CPU speed... Simply run "cpufreq-info"
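
If you would rather check this from code than from cpufreq-info, the kernel usually exposes the same information as small text files under sysfs. The sketch below assumes the common /sys/devices/system/cpu/cpuN/cpufreq/ layout, which may be missing in a VM or when no cpufreq driver is loaded. read_first_line and cpufreq_path are hypothetical helper names.

```cpp
#include <fstream>
#include <string>

// Reads the first line of a small text file; returns "" if it cannot be opened.
inline std::string read_first_line(const std::string& path) {
    std::ifstream in(path);
    std::string line;
    if (in) {
        std::getline(in, line);
    }
    return line;
}

// Builds the assumed sysfs path for CPU `cpu` and cpufreq attribute `name`,
// e.g. "scaling_governor" or "scaling_cur_freq" (the latter is reported in kHz).
inline std::string cpufreq_path(int cpu, const std::string& name) {
    return "/sys/devices/system/cpu/cpu" + std::to_string(cpu) + "/cpufreq/" + name;
}
```

Calling read_first_line(cpufreq_path(0, "scaling_governor")) should then return the active governor name for the first core, or an empty string if the file is not there.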

Results: I executed the above and my CPU dropped DRAMATICALLY!!!! And this is using the default usleep(1000 * 10) code that already exists within recorder source main.cc.

I was able to add 3 more systems (THAT IS 3 more p25 systems)!!!! So by implementing this hack/fix, I was able to squeeze a TON more resources out of my server! And that is using the default recorder source code! :)

With this fix - Here is what we have:
7 card system w/ 10 digital recorders per card.
Monitoring 6 p25 networks.
348 Talkgroups total.
E3-1225 Xeon CPU - Quad core
Average CPU is now: 110% (it was about 65% CPU w/ 3 p25 systems)
Total threads: 1,204
Memory used: 500mb/8gb
Load Average: 3.00 (4.0 is my max)

Related articles: https://lenovopress.com/lp0826.pdf

https://www.linuxdays.cz/2017/video/Giovanni_Gherdovich-CPU_frequency_scaling.pdf

https://askubuntu.com/questions/410860/how-to-permanently-set-cpu-power-management-to-the-powersave-governor

Other Notes Article https://lenovopress.com/lp0826.pdf spoke about "C-State" and although the T30 Dell server has C-State defined within the BIOS - Turning it on/off had no effect on the high cpu issue I encountered.

Summary
Please research this, guys, before implementing it on your system. If this turns out to be the "fix" for the high CPU use on linux - This may be a nice addition to the installation instructions if we can find an easier way to set it up.

Thanks again folks! Your assistance in this matter was amazing!

robotastic commented 5 years ago

Awesome!! That makes a lot of sense... your CPU was throttling back the CPU freq and as a result, the % CPU usage was up. Best to think of it as (CPU Freq / Usage) or something like that. I keep mine in "power-save" mode instead of Performance, to lower power usage and keep the fan off.

Try running the same 6 systems in Power-Save mode. It should auto scale the Freq and handle things just as well. It just opts to keep the CPU freq a little lower and CPU utilization a little higher.

This all still doesn't explain all the nano_sleep syscalls. I am going to look into this more.

One last thing to check, can you look at the number of control channel messages you are decoding for each system? It should be about 38 msgs per second. Even with good audio decode you can have low control channel decode.

To get this, add "controlWarnRate": -1 to the top level of config.json

If the Message rate is low, sleep will be called a lot. With 6 systems though, there should be tons of messages.

FarvaTechnology commented 5 years ago

robo - Thanks again! You guys rock!

I tried the powersave mode - And the CPU just shot back up to 200+% again. :(

According to Dell - The BIOS is supposed to handle a ton of this stuff. And even though ubuntu is "certified" for this server (16.04 only) - It appears that some of the BIOS settings don't work inside Ubuntu for some reason. One would think it wouldn't matter.

As far as the nanosleep and Linux - There are a TON of articles out there of people having issues with the newer Kernels and sleep in some multi-core ENVs. Especially on GHZ systems.

One article I found said that the source code needs to test for the clock speed of the CPU and adjust its coding to map the correct microseconds or milliseconds for sleep applications. There were even some articles talking about testing between Linux and Windows and how that differs.

ONLY ME would elect to use a Xeon processor! hahaha

But hey! At least we found a hack.

Thanks again to everyone! Your help is amazing.

Lets keep this ticket open a few more days to see how everything runs... And if this appears to be a solution, we can close it out if that is good with you guys.

FarvaTechnology commented 5 years ago

STATUS UPDATE

All... The addition of the cpufreq tools fixed my problem! I have successfully run several days with AMAZING server efficiency!!!!

Before installing cpufreq tools - I had 200+ % CPU (4 core) and I was only able to monitor two p25 systems. Even then my average server load was nearly 6-8 (way above the 4.0 cutoff).

After adding cpufreq tools - I was able to add several more p25 systems. And I have consistently been running at 100% CPU or so (out of 400%) and server load is 1.96 on the 15 minute average scale!

With this increased performance... I could probably squeeze in 2-3 more p25 systems and stay under my 4.0 server average threshold.

The only note is... YOU MUST RUN CPUFREQ TOOL AFTER trunk-recorder is up and running.

I would consider this ticket closed.

bctrainers commented 5 years ago

Interesting findings there. For me, I use a HackRF One to monitor a singular P25 system. So there's no array of RTLSDR's going on here. It is easily able to cover the entire swath of spectrum that the P25 system is configured to use - and then some. I too, have the moderate to high CPU usage, only with 10 recorders configured and ready from the single HackRF. Anything less with the CPU cores, I end up seeing OOOOOOOOOOO's all over the place in the console. Decode rates are quite bouncy.... from 24 to 40/sec.

For this VM, I've thrown all 12 available cores at this particular installation. As you can see below, it's having a joyride with all the cores.

VM: [Screenshot: mobaxterm_2019-01-07_23-39-32]
VM Host: [Screenshot: mstsc_2019-01-07_23-42-16]

Now to find something like cpufreq that works on a VMWare ESXi/Workstation system......