ccMSC / ckb

RGB Driver for Linux and OS X
http://forum.corsair.com/v3/showthread.php?t=133929
GNU General Public License v2.0

Periodic high latency while CKB is open #67

Open zchristopoulos opened 9 years ago

zchristopoulos commented 9 years ago

This has been bugging me the past week or so, but I'm glad I finally found the culprit.

Whenever I have CKB open and running, I have periodic (every 5-10 seconds) latency spikes that take my ping to google.com from ~25 ms to ~150-175 ms for about 1 second.

The moment I close ckb, the latency stays at a fairly steady ~25 ms. Any ideas what may be causing this?

ccMSC commented 9 years ago

My guess would be the automatic firmware update, which will re-download the list of firmwares periodically. You can disable it from the settings screen.

zchristopoulos commented 9 years ago

Tried disabling it and I'm still getting latency spikes.

ccMSC commented 9 years ago

Is your internet connection USB-based in any way? ckb itself doesn't access the internet, apart from firmware updates. Only thing I can think of is if the USB messages are interfering with the connection somehow. Does the CPU usage seem abnormally high (should be below 10%)?

zchristopoulos commented 9 years ago

Nope. One thing I should note though, this is /only/ while wireless. If I use my thunderbolt to ethernet adapter, the latency issue goes away. CPU usage is normally below 10%. Had one instance a while ago where ckb-daemon had a fairly high cpu usage for a little bit.

ccMSC commented 9 years ago

Interesting...which specific model is this? Also, does it happen only when you have animations running, or does the problem always happen when ckb is running?

Buffalox commented 9 years ago

I'm not sure, but this may be related: I noticed a pretty consistent 10% speed drop in Cinebench 11.5 with CKB in the tray or open. Occasionally the drop doesn't happen at all, and I can't figure out why, but it seems that starting CKB after Cinebench prevents the problem. I now have a hotkey for starting and killing CKB.

I later measured the framerate of a game I'm currently playing (Path Of Exile), and my framerate dropped from almost 200 to about 50! That was when starting CKB after the game. Normally if I have Sync enabled it stays steady at 60 fps.

Both programs run through Wine 32 bit, I run Cinebench on 1.4 and POE on latest beta. I haven't tested yet for native or 64 bit programs. I run Manjaro testing 64bit on 4.0.4 Kernel. I use Nvidia proprietary driver.

Otherwise CKB is absolutely awesome, great tool and great installation guide. I still use it even with the framerate drop. :)

Buffalox commented 9 years ago

Oh I forgot, I use K95 keyboard.

ccMSC commented 9 years ago

Not sure if they're related but that's definitely helpful info. Performance hasn't been my main concern so far, but I'll do some poking around and see if I can find out what's causing this.

Buffalox commented 9 years ago

I just made a couple of 64 bit tests with Phoronix test suite, there are apparently no issues with any of UE4 tests, but Scimark2 is about 3% slower.

It's pretty weird: htop puts ckb at 9% CPU usage, top at 4%, and KDE System Activity at just 1%, all for the same instance. I tried decreasing ckb's priority and running Cinebench, which still resulted in a 10% performance drop. I then additionally raised Cinebench's priority and still had the 10% drop.

It looks like something in CKB can optionally be interrupted by the OS, but for some reason it doesn't work equally for different programs.

ccMSC commented 9 years ago

I think the problem might be the automatic settings save, which seems to cause disk activity even if the settings haven't changed. It should be possible to fix it by keeping the data in memory - stay tuned.

Buffalox commented 9 years ago

Sorry for not getting back sooner, I just had my system SSD fail on me, and when I was rebuilding my system my water cooling had a leak. Talk about a bad couple of days.

Now I'm on a completely fresh Arch system and have tested CKB beta 0.1.4. The game Path of Exile now has zero framerate drop and stays at 200 fps at my "test spot". Cinebench still has a drop, but at 7% now instead of 10%.

Before, I had removed all animations and trimmed my profile to only the three settings I can choose among with the M1-M2-M3 buttons. It didn't seem to make any difference. The reinstall added breathing and trippy, but they don't seem to have any negative impact. Changing fps in CKB didn't seem to do anything either. I haven't got my monitoring up completely just yet, but htop still reports CKB at 10% CPU use, while KDE only shows 1-2%. I think the difference is because htop shows one thread relative to one core, while KDE System Activity collects the threads and shows them relative to the total of all cores. I have 6 cores, so 10% of one core is within 1-2% of all cores.
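The arithmetic behind that last sentence can be checked with a small sketch. It assumes the poster's reading of the two tools is right (one figure relative to a single core, the other relative to all cores combined); the function name is made up for illustration.

```python
def per_core_to_system_percent(per_core_percent: float, core_count: int) -> float:
    """Convert a usage figure relative to one core into a share of all cores."""
    if core_count <= 0:
        raise ValueError("core_count must be positive")
    return per_core_percent / core_count


# Figures from the post: 10% of one core on a 6-core machine.
share = per_core_to_system_percent(10.0, 6)
print(f"{share:.2f}% of total CPU capacity")
```

The result falls between 1% and 2%, which matches the range the poster saw in KDE.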

Thanks for the great improvement, it's absolutely awesome. :)

ccMSC commented 9 years ago

I've got a potential fix in the testing repo. It caches the most common settings in memory so that ckb won't access the disk unless it needs to. It also syncs settings in the background so that the GUI stays responsive. Not 100% certain that this is the cause of @zchristopoulos's latency problem, but it's my best guess at the moment.
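A minimal sketch of the idea described above, not ckb's actual code (ckb is written in C++/Qt). The class name, the JSON file format, and the write counter are all assumptions made for illustration. Values are kept in memory and the file is only rewritten when something has actually changed. The background syncing mentioned in the comment is left out to keep the example short.

```python
import json
import os
import tempfile


class CachedSettings:
    """Hypothetical settings store that touches the disk only when needed."""

    def __init__(self, path):
        self._path = path
        self._values = {}
        self._dirty = False
        self.writes = 0  # counts real disk writes, for demonstration only
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as f:
                self._values = json.load(f)

    def get(self, key, default=None):
        # Reads are served from memory, never from disk.
        return self._values.get(key, default)

    def set(self, key, value):
        # Storing an unchanged value does not mark the cache dirty.
        if key in self._values and self._values[key] == value:
            return
        self._values[key] = value
        self._dirty = True

    def sync(self):
        """Write to disk only if something changed. Returns True if a write happened."""
        if not self._dirty:
            return False
        with open(self._path, "w", encoding="utf-8") as f:
            json.dump(self._values, f)
        self._dirty = False
        self.writes += 1
        return True


with tempfile.TemporaryDirectory() as tmp:
    settings = CachedSettings(os.path.join(tmp, "settings.json"))
    settings.set("fps", 30)
    settings.sync()          # first sync writes the file
    settings.set("fps", 30)  # same value again, cache stays clean
    settings.sync()          # no disk access this time
    print("disk writes:", settings.writes)
```

With this pattern a periodic autosave timer can keep firing as before, but it becomes a no-op whenever the settings are unchanged.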

JackFarrand commented 9 years ago

Getting 29% CPU usage reported by ps -aux for ckb on my machine. That's a bit much! Any idea what's causing it? Can the keyboard handle its own animations in hardware? Any time I quit ckb the animations stop, so I assume that's what's chewing my CPU.

ccMSC commented 9 years ago

The animations are all software-based; the keyboard hardware can't do its own animations.

Re. CPU usage, see: https://github.com/ccMSC/ckb/issues/91

JackFarrand commented 9 years ago

Okay, cool. Have you profiled it to see what uses the most CPU time? Might have a stab at that myself if you haven't.

ccMSC commented 9 years ago

Offhand, I'd guess it's either:

  1. Constant polling of /dev nodes to check whether a new device has connected. This currently runs on the same timer as the animations, i.e. 30FPS. It could be polled much less frequently, maybe 5-10 times a second.
  2. Always sending updated lighting to the daemon even when there are no animations active. The binding and performance setups are already optimized so that they don't send data unless they actually need to, but lighting hasn't received this optimization.
  3. It's possible that the GUI is being updated unnecessarily, even when you're not doing anything. As far as I know the GUI is completely passive as long as the app is in the background, so this would require some digging to find out what (if anything) is being called.

1 and 2 are relatively simple to fix. I haven't looked into it to see if those are the actual causes, but they need to be done at some point anyway so that would be a good place to start.
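Items 1 and 2 could look roughly like the sketch below. This is an illustration under stated assumptions, not ckb's code: the class, its counters, and the stand-ins for the /dev scan and the daemon write are hypothetical. The animation timer keeps running at 30 FPS, device polling happens only on every Nth frame, and lighting is sent only when it differs from what was last sent.

```python
class FrameLoop:
    """Hypothetical frame loop with throttled polling and change-only lighting updates."""

    def __init__(self, fps=30, polls_per_second=5):
        # Poll on every Nth frame instead of on every frame.
        self.frames_per_poll = max(1, fps // polls_per_second)
        self.frame = 0
        self.last_sent = None
        self.polls = 0  # stand-in counter for scans of the /dev nodes
        self.sends = 0  # stand-in counter for lighting writes to the daemon

    def tick(self, lighting):
        if self.frame % self.frames_per_poll == 0:
            self.polls += 1  # a real implementation would scan for devices here
        if lighting != self.last_sent:
            self.sends += 1  # a real implementation would write to the daemon here
            self.last_sent = list(lighting)
        self.frame += 1


# One second at 30 FPS with static lighting (no animation active).
loop = FrameLoop(fps=30, polls_per_second=5)
static_lighting = [(255, 0, 0)] * 4
for _ in range(30):
    loop.tick(static_lighting)
print("polls:", loop.polls, "sends:", loop.sends)
```

In this run the loop polls 5 times instead of 30 and sends lighting once instead of 30 times.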

skandalfo commented 9 years ago

I did some preliminary profiling on ckb running in my setup, where htop reports 24% of CPU usage.

IIRC AnimScript::readProcess() and KbAnim::blend() were at the top of the list for CPU time spent, with a good deal of that being QHash<QString, QRgb> ops.

I made some hasty tests, like moving alpha blending functions to use integer arithmetic, or trying to force inlining of the blend functions by templating the KbAnim::blend() function on the actual operation function. None of that made a difference, but I confirmed that just adding a "return" as the first line in KbAnim::blend() will reduce CPU usage to 14%, so it looks like it's a plausible optimization target.
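For readers unfamiliar with the first experiment, here is a generic sketch of integer alpha blending for one 8-bit channel. It is not the code from KbAnim::blend(); the function names and the rounding choice are assumptions. As noted above, this kind of change made no measurable difference in the poster's tests.

```python
def blend_channel_float(src: int, dst: int, alpha: int) -> int:
    """Reference version: blend one 8-bit channel using floating point."""
    a = alpha / 255.0
    return round(src * a + dst * (1.0 - a))


def blend_channel_int(src: int, dst: int, alpha: int) -> int:
    """Same blend using only integer arithmetic, rounding to nearest."""
    return (src * alpha + dst * (255 - alpha) + 127) // 255


# Blend a bright source over a dark destination at roughly 50% opacity.
print(blend_channel_int(200, 40, 128), blend_channel_float(200, 40, 128))
```

Fully transparent (alpha 0) returns the destination unchanged and fully opaque (alpha 255) returns the source.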

ccMSC commented 9 years ago

Iterators might have something to do with it too. IIRC the Qt documentation states that the STL-style iterators are fastest, whereas I usually used foreach loops or Java-style iterators.

JackFarrand commented 9 years ago

Hrm, have we given any thought to offloading some of these blend functions to the GPU? Shaders working on a 128x64 or similar texture could take the load off the CPU and even work faster. The code base would suffer in scope size though.

ccMSC commented 9 years ago

I really don't think that's worth it. The animations are software-based anyway so you couldn't offload the entire pipeline even if you wanted to. You'd probably waste more time transferring data in and out of the GPU than you'd actually save.

Given the very small number of data points being processed (~150 at most) it's likely that the problem has more to do with inefficient memory access as opposed to the actual CPU calculations.

JackFarrand commented 9 years ago

Makes sense.

OpenCoderX commented 8 years ago

I'm seeing between 8 and 13% cpu use with k95 rgb. It would be nice to see that down around 5% at most. 8 core AMD cpu, 16gb memory in my machine.

ccMSC commented 8 years ago

For anyone still following this, the latest update in the testing repo should provide some performance improvements. I'm seeing CPU usage down by ~25% on my system. May be more or less for everyone else depending on your hardware.

Aside from the extensive QMap operations (which are gone now), one of the biggest bottlenecks turned out to be AnimScript::readProcess(), which had a lot of QString conversions. Switching it out to raw QByteArrays helped a lot.

JackFarrand commented 8 years ago

@ccMSC brilliant news! Thanks bud!

skandalfo commented 8 years ago

@ccMSC Firstly, thanks for the fixes! Secondly, it looks like the update didn't make it to github? The last commit I see in https://github.com/ccMSC/ckb/commits/testing is from Jan 26th.

ccMSC commented 8 years ago

oops. I'd have sworn I pushed it last night, but it should be up for sure now.

JackFarrand commented 8 years ago

@ccMSC Success. Htop reporting 12.5% cpu usage on an i3 3240 prior to the patch and between 8% and 10.2% afterwards, awesome.

skandalfo commented 8 years ago

Will give it a shot as soon as I have some time.

I noticed you are still parsing key names to array indexes through ColorMap::colorForName in the AnimScript::readProcess loop, and you're using a binary search for that.

As the set of possible key names is known in advance, you might want to take a look at GNU gperf instead as an optimized way to convert names to indexes.
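The difference between the two lookup strategies can be sketched as follows. The key names are a made-up subset, and a Python dict is a general-purpose hash table rather than the perfect hash function that gperf generates, so this only illustrates the shape of the idea: a binary search costs O(log n) string comparisons per lookup, while a hash lookup costs roughly one hash plus one comparison.

```python
from bisect import bisect_left

# Hypothetical subset of key names, sorted so that binary search works.
KEY_NAMES = sorted(["a", "b", "enter", "esc", "space", "w"])


def index_by_binary_search(name: str) -> int:
    """Find the index of a key name with a binary search, or -1 if unknown."""
    i = bisect_left(KEY_NAMES, name)
    if i < len(KEY_NAMES) and KEY_NAMES[i] == name:
        return i
    return -1


# Table built once up front, since the set of names is known in advance.
INDEX_BY_NAME = {name: i for i, name in enumerate(KEY_NAMES)}


def index_by_hash(name: str) -> int:
    """Find the index of a key name with a hash lookup, or -1 if unknown."""
    return INDEX_BY_NAME.get(name, -1)


for key in ("esc", "space", "f13"):
    print(key, index_by_binary_search(key), index_by_hash(key))
```

Both functions return the same index for every name, so one could replace the other without changing callers.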

skandalfo commented 8 years ago

Tested; ckb @60Hz went from 23% to 17% on my machine after the changes. :-)

ccMSC commented 8 years ago

Interesting! I hadn't considered hash tables due to memory efficiency, but gperf looks like it could be viable. I might experiment with that later.

I probably won't do more optimizations for a while, as I intend to focus on profiles for right now. I'll merge the current changes into the main branch over the weekend. In the meantime, if anyone wants to do more profiling, feel free to let me know.