cntools / cnping

Minimal Graphical Ping Tool

cnping makes Xorg to 100% single core CPU usage #68

Closed BondarenkoArtur closed 4 years ago

BondarenkoArtur commented 4 years ago

I updated cnping 5 days ago from the AUR, and the newer version makes Xorg eat 100% CPU time on one core...

$ uname -a
Linux msi 4.19.118-1-MANJARO #1 SMP Thu Apr 23 10:54:30 UTC 2020 x86_64 GNU/Linux

cnlohr commented 4 years ago

Do you know when the older version was from?

BondarenkoArtur commented 4 years ago

As far as I can see from the AUR changes, the previous version was from another repo: https://aur.archlinux.org/cgit/aur.git/log/?h=cnping-git

BondarenkoArtur commented 4 years ago

I found that the issue appears in commit 8edb903. With the previous commit, 3951f0e, I don't have the 100% CPU usage issue...

BondarenkoArtur commented 4 years ago

OK, I'm not sure, but it seems I have this issue because of the XDrawPoint(...) call in CNFGDriver.c, line 481 on the master branch. This line was first added in 8edb903. If I remove it, I no longer have the 100% usage issue. But maybe I broke something; I don't know yet...

dreua commented 4 years ago

> from another repo

That's the same repository, cnlohr just moved it here.

I also see 100% Xorg CPU usage at the default ping period of 0.02 s, but if I slow it down, e.g. cnping google.com 0.5, the load is lower. Is this the same as what you are experiencing?

BondarenkoArtur commented 4 years ago

For me, at 0.5 I still have some CPU load, and 0.02 isn't really stable. When I comment out the one line I mentioned before, everything is smooth... Short preview (sorry for the blur in the video; imgur changed the resolution)

cnlohr commented 4 years ago

Woaoaahhh that is painful. I guess my big question is if your version is an OpenGL, rasterized or a pure X version. I'm also mega curious what is up with your X. Is something like glxgears smooth?

What about xeyes if you move your mouse all over?

Also, can you run x11perf -srect100 and report the results?

BondarenkoArtur commented 4 years ago

I don't have xeyes, but glxgears is perfectly smooth at 60 fps with almost no load on Xorg. x11perf probably needs to be installed separately...

cnlohr commented 4 years ago

I don't know what package it would be in for manjaro, but it's certainly worth installing, since it looks like on all the public builds (unless you have a super weird one) the default is to use standard X11 drawing primitives.

BondarenkoArtur commented 4 years ago

Sorry for the delay. I've installed x11perf, and here are the results:

$ x11perf -srect100
x11perf - X11 performance program, version 1.2
The X.Org Foundation server version 12008000 on :0
from msi
Sat May 16 15:53:03 2020

Sync time adjustment is 0.0687 msecs.

    1800000 reps @   0.0032 msec (312000.0/sec): 100x100 stippled rectangle (8x8 stipple)
    1800000 reps @   0.0032 msec (312000.0/sec): 100x100 stippled rectangle (8x8 stipple)
    1800000 reps @   0.0032 msec (312000.0/sec): 100x100 stippled rectangle (8x8 stipple)
    1800000 reps @   0.0032 msec (311000.0/sec): 100x100 stippled rectangle (8x8 stipple)
    1800000 reps @   0.0032 msec (308000.0/sec): 100x100 stippled rectangle (8x8 stipple)
    9000000 trep @   0.0032 msec (311000.0/sec): 100x100 stippled rectangle (8x8 stipple)

cnlohr commented 4 years ago

Woah, ok, so your X11 is plenty fast. If you are up for properly debugging this, I would be appreciative. Could you download and build the demo app for rawdraw? I.e.:

git clone https://github.com/cntools/rawdraw

BondarenkoArtur commented 4 years ago

I would be glad to help fix this issue. :) I built it just by running make without arguments. The performance of rawdraw really depends on the CPU clock, probably because Xorg sits at 100% CPU usage (on a single core) while rawdraw itself uses only around 3%.

$ ./rawdraw
Key: 65293 -> 0
FPS: 10
FPS: 11
FPS: 8
FPS: 10
FPS: 11
FPS: 10
FPS: 10
FPS: 10
FPS: 10
FPS: 11
FPS: 10
FPS: 10

This is for a 1920x1060 window, but I get almost the same FPS with a smaller one. I also tried a higher CPU frequency; the FPS seems to be higher, but it never reaches 60... On a different machine I get 62-63 FPS with 70-75% Xorg CPU usage and 30-35% rawdraw usage. The CPU% values in both results are from htop.

BondarenkoArtur commented 4 years ago

As far as I can see, a significant performance drop appeared in the commit "Make rawdraw able to do full-screen, with xshape and in an opengl context." (cnlohr, 5/10/18, 6:47 AM, hash 756e2446).
The cause is probably something in CNFGXDriver.c.
I'm trying to understand what exactly caused the problem...

BondarenkoArtur commented 4 years ago

I also tried removing the XDrawPoint call from the void AGLF(CNFGTackSegment)( short x1, short y1, short x2, short y2 ) function and got a significant performance boost, up to 60 fps. I don't really see any difference in the rendered image without it. Why is this call needed?

(In previous commits I had around 30fps instead of 10)

cnlohr commented 4 years ago

Hmm... So, it sounds like your issue may be if it's compiled with OpenGL? Can you share the exact compile line you're getting? Also - I hadn't thought about the performance implications of including or not including the dot in that way. The dot is because on some implementations, lines don't include the end or beginning points.
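To illustrate why the extra dot exists, here is a minimal sketch of an endpoint-exclusive line, using a toy in-memory framebuffer rather than X or rawdraw. All names here (`fb`, `line_excluding_end`, `tack_segment`) are hypothetical and exist only for this example; the rasterizer is a plain Bresenham loop that stops before the final pixel, which is an assumption standing in for backends that behave this way (GDI's LineTo, for example, is documented as drawing up to, but not including, the end point).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define W 8
#define H 8

/* Toy framebuffer for the example: 1 means the pixel is set. */
static unsigned char fb[H][W];

static void clear_fb(void) { memset(fb, 0, sizeof fb); }

static void put_pixel(int x, int y)
{
    assert(x >= 0 && x < W && y >= 0 && y < H);
    fb[y][x] = 1;
}

/* Bresenham line that deliberately stops BEFORE (x2, y2), modelling a
 * backend whose lines are endpoint-exclusive (an assumption of this sketch). */
static void line_excluding_end(int x1, int y1, int x2, int y2)
{
    int dx = abs(x2 - x1), sx = x1 < x2 ? 1 : -1;
    int dy = -abs(y2 - y1), sy = y1 < y2 ? 1 : -1;
    int err = dx + dy;

    while (x1 != x2 || y1 != y2) {
        put_pixel(x1, y1);
        int e2 = 2 * err;
        if (e2 >= dy) { err += dy; x1 += sx; }
        if (e2 <= dx) { err += dx; y1 += sy; }
    }
}

/* Same shape as CNFGTackSegment in this thread: the line plus the end dot. */
static void tack_segment(int x1, int y1, int x2, int y2)
{
    line_excluding_end(x1, y1, x2, y2);
    put_pixel(x2, y2);
}

/* Number of pixels currently set in the framebuffer. */
static int count_pixels(void)
{
    int n = 0;
    for (int y = 0; y < H; y++)
        for (int x = 0; x < W; x++)
            n += fb[y][x];
    return n;
}
```

With this model, `line_excluding_end(0, 0, 3, 0)` sets three pixels and leaves (3, 0) empty, while `tack_segment(0, 0, 3, 0)` sets all four. A zero-length segment draws nothing at all without the dot, which is why single-pixel glyph details depend on it.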

BondarenkoArtur commented 4 years ago

$ git clone https://github.com/cntools/cnping.git --recurse-submodules

Cloning into 'cnping'...
remote: Enumerating objects: 18, done.
remote: Counting objects: 100% (18/18), done.
remote: Compressing objects: 100% (14/14), done.
remote: Total 601 (delta 6), reused 10 (delta 3), pack-reused 583
Receiving objects: 100% (601/601), 443.57 KiB | 267.00 KiB/s, done.
Resolving deltas: 100% (367/367), done.
Submodule 'rawdraw' (https://github.com/cntools/rawdraw) registered for path 'rawdraw'
Cloning into '/tmp/cntools/cnping/rawdraw'...
remote: Enumerating objects: 218, done.        
remote: Counting objects: 100% (218/218), done.        
remote: Compressing objects: 100% (158/158), done.        
remote: Total 459 (delta 130), reused 139 (delta 60), pack-reused 241        
Receiving objects: 100% (459/459), 197.78 KiB | 969.00 KiB/s, done.
Resolving deltas: 100% (272/272), done.
Submodule path 'rawdraw': checked out 'e4b237e207cc847f65caa4268e61d26ac8fa2b85'

$ cd cnping

$ make

cc -s -Os -I/opt/X11/include -Wall -o cnping cnping.c ping.c httping.c -lX11 -lm -lpthread -s -L/opt/X11/lib/
i686-w64-mingw32-windres resources.rc -o resources.o  -DWIN_USE_NO_ADMIN_PING
make: i686-w64-mingw32-windres: No such file or directory
make: *** [Makefile:30: resources.o] Error 127

I just noticed that I'm receiving an error, but cnping compiles and works.

dreua commented 4 years ago

Just go with make cnping to silence that error. The problem is that make all tries to build the Windows binary and can't find the compiler for it.

So just to make sure: it is working but still with bad performance, correct?

OT: @cnlohr maybe this make all issue is bigger than we thought...?

BondarenkoArtur commented 4 years ago

Correct, with make cnping it builds without problems. And yes, the performance is still bad. Let me know if I can help with this somehow.

cnlohr commented 4 years ago

@dreua Do you think we shouldn't build the .exe with all? I'm OK with that.

@BondarenkoArtur can you post your compositor configuration and hardware specs? I am very curious what xserver you are running. Also, please execute the following command and let me know the output.

xvinfo

Specifically I'm curious: (1) Are you using metacity/compiz/marco/compton/etc? (2) Are you running on an AMD or Integrated Intel card? I currently only have access to Intel and NVIDIA.

BondarenkoArtur commented 4 years ago

$ xvinfo
X-Video Extension version 2.2
screen #0
  Adaptor #0: "GLAMOR Textured Video"
    number of ports: 16
    port base: 153
    operations supported: PutImage 
    supported visuals:
      depth 24, visualID 0x21
    number of attributes: 5
      "XV_BRIGHTNESS" (range -1000 to 1000)
              client settable attribute
              client gettable attribute (current value is 0)
      "XV_CONTRAST" (range -1000 to 1000)
              client settable attribute
              client gettable attribute (current value is 0)
      "XV_SATURATION" (range -1000 to 1000)
              client settable attribute
              client gettable attribute (current value is 0)
      "XV_HUE" (range -1000 to 1000)
              client settable attribute
              client gettable attribute (current value is 0)
      "XV_COLORSPACE" (range 0 to 1)
              client settable attribute
              client gettable attribute (current value is 0)
    maximum XvImage size: 8192 x 8192
    Number of image formats: 2
      id: 0x32315659 (YV12)
        guid: 59563132-0000-0010-8000-00aa00389b71
        bits per pixel: 12
        number of planes: 3
        type: YUV (planar)
      id: 0x30323449 (I420)
        guid: 49343230-0000-0010-8000-00aa00389b71
        bits per pixel: 12
        number of planes: 3
        type: YUV (planar)

I have an i7-7700HQ, with a GTX 1050 as the second card in my laptop. I'm not using compton or anything like it. With compton, nothing really changed, but maybe I need some specific configuration. ¯\_(ツ)_/¯

BondarenkoArtur commented 4 years ago

Here is also some additional info from inxi

System:
  Host: msi Kernel: 4.19.118-1-MANJARO x86_64 bits: 64 compiler: gcc 
  v: 9.3.0 
  Desktop: i3 4.18.1 info: polybar dm: LightDM 1.30.0 Distro: Manjaro Linux 
Machine:
  Type: Laptop System: Micro-Star product: GP62 7RD v: REV:1.0 
Graphics:
  Device-1: Intel HD Graphics 630 vendor: Micro-Star MSI driver: i915 
  v: kernel bus ID: 00:02.0 chip ID: 8086:591b 
  Device-2: NVIDIA GP107M [GeForce GTX 1050 Mobile] vendor: Micro-Star MSI 
  driver: nvidia v: 440.82 bus ID: 01:00.0 chip ID: 10de:1c8d 
  Display: x11 server: X.Org 1.20.8 driver: modesetting,nvidia 
  alternate: fbdev,intel,nouveau,nv,vesa tty: N/A 
  OpenGL: renderer: Mesa Intel HD Graphics 630 (KBL GT2) v: 4.6 Mesa 20.0.6 
  direct render: Yes

cnlohr commented 4 years ago

Interesting, you're using GLAMOR. I had never heard of it, but after some cursory Google searches, it totally makes sense that it would be slow as molasses in this application, because EVERY SINGLE element would become its own OpenGL draw call due to the intermixing of dots and lines. This is a deeper problem... I'm not exactly sure how to handle it here. Maybe we could detect Windows and do the dots and dashes there, but on Linux we could be more intelligent about it. Just verifying, if you switch to removing the put pixel in the draw text function, everything goes fast, right?

BondarenkoArtur commented 4 years ago

> Just verifying, if you switch to removing the put pixel in the draw text function, everything goes fast, right?

I'm not sure whether this is the draw text function:

void AGLF(CNFGTackSegment)( short x1, short y1, short x2, short y2 )
{
    XDrawLine( CNFGDisplay, CNFGPixmap, CNFGGC, x1, y1, x2, y2 );
    XDrawPoint( CNFGDisplay, CNFGPixmap, CNFGGC, x2, y2 );
}

When I remove the XDrawPoint line from this part of the code, everything goes fast.

dreua commented 4 years ago

I have just some ideas about how to fix this but unfortunately not the time to try them myself at the moment:

  1. If this is a problem only in cases where OpenGL is available (is that actually the case?) why don't we use OpenGL directly? I think it should be just a minor change in the Makefile, no?
  2. Speaking of OpenGL, I once fixed the same root issue of the missing pixel at line start / end in OpenGL by drawing a line loop instead of a line. Would this also be doable with Xorg's drawing system?
  3. I just found this on google: https://tronche.com/gui/x/xlib/appendix/c/graphics-batching.html Maybe graphics batching or "poly graphics primitives" (need to figure out what exactly that is tbh) could be a way to improve performance "by five times or more".
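As a rough illustration of the third idea: Xlib provides XDrawSegments and XDrawPoints, which take arrays of primitives, so a frame's segments and end dots could be queued and sent in two calls instead of two calls per segment. The sketch below keeps X11 out of the picture so it builds without a display. `Batch`, `batch_tack_segment`, `batch_flush`, and the counting sinks are hypothetical names invented for this example, and the `Segment`/`Point` structs only mirror the field layout of Xlib's XSegment and XPoint. It is a sketch of the buffering pattern, not rawdraw's implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Field layout mirrors Xlib's XSegment and XPoint; redefined here so the
 * sketch builds without X11 headers. */
typedef struct { short x1, y1, x2, y2; } Segment;
typedef struct { short x, y; } Point;

#define BATCH_MAX 256

typedef struct {
    Segment segs[BATCH_MAX];
    Point   pts[BATCH_MAX];
    size_t  count;  /* segments (and matching end dots) queued so far */
    /* Hypothetical sinks: with Xlib these would wrap XDrawSegments and
     * XDrawPoints respectively. */
    void (*draw_segments)(const Segment *s, size_t n);
    void (*draw_points)(const Point *p, size_t n);
} Batch;

/* Send everything queued so far: one call for lines, one for dots. */
static void batch_flush(Batch *b)
{
    if (b->count == 0)
        return;
    b->draw_segments(b->segs, b->count);
    b->draw_points(b->pts, b->count);
    b->count = 0;
}

/* Queue a line and its end dot instead of drawing them immediately. */
static void batch_tack_segment(Batch *b, short x1, short y1, short x2, short y2)
{
    if (b->count == BATCH_MAX)
        batch_flush(b);
    assert(b->count < BATCH_MAX);
    b->segs[b->count] = (Segment){ x1, y1, x2, y2 };
    b->pts[b->count]  = (Point){ x2, y2 };
    b->count++;
}

/* Counting sinks used only to demonstrate how many calls are issued. */
static size_t calls_made, prims_drawn;

static void count_segments(const Segment *s, size_t n)
{
    (void)s;
    calls_made++;
    prims_drawn += n;
}

static void count_points(const Point *p, size_t n)
{
    (void)p;
    calls_made++;
    prims_drawn += n;
}
```

Queuing 100 segments and flushing once issues 2 sink calls covering 200 primitives. One design caveat: this assumes everything in a batch shares the same drawing state (for instance the same colour), so a real version would need to flush whenever that state changes.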

cnlohr commented 4 years ago

@BondarenkoArtur that's almost certainly because of Glamor. With GDI, X, etc., draw calls can all be wildly different, but OpenGL requires lines and points to be rendered as separate draw calls. If Glamor is not smart about this, then it's going to be a train wreck, and it sounds like Glamor isn't smart about this. Can you confirm this is a draw call issue rather than a geometry issue? You can do that by changing the code to:

void AGLF(CNFGTackSegment)( short x1, short y1, short x2, short y2 )
{
    XDrawLine( CNFGDisplay, CNFGPixmap, CNFGGC, x1, y1, x2, y2 );
    XDrawLine( CNFGDisplay, CNFGPixmap, CNFGGC, x2, y2, x1, y1 );
}

@dreua We could use OpenGL directly. And yes, doing the loop would be suitable for some implementations, but there is still the issue of degenerate things, like the dot inside the number zero, or umlauts.
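The draw-call hypothesis above can be made concrete with a toy model. It assumes, purely for illustration, a backend that can merge adjacent primitives of the same kind into one draw call but must start a new call whenever the kind changes. This is not a description of Glamor's internals; `PrimKind`, `count_draw_calls`, and `emit_segments` are hypothetical names for this sketch only.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { PRIM_LINE, PRIM_POINT } PrimKind;

/* Draw calls needed if the backend merges only adjacent primitives of the
 * same kind (the assumption this model rests on). */
static size_t count_draw_calls(const PrimKind *stream, size_t n)
{
    size_t calls = 0;
    for (size_t i = 0; i < n; i++)
        if (i == 0 || stream[i] != stream[i - 1])
            calls++;
    return calls;
}

/* Fill `out` with the primitives produced by `segments` calls to a
 * tack-segment function that issues a line followed by `second`.
 * Returns the number of primitives written. */
static size_t emit_segments(PrimKind *out, size_t cap, size_t segments,
                            PrimKind second)
{
    assert(2 * segments <= cap);
    for (size_t i = 0; i < segments; i++) {
        out[2 * i]     = PRIM_LINE;
        out[2 * i + 1] = second;
    }
    return 2 * segments;
}
```

Under this model, 100 segments emitted as line-then-point need 200 draw calls, while 100 segments emitted as line-then-reversed-line collapse into a single call. That would be consistent with the two-XDrawLine variant being fast while covering the same geometry.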

BondarenkoArtur commented 4 years ago

@cnlohr This one works great :+1: