nbdd0121 / wsld

WSL Daemon - Stable X11 connection and time synchronisation for WSL2
Apache License 2.0

CPU usage is high #4

Closed ati46 closed 3 years ago

ati46 commented 3 years ago

I am very happy that this solves the problem I was facing: the connection between WSL2 and X410 dropping after switching Wi-Fi or closing the laptop lid. However, I found that whenever there is content to transfer, such as when moving the mouse quickly or opening a new application window, everything becomes very slow. Using the top command, I can see that x11-over-vsock uses a lot of CPU at those moments. Could this be optimised?

ati46 commented 3 years ago

image

nbdd0121 commented 3 years ago

I looked into the issue and found a few possible causes:

I tried to address these issues in the tokio branch. Can you try https://github.com/nbdd0121/x11-over-vsock/actions/runs/443731309 to see if it improves CPU usage?

ati46 commented 3 years ago

@nbdd0121 Yes, I will try and give feedback. Thank you

ati46 commented 3 years ago

image

I tried the new version. Performance is better than with the previous version, but there are still some CPU spikes. In addition, the CPU goes straight to 100% when switching Wi-Fi. It is not clear whether this is caused by the new version or by my device; I will test this again and report back.

ati46 commented 3 years ago

Another issue: when using x11-over-vsock, starting X11 via startxfce4 is very slow.

nbdd0121 commented 3 years ago

When an X11 client talks directly to a TCP X11 server, sending takes just a single syscall on a TCP socket. With x11-over-vsock in between, at least 4 syscalls are involved (the X11 client needs one to send on the Unix domain socket; x11-over-vsock needs one to poll for events, one to receive from the Unix domain socket, and one to send on the Hyper-V socket). So CPU usage will be around 4 times higher.
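To make the per-message cost concrete, here is a minimal sketch of one forwarding step. It is not the x11-over-vsock code: it uses only the Rust standard library with blocking sockets, and a second `UnixStream` pair stands in for the Hyper-V socket. The names `forward_once`, `CallCount` and `demo` are made up for illustration. An async runtime such as tokio would add the poll/epoll wait on top of the read and write counted here.

```rust
use std::io::{Read, Write};
use std::os::unix::net::UnixStream;

/// Number of read/write calls the proxy made (hypothetical helper type).
#[derive(Debug, Default, PartialEq)]
struct CallCount {
    reads: usize,
    writes: usize,
}

/// Forward one chunk from `from` to `to`: one read, then writes until
/// everything that was read has been sent.
fn forward_once(
    from: &mut UnixStream,
    to: &mut UnixStream,
    buf: &mut [u8],
    calls: &mut CallCount,
) -> std::io::Result<usize> {
    let n = from.read(buf)?;
    calls.reads += 1;
    let mut sent = 0;
    while sent < n {
        sent += to.write(&buf[sent..n])?;
        calls.writes += 1;
    }
    Ok(n)
}

fn demo() -> std::io::Result<(Vec<u8>, CallCount)> {
    // client <-> proxy_in: the Unix domain socket the X11 client writes to.
    let (mut client, mut proxy_in) = UnixStream::pair()?;
    // proxy_out <-> server: stand-in for the Hyper-V socket (assumption).
    let (mut proxy_out, mut server) = UnixStream::pair()?;

    // The client's own send.
    client.write_all(b"x11 request")?;

    // The proxy's receive and send for the same message.
    let mut buf = [0u8; 4096];
    let mut calls = CallCount::default();
    forward_once(&mut proxy_in, &mut proxy_out, &mut buf, &mut calls)?;

    // The far side receives the forwarded bytes.
    let mut received = vec![0u8; 11];
    server.read_exact(&mut received)?;
    Ok((received, calls))
}

fn main() -> std::io::Result<()> {
    let (received, calls) = demo()?;
    println!("{:?} forwarded with {:?}", String::from_utf8_lossy(&received), calls);
    Ok(())
}
```

Every message that the direct TCP path sends with a single call pays for the client's write plus the proxy's read and write here, before any event polling is counted.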

I used x11perf to run a stress test, and the result is pretty much in line with the analysis above: with x11perf connected directly to the TCP server, CPU usage is 35%; with x11-over-vsock in between, it is 10% for x11perf and 70% for x11-over-vsock.

With two proxies in userspace, I would have to say it is difficult to drive down the overhead. A few possible further ways to optimise would be:

As for the speed issue, I guess the cause is the same: the proxies add latency. I couldn't reproduce the 100% CPU usage when switching Wi-Fi.

From your screenshot I can see you're running a window manager, which could make the situation worse: the X11 server needs to redirect some requests to the window manager, so instead of going through two proxies, some messages will need to go through 6 (WSL X client -> Windows X server, Windows X server -> WSL WM, WSL WM -> Windows X server, each hop crossing two proxies). Do you have a particular reason to run a window manager instead of running your X server in multiwindowed mode?

ati46 commented 3 years ago

Thank you. I understand that normal use will not be affected.

I was running a full desktop session when testing x11-over-vsock; I will test again using X410's app mode and report back. The current version is usable for daily work. Thank you very much for your efforts.

ati46 commented 3 years ago

In app mode, even when scrolling text up and down quickly in the IDE, x11-over-vsock uses at most 20% of the CPU.

nbdd0121 commented 3 years ago

I increased the buffer size from tokio's default of 2048 to 4096 bytes per connection per direction, which gives an additional performance gain (~50%). I'm sticking with 4096, as going further to 8192 provides only marginal additional performance at 2x the memory consumption.
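As a rough illustration of why the buffer size matters, the sketch below copies a fixed payload through buffers of different sizes and counts the read calls. It is standard-library Rust, not the actual tokio-based code, and `copy_counting` is a made-up helper. The source here is an in-memory slice that always fills the buffer; a real socket returns only what has arrived, so the saving in practice depends on the traffic.

```rust
use std::io::{self, Read, Write};

/// Copy everything from `src` to `dst` through a buffer of `buf_size` bytes.
/// Returns (bytes copied, number of read calls that returned data).
fn copy_counting<R: Read, W: Write>(
    src: &mut R,
    dst: &mut W,
    buf_size: usize,
) -> io::Result<(usize, usize)> {
    let mut buf = vec![0u8; buf_size];
    let mut total = 0;
    let mut reads = 0;
    loop {
        let n = src.read(&mut buf)?;
        if n == 0 {
            break; // end of input
        }
        reads += 1;
        dst.write_all(&buf[..n])?;
        total += n;
    }
    Ok((total, reads))
}

fn main() -> io::Result<()> {
    let payload = vec![0xABu8; 64 * 1024];
    for buf_size in [2048, 4096, 8192] {
        let mut src = &payload[..]; // &[u8] implements Read
        let mut dst = Vec::new();
        let (total, reads) = copy_counting(&mut src, &mut dst, buf_size)?;
        println!("buffer {:>4} B: {} bytes in {} reads", buf_size, total, reads);
    }
    Ok(())
}
```

For the 64 KiB payload this takes 32, 16 and 8 reads respectively. Each doubling halves the number of calls while doubling the memory held per connection per direction, which is the trade-off described above.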

Profiling shows that 90% of the time is spent in syscalls (send/recv/poll), so I don't think any further significant improvement is possible. You're welcome to try the latest code on the master branch and provide feedback.

ati46 commented 3 years ago

OK. I have recompiled the code from the master branch and will report back after using it.