The Multiplexing is very slow

wez / wezterm

A GPU-accelerated cross-platform terminal emulator and multiplexer written by @wez and implemented in Rust

https://wezfurlong.org/wezterm/

Other

17.28k stars 781 forks source link

The Multiplexing is very slow #1872

Closed joexue closed 1 year ago

joexue commented 2 years ago

What Operating System(s) are you seeing this problem on?

Windows

WezTerm version

20220101-133340

Did you try the latest nightly build to see if the issue is better (or worse!) than your current version?

No, and I'll explain why below

Describe the bug

The functionality of TLS multiplexing is very good but a bit slow, if I directly use ssh connect to the host, every thing is fine but if I use the TLS multiplexing, it is very slow, I even can see a double line under the character I typed in. If I use delete the cursor even can go back beyond my prompt to the first column of the terminal. Since it it is connecting to company's server, I cannot capture what it is, sorry about that.

To Reproduce

configure a tls multiplexing server and connect to it.

Configuration

no specific

Expected Behavior

run at least as fast as ssh connection

Logs

No response

Anything else?

No response

wez commented 2 years ago

Can you describe in more detail what you're doing when you notice that it is slow? What is the ping/round-trip latency between the local and remote hosts? How large is the terminal (rows and columns)?

Have you tried adjusting the local_echo_threshold_ms setting? See bottom of https://wezfurlong.org/wezterm/config/lua/TlsDomainClient.html?highlight=local_echo_threshold_ms#tlsdomainclient for details

joexue commented 2 years ago

I notice it is slow because when I type, it response not instantly, as I describe in the original post I even can see the double line under the characters I type in in short time. for example below, if I type in "ls example" I will see the double under line from "l" to "e" in order for short time. and if I try to delete the "ls example" the cursor even can reach 'h".

host$ ls example

The network is a bit slow I know, ping round-trip time is 226ms. of course if I connect to a host with fast network, the symptoms not so significant. but if I use ssh directly connect to problematic server without multiplexing, the response is much faster. so I think multiplexing contributes more than network speed.

I use local_echo_threshold_ms = 10 and it doesn't help.

joexue commented 2 years ago

the terminal is not large, I use small terminal. 1920x1280 and font size is 14

joexue commented 2 years ago

try to use ssh domain instead of tls domain, the result is same.

joexue commented 2 years ago

to be more clear, the client is windows and the server is Linux, I cannot try to use local wezterm to connect mux server, since my server doesn't have GUI support.

joexue commented 2 years ago

I managed to record a video by phone and now you can see the problem

IMG_2216_MOV_AdobeCreativeCloudExpress

wez commented 2 years ago

Try setting local_echo_threshold_ms = 300 to discourage the local predictive echo?

joexue commented 2 years ago

Try setting local_echo_threshold_ms = 300 to discourage the local predictive echo?

looks not helping

joexue commented 2 years ago

updated the tile to remove TLS since I found it is not TLS specific, the problem happens whatever the multiplexing is by TLS or UNIX domain

digitallyserviced commented 2 years ago

I actually have had a similar issue. When using the mux unix domain, and navigating while holding down arrows or hjkl in neovim, the repeat actually gets stuck and I cannot even stop it without spamming opposite keys which then have their own delayed repeat until the cursor finally stops.

I have since stopped using the unix domain mux just because there is a huge delay sometitmes, while a direct wezterm-gui feels immediate and very responsive in comparison.

I can say that this seems to happen after a little while into the server running after it has used up some memory. Killing and restarting the process earns me some more responsiveness for a little while but in the end becomes enough of an issue to have left the feature altogether.

I will work on getting some data,logs,rec of the issue but I had been keeping an eye on this issue because of it mentioning slowness, and now that OP mentions it happening over even unix domain for them too I figured I should chime in as me too.

github-actions[bot] commented 2 years ago

This issue has been automatically closed because there has been no response to our request for more information from the original author. With only the information that is currently in the issue, we don't have enough information to take action. Please reach out if you have or find the answers we need so that we can investigate further.

digitallyserviced commented 2 years ago

Is this something related to the #1814 #1841 that may have been resolved with the recent b908e2d commit?

joexue commented 1 year ago

Used sometime to dig it, the slow looks from proxy more than network transmission somehow.

joexue commented 1 year ago

22:55:13.618 TRACE wezterm_term::te 22:55:13.618 TRACE async_io::driver 22:55:13.618 TRACE async_io::reactor 22:55:13.618 TRACE polling 22:55:13.618 TRACE polling::epoll 22:55:13.618 TRACE wezterm_mux_serv 22:55:13.618 TRACE polling::epoll 22:55:13.618 TRACE async_std::task::builder 22:55:13.618 TRACE async_io::driver 22:55:13.618 TRACE async_io::driver 22:55:13.618 DEBUG codec 22:55:13.618 TRACE polling::epoll 22:55:13.618 DEBUG wezterm_term::te 22:55:13.618 TRACE wezterm_term::te 22:55:13.618 TRACE polling::epoll 22:55:13.618 TRACE async_io::driver 22:55:13.619 DEBUG codec 22:55:13.619 TRACE polling::epoll _22:55:13.869 TRACE polling::epoll 22:55:13.869 TRACE polling::epoll 22:55:13.869 TRACE async_io::reactor 22:55:13.869 TRACE async_io::driver 22:55:13.869 TRACE async_io::driver 22:55:13.869 TRACE polling::epoll 22:55:13.869 TRACE async_io::driver 22:55:13.869 TRACE async_io::reactor 22:55:13.869 TRACE polling 22:55:13.869 TRACE polling::epoll 22:55:13.869 TRACE polling::epoll 22:55:13.869 TRACE polling::epoll 22:55:13.869 TRACE polling::epoll 22:55:13.869 TRACE async_io::reactor 22:55:13.869 TRACE async_io::driver 22:55:13.869 TRACE async_io::reactor 22:55:13.869 TRACE polling::epoll 22:55:13.870 TRACE async_io::driver 22:55:13.870 TRACE wezterm_term::te rminalstate::keyboard > key_down: sending "x", Char('x') NONE > main_loop: waiting on I/O > processtimers: 0 ready wakers > Poller::wait(, None) > wait: epoll_fd=5, timeout=None er_impl::sessionhandler > 384 processing time 155.85µs > modify: epoll_fd=5, fd=7, ev=Event { key: 18446744073709551615, readable: true, writable: false } > block_on > block_on() > block_on: sleep until notification > serialized+compress len 332 vs 2175 > modify: epoll_fd=5, fd=18, ev=Event { key: 1, readable: true, writable: false } rminalstate::performer > perform Print('x') rminalstate::performer > print x=24 y=6 print_width=1 width=80 cell=x CellAttributes { attributes: 0, intensity: Normal, underline: None, blink: None, italic: false, reverse: false, strikethrough: false, invisible: false, wrapped: false, overline: false, semantic_type: Output, foreground: Default, background: Default, fat: None } > modify: epoll_fd=5, fd=18, ev=Event { key: 1, readable: true, writable: false } > blockon: completed > serialized+compress len 315 vs 2181 > modify: epoll_fd=5, fd=18, ev=Event { key: 1, readable: true, writable: false_ } > new events: epollfd=5, res=1 > modify: epoll_fd=5, fd=6, ev=Event { key: 18446744073709551615, readable: true, writable: false } > react: 1 ready wakers > main_loop: sleeping for 50 us > main_loop: notified > modify: epoll_fd=5, fd=18, ev=Event { key: 1, readable: true, writable: false } > main_loop: waiting on I/O > processtimers: 0 ready wakers > Poller::wait(, None) > wait: epoll_fd=5, timeout=None > modify: epoll_fd=5, fd=7, ev=Event { key: 18446744073709551615, readable: true, writable: false } > new events: epoll_fd=5, res=1 > modify: epoll_fd=5, fd=6, ev=Event { key: 18446744073709551615, readable: true, writable: false } > react: 1 ready wakers > main_loop: sleeping for 50 us > readable: fd=18 > modify: epoll_fd=5, fd=18, ev=Event { key: 1, readable: true, writable: false } > main_loop: waiting on I/O rminalstate::keyboard > key_down: sending "i", Char('i') NONE

From the above log, you can see between the typing "x" and "i", there is a big pause happen from 22:55:13.619 to 22:55:13.869 This happened when I just type "exit" in terminal

the fd list

lrwx------ 1 joe joe 64 Dec 7 23:09 0 -> /dev/pts/2 l-wx------ 1 joe joe 64 Dec 7 23:09 1 -> /tmp/log2.txt lrwx------ 1 joe joe 64 Dec 7 23:09 10 -> 'socket:[419121606]' lrwx------ 1 joe joe 64 Dec 7 23:09 11 -> 'anon_inode:[eventfd]' lrwx------ 1 joe joe 64 Dec 7 23:09 13 -> /dev/ptmx lrwx------ 1 joe joe 64 Dec 7 23:09 14 -> 'socket:[419116866]' lrwx------ 1 joe joe 64 Dec 7 23:09 15 -> /dev/ptmx lrwx------ 1 joe joe 64 Dec 7 23:09 16 -> /dev/ptmx lrwx------ 1 joe joe 64 Dec 7 23:09 17 -> 'socket:[419116867]' l-wx------ 1 joe joe 64 Dec 7 23:09 2 -> /tmp/log2.txt lr-x------ 1 joe joe 64 Dec 7 23:09 3 -> /etc/passwd l-wx------ 1 joe joe 64 Dec 7 23:09 4 -> /run/user/1000/wezterm/wezterm-mux-server-log-21307.txt lrwx------ 1 joe joe 64 Dec 7 23:09 5 -> 'anon_inode:[eventpoll]' lrwx------ 1 joe joe 64 Dec 7 23:09 6 -> 'anon_inode:[eventfd]' lrwx------ 1 joe joe 64 Dec 7 23:09 7 -> 'anon_inode:[timerfd]' lr-x------ 1 joe joe 64 Dec 7 23:09 8 -> anon_inode:inotify lrwx------ 1 joe joe 64 Dec 7 23:09 9 -> 'anon_inode:[eventpoll]'

joexue commented 1 year ago

Hi @wez, I can provide more specific log if you need. The functionality is really good and useful, I hate I cannot use RUST to program. But I can make some change and build to test, if you have any idea.

cunha commented 1 year ago

I have just met what looks like the same problem when using a unix socket to connect to a remote host. The slowdown I am experiencing seems related to redrawing operations. Some observations:

If I keep the terminal at 80x25, then it performs reasonably well

If I maximize the screen (318x75) then

vim is extremely slow (0.5s delay when pressing j)

nano still performs reasonably well

One difference between the two editors is that vim updates the cursor position at the bottom right of the terminal, while nano does nothing other than modify the cursor position.

This is with local_echo_threshold_ms = 50000, and the latency between me and the server is around 40ms.

Edit: Just tested with an SSH domain and observed the same behavior. Without multiplexing (i.e., just SSH'ing into the server) does not lead to any slowdown even for maximized terminals.

sirfz commented 1 year ago

I experienced the same with both direct ssh and multiplexing. Using ssh from wsl works normally as already reported.

lrvdijk commented 1 year ago

I can also confirm this, and agreeing with cunha that vim is especially slow when multiplexing (using wezterm connect to a GCP VM). The lag is still noticeable with wezterm ssh but not as bad.

Smaller windows help (e.g. using several horizontal splits).

If there's anything I can do to help profile the issue, let me know.

cd-a commented 1 year ago

Since https://github.com/wez/wezterm/commit/d36ad7ca7f9054a9d2b49ffe8696c3e617623194 was merged I wanted to check again on latest master.

I can't connect to the mux server anymore though. @wez did you change something there since last release that I'm not aware of?

The standard config I used before does not connect to a local unix socket anymore, but works on latest release:

config.unix_domains = { { name = 'unix', }, } config.default_gui_startup_args = { 'connect', 'unix' }

wez commented 1 year ago

@cd-a share the error message(s) that you see?

cd-a commented 1 year ago

@wez I'd love to, but there are none :)

When I run this config on the release, and there is no socket open yet, on first terminal it briefly shows no socket yet, and creates it, then every terminal afterwards connects to it. So when I close, reopen, it loads the layout again.

On latest master, nothing. No message about missing socket, no attaching to socket. Just behaves as if there is no mux set up.

I tried killing the existing socket from the release version, that maybe master would do it differently, but no.

Any way I can log this verbosely?

wez commented 1 year ago

@cd-a please file an issue with complete repro steps and config

cd-a commented 1 year ago

@wez sure! Here it is: https://github.com/wez/wezterm/issues/3438

cd-a commented 1 year ago

So after wez helped me figure out how to properly start the mux server on the master build, here are my findings using:

4k, full screen wezterm

neovim latest

It's blazingly fast now, I can hardly notice a difference to non-mux performance now. Maybe a tiny fraction, but only noticeable if I compare them directly against each other... or maybe I'm even hallucinating that one. It just works great!

So I will now switch to the mux version, loving it. Great work wez!

joshuarubin commented 1 year ago

Can you specify any config and minimum wezterm build dates that are required to get fast mux on a 4k full screen? I'm running 20230326-111934-3666303c on my server and 20230330-214948-d1c2257b on my client and those may not be new enough.

wez commented 1 year ago

@joshuarubin https://github.com/wez/wezterm/commit/d36ad7ca7f9054a9d2b49ffe8696c3e617623194 or later

joshuarubin commented 1 year ago

@joshuarubin d36ad7c or later

thanks for the info. is that for both the client and server, or is it ok for one side to be older?

wez commented 1 year ago

While that change in theory only changed some logic on the server, there were some other multiplexer changes that bumped the CODEC_VERSION over the weekend, and the client will give up talking to a server with a mismatching CODEC VERSION. So you'll want to upgrade both the client and the server.

wez commented 1 year ago

This should be resolved now in main.

It typically takes about an hour before commits are available as nightly builds for all platforms. Linux builds are the fastest to build and are often available within about 20 minutes. Windows and macOS builds take a bit longer.

Please take a few moments to try out the fix and let me know how that works out. You can find the nightly downloads for your system in the wezterm installation docs.

If you prefer to use packages provided by your distribution or package manager of choice and don't want to replace that with a nightly download, keep in mind that you can download portable packages (eg: a .dmg file on macOS, a .zip file on Windows and an .AppImage file on Linux) that can be run without permanently installing or replacing an existing package, and can then simply be deleted once you no longer need them.

If you are eager and can build from source then you may be able to try this out more quickly.

cd-a commented 1 year ago

@joshuarubin for the config, please check https://github.com/wez/wezterm/issues/3438 where it's linked with the description.

joshuarubin commented 1 year ago

Thanks for your help. I think I've got it working.

New issue, I can make separately, but in case you are already aware, on main when I'm connected to an ssh domain and do win:perform_action(wezterm.action.SplitHorizontal({ domain = "DefaultDomain" }), pane) I get this:

22:13:19.082 WARN mux::domain > Directory "/home/<USERNAME>" is not readable and will not be used for the command we are spawning: No such file or directory (os error 2)

This is on MacOS and my home dir is in /Users, so I'm not sure why that's happening.

wez commented 1 year ago

@joshuarubin please file a separate issue so that this one can remain focused on the performance aspect

calops commented 1 year ago

I've tried the latest commit and the sluggish behavior hasn't disappeared for me. I don't know if it's faster than before, but it's still significantly slower than a normal SSH session. Scrolling in neovim, for example, is very slow.

cunha commented 1 year ago

My tests indicate UNIX domains are no longer impacted by the severe slowdown I observed before. Vim on gigantic terminal window now updates as fast as expected for a remote (SSH) session. 💯

I used

[zeus:~/git/wezterm main]% wezterm --version wezterm 20230405-073038-96f1585b

and the following UNIX domain config:

{ name = "zeus", socket_path = "Users/cunha/.local/share/wezterm/zeus.sock", proxy_command = { "ssh", "-T", "-A", "zeus", "/usr/bin/wezterm", "cli", "proxy" }, local_echo_threshold_ms = 50000, }

calops commented 1 year ago

I've tried with an identical unix_domains configuration and wezterm version 20230405-084502-37e8a504 and it behaves the exact same way as the normal ssh_domains solution: very sluggishly. Big vim windows are unusable.

I don't understand what I'm doing differently.

sirfz commented 1 year ago

I can confirm the experience is quite smooth with the nightly build. My setup:

OS: Windows 10 domain:

unix_domains = { { name = 'wsl-mux', serve_command = { 'wsl', 'wezterm-mux-server', '--daemonize' }, }, },

WSL: Ubuntu 20.04 (with nightly wezterm-mux-server)

I logon to WSL and just ssh and work on remote servers all day, working great!

lrvdijk commented 1 year ago

For me the performance has increased a lot too, so much that I can now use it as a daily driver! 🎉

Still feel it's a little bit slower than iTerm2 + ssh + Tmux, but it's good enough for me.

wez commented 1 year ago

@calops if you have vim plugins that cause every line to change when moving the cursor up or down (eg: displaying relative line numbers at the start of each line), then wezterm will still need to send a lot of data on each cursor movement. Perhaps there is something like that in your config that is triggering the worst case performance?

calops commented 1 year ago

@wez Yes, I do have something like this, but that's irrelevant for a simple use-case like "scrolling through a file" where lines are gonna get redrawn no matter what. And even simple stuff such as selecting text (which shouldn't redraw much) has a noticeable latency, which is made very obvious when doing it with the mouse.

The real problem here is that a plain ssh session still handles all this and performs perfectly fine compared to a wezterm mux session. It's faster by at least one or two orders of magnitude. By that I mean that the SSH session feels (almost) as fast as a local session.

So it seems obvious to me that wezterm adds a significant overhead to this. Is this a technical limitation for this solution, or is there hope that it could be improved somehow?

wez commented 1 year ago

Yes, there is more overhead because there obviously is more going on. I think we all hope for improvement here.

One of my hopes is that enough users are willing to lend support that I can afford to spend more time on issues like this one. https://wezfurlong.org/wezterm/sponsor.html has some more information about this.

supernomad commented 1 year ago

@wez I am still seeing this performance bottle neck as well with MacOS <> Linux and the latest nightly. I am more than happy to help dive into the code, but would need some assistance as to where to start. If you can give me some guidance as to where to start, I can take a whack at what is going on as I have a simple reproduction setup and can dive in over the next few weeks as I should have some spare time. Let me know if that makes sense to you!

wez commented 1 year ago

@supernomad thanks for the offer, but I think it will likely be more time consuming to bootstrap someone than for me to sit down and do the work. At a high level the issue is this:

The mux sync protocol is line-based. So if a line is mutated, it gets flagged as dirty and that line will then either be unilaterally pushed to the client, or listed in a dirty lines set for the client to pull when it needs to render it

Some terminal update operations are encoded in a few bytes of escape sequence, but can touch every line in the screen, amplifying the data that needs to be sent over the wire and driving up the latency

So the change granularity is O(changed-lines) and results in transmitting data that is O(changed-lines x terminal_width).

The end state I have in mind is:

At the mux layer, tap into the terminal output byte stream and journal it

Build up a client + cursor concept that enables that journal when a client subscribes to it. It needs to be zero cost if there are no subscribers, and the journal needs to be pruned as soon as all clients (there can be multiple!) have read past it

When subscribed, if the nature of the update being pushed by the server can be described by an incremental read from that journal, just send those bytes, otherwise send the line data.

The client would need to be able to apply both the delta line updates and the byte stream updates, which means building a local wezterm_term::Terminal and applying both sets of changes to a local copy of the terminal emulation

Rendering would need to be re-structured to run off of that model

That would bring the data to be processed down closer to O(output-bytes) in the best case.

These are pretty significant changes across several layers; it's not a small task.

cunha commented 1 year ago

I did some additional tests on a 136x566 window and indeed could see some slowdown with relativenumber set in Vim (although I'd say the terminal was still useable). On a more comfortable 77x318 window I could not notice any slowdown.

This is over the Internet when connected to a server that is 30ms away; one's mileage will likely vary depending on connection quality.

calops commented 1 year ago

Yes, there is more overhead because there obviously is more going on.

That part wasn't obvious to me, actually. What specifically is done in addition to what an SSH/tmux stack is doing?

wez commented 1 year ago

@calops you can read the code if you're really interested. It's too much to explain here for a casual comment; it will take time I don't have and it will have only cost and no benefit to me to do so to someone that isn't committed to spending some of their time also working on this, and I don't believe that you are that person.

github-actions[bot] commented 1 year ago

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

© Githubissues.

Githubissues is a development platform for aggregating issues.