gnome-terminator / terminator

multiple GNOME terminals in one window
https://gnome-terminator.org
GNU General Public License v2.0

Slow Resize #825

Closed 0xFFFC0000 closed 11 months ago

0xFFFC0000 commented 1 year ago

Have a terminal with more than 1,000,000 lines of output. Increase the font size with ctrl++. It is going to block. This should be at least async. Not blocking.

egmontkob commented 1 year ago

Hello,

I'm not a Terminator developer, but I'm a former VTE developer who implemented rewrapping the contents on resize. (VTE is the terminal emulator widget used by Terminator and quite a few others.)

Increasing the font size is only slow if the previous number of columns no longer fit on the screen and the terminal has to become narrower (in terms of character columns). So, effectively we're talking about a resize operation rather than a font size change. As you have probably already guessed, resizing is slow because it has to go through the entire scrollback history and rewrap the lines according to the new width. The larger the scrollback, the longer it takes. And yes, as you've noticed, this is done synchronously, that is, no input is processed and the display isn't updated either during this time.
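To make the "font size change is really a resize" point concrete, here is a minimal sketch of the arithmetic. It is not Terminator's or VTE's actual code: the function name, the pixel values, and the simplification of ignoring padding and the scrollbar are all assumptions made for illustration.

```python
def columns_after_font_change(columns: int, new_cell_width_px: int,
                              max_window_width_px: int) -> int:
    """Return the column count after a font size change (illustrative only).

    Assumption: the window tries to keep its column count by growing, and
    only loses columns when the screen cannot accommodate the wider window.
    Padding and scrollbar width are ignored.
    """
    wanted_width_px = columns * new_cell_width_px
    if wanted_width_px <= max_window_width_px:
        # The window can simply grow: same columns, nothing to rewrap.
        return columns
    # The window is confined to the screen: fewer columns, so the whole
    # scrollback has to be rewrapped to the new width.
    return max(1, max_window_width_px // new_cell_width_px)


def rewrap_required(columns: int, new_cell_width_px: int,
                    max_window_width_px: int) -> bool:
    """True if the font change alters the column count."""
    return columns_after_font_change(
        columns, new_cell_width_px, max_window_width_px) != columns
```

With these made-up numbers, an 80-column terminal on a 1920 px wide screen keeps its 80 columns when cells grow to 10 px, whereas a 200-column terminal with 12 px cells would need 2400 px and therefore shrinks to 160 columns, which triggers the rewrap.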

Rewrapping was implemented 10 years ago, and surprisingly, this is the first report I've encountered saying that it's slow with a giant scrollback. I expected complaints to arrive sooner. :)


Something that is less obvious:

Rewrapping is much faster for paragraphs (consecutive lines that wrap together) that consist purely of plain ASCII 32..126 characters; that is, no TABs, no non-English letters and other special symbols; and is even somewhat faster for giant portions of the data (hundreds of kilobytes) that are entirely of this nature. Moreover, having fewer or ideally no attribute changes at all (no colors, bold, etc.) can also speed things up. We're easily talking about a 5x overall speedup if all circumstances are ideal.

This is due to how the scrollback is stored internally. Without going into the details too much: when rewrapping, the text and attributes storage doesn't need to be modified; what needs to be updated is, for each visual row, an index pointing directly into the text and attributes where this row's data begins. If a paragraph consists of ASCII 32..126 characters only (we keep track of this property), then we take a special, faster code path: we immediately know which byte will belong to which new character cell, and we can perform the rewrapping (i.e. calculate the new indexes) without looking at the actual text. If there's anything fancier involved, i.e. either characters that are expressed as multi-byte UTF-8 sequences or characters that occupy multiple cells on the screen, then we have to take the generic, slower code path: walk through the text, character by character, to find the new line beginnings. The text is stored on disk by splitting it into 64 kB chunks, then compressing and encrypting each chunk separately. If an entire chunk consists of only ASCII 32..126 characters, then we even save the cost of reading it from disk, decrypting and decompressing.
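The difference between the two code paths can be sketched roughly as follows. This is a simplified illustration in Python, not VTE's implementation (which is C++ and works on its on-disk stream format). The function names, the `ascii_only` flag passed in by the caller, and the width heuristic based on `unicodedata` are all assumptions made for the example.

```python
import unicodedata


def cell_width(ch: str) -> int:
    """Approximate number of terminal cells a character occupies."""
    if unicodedata.combining(ch):
        return 0  # combining marks share the previous character's cell
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def row_starts_fast(byte_length: int, columns: int) -> list[int]:
    """Fast path: one byte is one cell, so the text is never inspected."""
    return list(range(0, max(byte_length, 1), columns))


def row_starts_generic(paragraph: str, columns: int) -> list[int]:
    """Generic path: walk every character, tracking UTF-8 bytes and cells."""
    starts = [0]
    byte_offset = 0
    used_cells = 0
    for ch in paragraph:
        width = cell_width(ch)
        if used_cells > 0 and used_cells + width > columns:
            starts.append(byte_offset)  # this character begins a new row
            used_cells = 0
        used_cells += width
        byte_offset += len(ch.encode("utf-8"))
    return starts


def rewrap(paragraph: str, columns: int, ascii_only: bool) -> list[int]:
    """Return the byte offset at which each visual row begins.

    `ascii_only` stands for the property that is tracked ahead of time,
    so that it does not have to be recomputed during a resize.
    """
    if ascii_only:
        return row_starts_fast(len(paragraph), columns)
    return row_starts_generic(paragraph, columns)
```

For a pure ASCII paragraph both paths give the same offsets, but the fast one only needs the paragraph's length. As soon as a multi-byte or double-width character appears, byte offsets and cell positions diverge, and only the character-by-character walk can find the row boundaries.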

I am currently pondering other special cases that could be handled by a special, faster code path. Let's say we could store the actual width of each line; then we'd know upfront if a given paragraph will fit in a single line after rewrapping, which could save the cost of reading and processing its text character by character. Or it would be great if we could somehow have a faster branch for paragraphs that contain single-width glyphs only.
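The stored-width idea could look something like the sketch below. It is purely hypothetical: the `Paragraph` record, its `width_cells` field, and the stand-in `slow_scan` function are invented for illustration and do not correspond to anything in VTE.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Paragraph:
    text: str
    width_cells: int  # hypothetical: display width recorded when written


def slow_scan(text: str, columns: int) -> list[int]:
    """Stand-in for the generic path; assumes one cell per character."""
    return list(range(0, max(len(text), 1), columns))


def rewrap_all(paragraphs: list[Paragraph], columns: int,
               scan: Callable[[str, int], list[int]] = slow_scan
               ) -> list[list[int]]:
    """Compute row starts per paragraph, skipping the scan when possible."""
    result = []
    for paragraph in paragraphs:
        if paragraph.width_cells <= columns:
            # Known to fit in a single row: no need to read the text at all.
            result.append([0])
        else:
            result.append(scan(paragraph.text, columns))
    return result
```

How much this would help depends entirely on the data: if most paragraphs already fit in one visual row after the resize, nearly all of the per-character work disappears; if most lines are long, the shortcut rarely applies.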


Could you help me please here? I'd like to understand the characteristics of your scrollback data, to know if you already benefit from some speedup of the special cases or not yet, and whether you'd benefit from the speedup of various rough ideas that I have currently.

What does your data look like? Is it typically in the ASCII 32..126 range, that is, plain English, and without the TAB character? Or do you frequently have TABs? Do you frequently have single-width non-ASCII characters (e.g. accented Latin characters, fancy quote signs, etc.)? How frequently do you estimate you have a character that's outside of the plain English ASCII 32..126 range? Do you frequently have double-width characters (such as CJK, emojis)? How frequently do you have attribute changes (color, bold, italic, underline etc.)?

How long are your paragraphs (logical lines)? Do they typically fit in a single visual line before the font increase? And after? Or are they wrapped into maybe 2-4 visual lines? Or even more? Or do you perhaps typically have extremely long paragraphs (let's say enormous auto-generated XML files with not a single newline within them, dumped to the terminal)?


You make the point that resizing should be asynchronous, not blocking.

In an ideal world, yes. But the thing is: the rewrapping code is already quite complicated, and making it asynchronous sounds like a nightmare programming task that would make it at least an order of magnitude more complicated. Let me give you just a few reasons.

Assuming that the terminal is scrolled to its default bottom position, the code would have to immediately rewrap that, and then, while processing new input and updating the user interface, it would have to asynchronously rewrap the scrollback, and finally stitch the two together correctly.

The terminal is not necessarily scrolled there, though, it could be scrolled to let's say somewhere in the middle of the scrollback buffer. It would immediately need to rewrap the last several lines (in order to correctly process new incoming data and perform the terminal emulation there according to the new size) and also immediately rewrap the lines around the current viewport.

If the user scrolls, either in small steps (e.g. Shift+PageUp/Dn) or by large steps (dragging the scrollbar) then this new viewport should also be immediately rewrapped in order to display it (the background rewrapping task may or may not have reached this point yet).

It would have to handle correctly if a subsequent resize event (or many of them) arrives while background rewrapping is in progress.

It would have to handle correctly if newly processed data causes the contents to scroll, potentially almost as quickly, or maybe at an even faster rate than asynchronous rewrapping can be performed.

With the latter two combined, we could quickly arrive at an intermediate situation where the scrollback is a mixture of not just two but many different terminal widths, which need to be tracked correctly and eventually wrapped to the same final width.

It would have to handle reasonably well that the scrollbar's position is not exact, it's only a rough estimate while background asynchronous rewrapping is being performed, since we can't know until the end of the entire operation how many lines there will be exactly and where exactly the currently displayed ones reside in that range.

All this should be combined with the fact that it's very hard to debug, very hard to track down any issue while developing. Due to the asynchronous nature itself, if a sequence of steps causes a crash (which is absolutely inevitable to happen many times during such a development process) then next time the same steps will likely complete just fine. It would often be very hard to reproduce and understand a crash that either the developer experiences or a user reports.

My very rough (and perhaps very wrong) estimate is that coding this up would take me perhaps the equivalent of 2-4 months of full-time work, which would be mentally exhausting and would drive me crazy many times. The result would be much more complicated code than the current one. And since that code has to live on and be maintained forever, I'm not sure if upstream would accept the change. The benefit it would bring is worth neither the effort nor the resulting complexity.

Ideally, yes, I fully agree with you, resizing should be asynchronous. In practice, it's just absolutely not worth it.


I'll keep thinking about low-hanging-fruit improvements that may (or may not, depending on your usage pattern) bring you noticeable speed improvements (while still remaining synchronous).

In the meantime, I also suggest you reevaluate your workflow and see if you really need such a big scrollback buffer. Such an amount of data cannot be processed by humans when it's first printed, cannot reasonably be examined by scrolling either, and the terminal's search facility probably isn't ideal either. Can't you live with a scrollback buffer of about 100,000 lines? Or maybe pipe your verbose apps' outputs to less? Or maybe even also automatically log each session in a file, using script, to be examined with more powerful tools if needed? Or how about making your apps significantly less verbose?

Yet another alternative: Even though VTE deprecated the non-rewrapping resize behavior, it hasn't been removed yet. However, Terminator removed the corresponding option in #303. You can patch your Terminator to bring it back and disable rewrapping (although in my opinion rewrapping is definitely worth the short wait).
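For readers considering such a patch, the relevant VTE call is `vte_terminal_set_rewrap_on_resize()`, exposed to Python as `Vte.Terminal.set_rewrap_on_resize()`. The sketch below only shows the general shape of forwarding an option to the widget. The function name and the idea of a boolean profile option are assumptions, not Terminator's actual code, and the option removed in #303 may have been wired differently.

```python
def apply_rewrap_setting(terminal, rewrap_on_resize: bool) -> bool:
    """Forward a (hypothetical) profile option to a VTE terminal widget.

    `terminal` is expected to be a Vte.Terminal instance. Because
    set_rewrap_on_resize() is deprecated and might be removed from VTE
    at some point, the call is skipped if the method is missing.
    Returns True if the setting was applied.
    """
    setter = getattr(terminal, "set_rewrap_on_resize", None)
    if setter is None:
        return False
    setter(rewrap_on_resize)
    return True
```

With rewrapping disabled, a resize no longer has to walk the scrollback, at the cost of lines staying broken where they were originally wrapped.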

Or maybe you can pick a single preferred font size and stick with that (maybe the bigger one straight from the beginning). Or use a desktop environment which does not confine the window into the display, rather allows it to stick out if the app requests a bigger size. Or, of course, just accept the occasional short wait.

Unfortunately, at this moment you can't have a giant scrollback, rewrapping on resize, and an instantaneous resizing experience all at the same time (my guess is that no terminal emulator out there offers this). You have to make a compromise somewhere, and this constraint is extremely unlikely to be eliminated any time in the foreseeable future.

0xFFFC0000 commented 1 year ago

I am familiar with how complicated software design works; I see what you are saying.

I believe there should be a way to stop the operation or at least warn the user. The problem is when you have a huge output and you hit ctrl++ (for any reason: by mistake, not knowing it is going to take a long time, or anything else), it should not block immediately. A warning before the start of the operation would be good, or some way of canceling the operation.

mattrose commented 1 year ago

This is why the default scrollback is 500 lines. With that number of lines, the behaviour is smooth and quick. If you set your scrollback buffer to something exponentially larger, then you can expect some sharp edges.

@egmontkob, good to see you!

0xFFFC0000 commented 1 year ago

And that is the hallmark of bad software design :)

egmontkob commented 1 year ago

@0xFFFC0000 I think I gave you a pretty decent overview why the feature you're looking for is extremely hard to implement.

In response, you didn't bother to answer my questions, and you didn't bother to help me understand what kinds of optimizations you would benefit from. I already have a change prepared which, in certain additional circumstances, replaces the generic slow code path with a special fast one, providing a ~5x speedup in those cases; however, it makes the already existing fast code path somewhat (about 20%) slower. It's probably still an overall win, but I have yet to measure the average long-term performance impact on my system.

You also didn't react at all to my ideas about reconsidering your workflow, and I still don't have a picture about how a scrollback buffer of more than 1M lines can be actually useful in your workflow.

Instead of answering my questions, you came up with some other ideas. I was planning to continue the brainstorming with you, and to elaborate on why I don't really like your ideas, but also on what other ideas occurred to me thanks to them.

Your reaction to Matt's comments made me change my mind. I'm not going to do this.

If you don't accept that processing a larger amount of data takes proportionally more time, and thus processing extreme amounts of data might take an annoying amount of time; and your only reaction to this is a thumbs down and calling it bad design, then it's not worth my free volunteer time to try to help you.

By the way, go to a YouTube video in your favorite browser (I've tested with Chrome and Firefox), load a few thousand (not millions, just thousands!) comments, then resize the window. Is it asynchronous, is it seemingly instantaneous? Hell no. It's blocking, just like in VTE. So is it a sign of bad software design in all these popular browsers??

Terminator and VTE are the product of only a few volunteer developers throughout the years. Like any software out there that's significantly more complicated than a "hello world", these are not perfect either. Compromises had to be made; compromises would even have to be made if these were highly staffed projects with highly paid full-time developers, and even if you had to pay for this software.

If you can't accept living with such compromises, then you are more than welcome to show us in detail what a good design would look like, and you are more than welcome to implement it as well. You are also more than welcome to help the change happen via other means. (For example, I can implement the asynchronous rewrapping; if you check my previous contributions to VTE, GNOME Terminal (and actually Terminator itself, too), you should have no doubt that I can do it to a high standard. Right now I am available for hire.)

@mattrose Hi there! :) I have a bit more time on my hands nowadays, so I'm taking a look at what's going on around VTE. But I won't be re-joining the project and volunteering for heavy development work like I used to.

0xFFFC0000 commented 1 year ago

@egmontkob Let me clarify a few points:

1) I believe @mattrose deserves an extremely negative reaction for closing an ongoing discussion, which is extremely rude and shows how amateur he is at managing professional software.

2) I didn't react directly to your ideas because I am not familiar with the internal process of VTK. As simple as that. End users want a better result. And as a responsible end user I have to report problems. With the attitude I see from you guys, it is better to go back and spend my time contributing to LLVM instead of helping you.

3) The slowdown does not strictly happen with 1M lines of log; the system I use, with 128 GB of RAM, is extremely powerful. An ordinary *nix user does not have a system with those resources. A professional team of software developers would try to find the bottleneck and optimize if possible, instead of jumping to close an issue.

4) It seems you are not familiar with modern paradigm in software design and HPC. Yes, processing bulk data is hard, but the art is to do it without a negative impact on the user's experience. You can, for example, look at the Servo engine, which aims to provide a highly parallel engine, something that Blink cannot do. (If you are familiar with Servo or Blink at all.)

5) With this attitude I see from the Terminator development team, I am positive I am not going to use it again, and as an ex-FAANG software engineer who is teaching in academia these days, never would allow any of my students to use it.

Good luck with your software.

egmontkob commented 1 year ago

@0xFFFC0000

I didn't react directly to your ideas because I am not familiar with the internal process of VTK.

I asked a few questions that can be answered without having to know anything about VTE (not VTK) and which would have helped me know whether the direction I started working on would likely make the situation better for you or not.

It seems you are not familiar with modern paradigm in software design

It seems you haven't grasped the concept yet that software doesn't grow on trees (at least not yet).

I can implement the feature you've requested. But it would take me a long time, it would be a hard, exhausting task. Even if I had plenty of time and enthusiasm to do tons of volunteer work on VTE (which I used to, but no longer do), spending it on this very issue would be time not well spent; spending it on other areas of VTE could have a much higher impact.

I don't think I need to apologize for not having the required time and motivation to do this (or rather: any) volunteer work. But I'm happy to consider taking on this challenge as a properly paid task.

never would allow any of my students to use it

If you wouldn't recommend it, I couldn't care less. But do you seriously believe you are in a position to outright forbid them from using it?

"I picked a good, but not perfect piece of software. Asked its developers to improve a shortcoming. The developers explained why it's an extraordinarily hard task for marginal benefits, and why they are unlikely to have time to voluntarily address it any time soon, if at all. Therefore you must not use this piece of software, if I catch you using it then you'll fail the course!" Something like this?

If so, I'm pretty sure I would hate to have a teacher like you.

But at least I'm happy to know that apparently at your FAANG company you quickly fixed all the issues filed against your product and never had to make any compromise, never had to reject a feature request or leave a component in a not fully ideal state.

Greetings from another ex-FAANG!

mattrose commented 1 year ago

I can leave this open for you if you'd like that, but I think it would be dishonest on my part. To be blunt, "blazingly fast" has never been the primary goal of terminator. It's a very good terminal emulator, with lots of features that can't be found in any other terminal emulator. I closed this issue for a number of reasons.

  1. This is not really a terminator issue, but a vte issue.
  2. I'm currently the only dedicated maintainer. That's not to say that there haven't been fixes made by a bunch of other people, but they open PRs at the same time as they open issues, and currently the performance of terminator is more than acceptable to me. My time is extremely limited, and I get to decide how I spend my time, not you.
  3. Right now, even if I were interested in performance improvements, there are honestly a bunch of other issues that would take priority.

Given all this, I thought it would be fairer and clearer to close the issue, rather than leave an issue that would not get fixed in the foreseeable future.

I've been a professional software developer and systems administrator for nearly 30 years. Now personally, I tend to think of "professional" as a bad term for passive-aggressive and impersonal, but also personally, I don't think it's very professional to insult a project and its maintainers because you didn't agree with a decision they took, but that's just me.

If you want a terminal emulator that prioritizes performance, then I can recommend kitty, or alacritty. Please, use either one of them.

I turned down an offer from a FAANG once, does that count?

egmontkob commented 1 year ago

If you want a terminal emulator that prioritizes performance, then I can recommend kitty, or alacritty. Please, use either one of them.

I don't want to brag about VTE, and even more so I don't want to bash other terminals, but...

I've been using VTE's rewrapping code for 10 years now, and I also keep an eye on many VTE-related forums (bugtrackers of quite a few VTE-based terminals, terminal-related questions on several StackExchange forums, etc.). Based on these I can say that I'm not aware of any functional bug in the rewrapping code (apart from it blocking for a short while in case of an extreme amount of data, and apart from what seems to be a rewrapping glitch even though it's just bash reprinting the last prompt incorrectly).

I've tried the two emulators that you recommended. Produced some output, then resized crazily for about 2-3 seconds. One of them worked correctly, the other one (I'm not saying which one, it doesn't matter) resulted in data corruption. Started from scratch again, again obvious data corruption in just a few seconds.

What results in a more effective workflow: A faster app that often produces broken results that the user has to deal with, or a slower app that the user can absolutely rely on?

One of those two terminals I would definitely not recommend.

And the irony: Just as I was typing this very comment here, my browser froze to death and I had to restart typing this comment, resulting in a loss of maybe 2-3 minutes.

But sure a few seconds of lag in VTE is such a big deal... (sigh)