wez / wezterm

A GPU-accelerated cross-platform terminal emulator and multiplexer written by @wez and implemented in Rust
https://wezfurlong.org/wezterm/

Allow very large scrollback #1342

Open junkblocker opened 2 years ago

junkblocker commented 2 years ago

Is your feature request related to a problem? Please describe.

I am surprised to find that wezterm holds the whole scrollback in memory and doesn't attempt to swap out to disk or even just use a circular buffer or something even in memory. This causes wezterm to run out of memory and get killed.

Describe the solution you'd like

Swap lines beyond scrollback_lines out to disk, while still keeping them available via pane:get_lines_as_text(...).

Describe alternatives you've considered

Reduce the scrollback_lines size, which is inadequate since that then requires changing your tools/habits.

Additional context

.

cowens commented 1 year ago

The issues identified in https://github.com/microsoft/terminal/issues/1410#issuecomment-504942872 seem to be:

  1. implementation limitations on the size of the buffer and the integers pointing to the current place in the buffer
  2. security of the file on disk

Looking at the code, it appears as if scrollback_size is already usize, which has a max of 2^64 − 1 (on a 64-bit target). That is much better than Windows Terminal's use of 16-bit integers, so item 1 isn't much of an issue.

Item 2 is more of an issue, but if the file is owned by the user and the file system itself is encrypted (which it should be these days), then this is only an issue if an attacker has access to a logged in account with permission to the file. That is a level of risk I am willing to take on a single user machine (if an attacker has permission to the file, they can probably get the key from the wezterm and decrypt it anyway).

junkblocker commented 1 year ago

Just want to say that between this and the OOM kill related bug #3907 I keep needing to (remember to) switch to Konsole for memory heavy operations and that mars the experience/habit.

(Good to run into you on the interwebs, @cowens!)

cowens commented 1 year ago

Looking a bit deeper, it looks like the practical limit to the scrollback size is actually constrained by isize, not usize, so the maximum appears to be 2^63 − 1, not 2^64 − 1:

    pub fn stable_row_to_phys(&self, stable: StableRowIndex) -> Option<PhysRowIndex> {
        let idx = stable - self.stable_row_index_offset as isize;
        if idx < 0 || idx >= self.lines.len() as isize {
            // Index is no longer valid
            None
        } else {
            Some(idx as PhysRowIndex)
        }
    }

Sadly, I am not finding anything that is a drop-in replacement for std::collections::VecDeque that is backed by a file. If I have enough time, I am going to play around with implementing all of the methods term/src/screen.rs is using for the scrollback with a hybrid in-memory/on-disk data structure.
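
To make the isize bound concrete, here is a minimal sketch of the same conversion outside wezterm. The `Scrollback` struct, the two type aliases, and the `index_limits` helper are assumptions made for illustration (they borrow names from the snippet above but are not the real types in term/src/screen.rs), and the numeric limits only hold on a 64-bit target.

```rust
use std::collections::VecDeque;

// Assumed aliases, named after the types in the quoted snippet.
type StableRowIndex = isize;
type PhysRowIndex = usize;

// Hypothetical stand-in for the scrollback portion of the screen.
struct Scrollback {
    lines: VecDeque<String>,
    // Number of rows already dropped from the front of `lines`.
    stable_row_index_offset: usize,
}

impl Scrollback {
    // Same logic as the quoted function: the comparison happens in isize,
    // so the usable row count is capped by isize::MAX rather than usize::MAX.
    fn stable_row_to_phys(&self, stable: StableRowIndex) -> Option<PhysRowIndex> {
        let idx = stable - self.stable_row_index_offset as isize;
        if idx < 0 || idx >= self.lines.len() as isize {
            None
        } else {
            Some(idx as PhysRowIndex)
        }
    }
}

// Hypothetical helper: returns (isize::MAX, usize::MAX) widened to u128
// so the two limits can be compared without overflow.
fn index_limits() -> (u128, u128) {
    (isize::MAX as u128, usize::MAX as u128)
}
```

With three lines stored and an offset of 10, stable rows 10 through 12 map to physical rows 0 through 2, and anything outside that range yields `None`.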

wez commented 1 year ago

If you're serious about contributing in this area, then here are some considerations:

1) When resizing the terminal, the scrollback is re-wrapped. If the scrollback is essentially unbounded in size this poses a real latency challenge during live resizing where several rewraps can occur within a fraction of a second. A disk-based storage solution should accommodate this somehow without requiring that the data on disk be physically rewritten. I've been thinking that deriving a mapping from the physical to the wrapped offsets would work. In other words, rather than physically rewriting the data, a rewrap would zip through the physical storage and derive the adjusted line dimensions to use as an index and adjusted line count. That index data could then be stored in an additional but smaller file.

2) To implement bounded but very large scrollback, I think there are at least two approaches:

I also think that compression (I suggest zstd) is a requirement. I consider encryption to be a requirement before making the disk storage a default configuration. I think both encryption and compression can be handled in a similar way as a "transparent" Read/Write impl that handles both of those concerns.
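
One way to picture the rewrap index from point 1 is the sketch below. It is only an illustration under stated assumptions: `WrapIndex`, `wrapped_rows`, and `locate` are hypothetical names, each logical line is represented only by its length in cells, and wide characters and cell attributes are ignored. The stored lines are never touched; a resize just rebuilds the (much smaller) index.

```rust
/// Rows a logical line of `len` cells occupies at `width` columns.
/// An empty line still occupies one row.
fn wrapped_rows(len: usize, width: usize) -> usize {
    assert!(width > 0, "terminal width must be non-zero");
    if len == 0 {
        1
    } else {
        (len + width - 1) / width
    }
}

/// Hypothetical index: `starts[i]` is the first wrapped row of logical line `i`.
struct WrapIndex {
    width: usize,
    starts: Vec<usize>,
    total_rows: usize,
}

impl WrapIndex {
    /// One pass over the line lengths; the line data itself is not rewritten.
    fn build(line_lens: &[usize], width: usize) -> Self {
        let mut starts = Vec::with_capacity(line_lens.len());
        let mut total_rows = 0;
        for &len in line_lens {
            starts.push(total_rows);
            total_rows += wrapped_rows(len, width);
        }
        WrapIndex { width, starts, total_rows }
    }

    /// Map a wrapped row to (logical line, cell offset within that line).
    fn locate(&self, row: usize) -> Option<(usize, usize)> {
        if row >= self.total_rows {
            return None;
        }
        // Last logical line whose first wrapped row is <= `row`.
        let line = self.starts.partition_point(|&s| s <= row) - 1;
        Some((line, (row - self.starts[line]) * self.width))
    }
}
```

For lines of 5, 0, and 12 cells, a width of 5 gives 5 wrapped rows and a width of 4 gives 6; in both cases only the `starts` table changes between resizes.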

cowens commented 1 year ago

Yeah, my thought is for the scrollback to be a data structure that has a configurable in-memory part and an on-disk part. Each on-disk part would be several physical screens' worth of lines; as the scrollback nears the end of the currently cached on-disk part, it would load the next (or prior, depending on the scroll direction) on-disk part. I envision the data on disk to be the literal, unwrapped lines (encrypted and compressed). There should also be an index file that tells you what the order of the files is. So when it reads in an on-disk part, it would need to rewrap it to match the current physical screen. This does mean the files will be a variable number of bytes long, and really long logical lines will wind up being wrapped to many more physical lines than very short lines (which will likely affect the performance of the scrollback).

If the size of the scrollback isn't unbounded, then we can kill the oldest file once the counter of on disk lines exceeds the limit. If we get an ENOSPC, then we can start deleting files at the oldest file and continue until we can write the current set of lines moving out of memory.
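
A rough sketch of that layout, using only the standard library, might look like the following. Everything here is hypothetical: the `HybridScrollback` type, its field names, and the `chunk-N.txt` file naming are invented for illustration. It stores plain unwrapped lines with no compression or encryption, assumes lines contain no embedded newlines, keeps the chunk order in memory rather than in an index file, and does not handle ENOSPC.

```rust
use std::collections::VecDeque;
use std::fs;
use std::io;
use std::path::PathBuf;

/// Hypothetical hybrid scrollback: newest lines in memory, older ones in chunk files.
struct HybridScrollback {
    dir: PathBuf,
    mem: VecDeque<String>,          // newest logical (unwrapped) lines
    mem_limit: usize,               // lines that stay in memory after a spill
    chunk_lines: usize,             // lines written per chunk file
    disk_limit: usize,              // maximum lines kept on disk
    chunks: VecDeque<(u64, usize)>, // (chunk id, line count), oldest first
    next_id: u64,
    disk_lines: usize,
}

impl HybridScrollback {
    fn new(dir: PathBuf, mem_limit: usize, chunk_lines: usize, disk_limit: usize) -> io::Result<Self> {
        fs::create_dir_all(&dir)?;
        Ok(Self {
            dir,
            mem: VecDeque::new(),
            mem_limit,
            chunk_lines,
            disk_limit,
            chunks: VecDeque::new(),
            next_id: 0,
            disk_lines: 0,
        })
    }

    fn chunk_path(&self, id: u64) -> PathBuf {
        self.dir.join(format!("chunk-{}.txt", id))
    }

    /// Append a line; spill the oldest in-memory lines one chunk at a time.
    fn push(&mut self, line: String) -> io::Result<()> {
        self.mem.push_back(line);
        while self.mem.len() >= self.mem_limit + self.chunk_lines {
            let mut data = String::new();
            for l in self.mem.drain(..self.chunk_lines) {
                data.push_str(&l);
                data.push('\n');
            }
            let id = self.next_id;
            self.next_id += 1;
            fs::write(self.chunk_path(id), data)?;
            self.chunks.push_back((id, self.chunk_lines));
            self.disk_lines += self.chunk_lines;
            self.evict()?;
        }
        Ok(())
    }

    /// Delete the oldest chunk files until the on-disk line count fits the limit.
    fn evict(&mut self) -> io::Result<()> {
        while self.disk_lines > self.disk_limit {
            match self.chunks.pop_front() {
                Some((id, n)) => {
                    fs::remove_file(self.chunk_path(id))?;
                    self.disk_lines -= n;
                }
                None => break,
            }
        }
        Ok(())
    }

    /// Load one chunk back as unwrapped lines; the caller would rewrap them.
    fn read_chunk(&self, id: u64) -> io::Result<Vec<String>> {
        let text = fs::read_to_string(self.chunk_path(id))?;
        Ok(text.lines().map(String::from).collect())
    }
}
```

An ENOSPC path could reuse the same oldest-first deletion loop, retrying the write after each removed chunk, but that is left out of the sketch.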

I think compression and encryption can wait until after the PoC and the ability to a portion of the scrollback is nice 2.0 feature.