support for text folding

albertz commented 5 years ago

It would be nice to support text folding, i.e. via some escape codes surrounding some block of text, that you can fold that away. Similar to e.g. how Travis does it (example).

Some related links:

Tyriar commented 5 years ago

I don't think we'd want to support this unless the sequence is widely used by programs and/or somewhat standardized.

albertz commented 5 years ago

Well, unless there is a terminal emulator which introduces such a feature, why would a program make use of it? So this is why a project like xterm.js would have to introduce such a feature. This is how it works for all kinds of extended escape codes (see e.g. my links here).

I thought that xterm.js wants to be a modern terminal emulator. By your logic, first some other (widely used) emulator would have to introduce such a feature, then it has to become widely used by programs, and only then xterm.js would adopt it. Which means that xterm.js would become kind of unmodern by then.

I'm quite sure that there are apps depending on xterm.js which would like to have such a feature, e.g. Hyper.

egmontkob commented 5 years ago

There are a gazillion terminal extensions out there. E.g. the linked DomTerm page lists maybe about 50 custom escape sequences. I think it's fair to leave it for each terminal emulator's developers to decide which ones they adopt and which ones they don't.

As a VTE developer lurking around here, I love the idea, but I really wish there had been some cooperation between some popular terminal emulators for the design.

It's unclear to me how the \e[16u and \e[17u escape sequences defining the arrows are coupled together with the \e[83;...u (*) ones, and what happens in cases when they don't arrive in the "expected" order (e.g. the 17 just doesn't come), or the cursor is moved in between (e.g. moved backwards between the opening and closing 83).

(*) The attached Java program doesn't print 83, only 16 and 17, so I probably don't understand something correctly. It's unclear to me what sequence specifies exactly the toggleable block.

I don't understand the 1 or 2 characters story between \e[16u and \e[17u, e.g. how is an emitting application supposed to know whether the terminal emulator knows the "show" counterpart of the specified "hide" character? Also, doesn't this construct introduce the first ever case in terminal emulation where a character outside of escape sequences does something other than print itself? Wouldn't it lead to unforeseen troubles? I'm also wondering why there's a need for this, and whether it would be better to let the terminal emulator pick its own preferred graphical representation (not necessarily something that could be printed by the app, perhaps some stock UI gadget).

The choice of the \e[...u framework is a pretty unfortunate one due to its potential clash with SCORC, in a similar way that DECSLRM kinda-sorta conflicts with SCOSC (see e.g. VTE 48).

My overall feeling is that DomTerm suddenly wants to do a whole lot of things at once that no terminal emulator did before (I, for one, would argue that nesting isn't necessary, it's more than enough to implement collapsing on the outermost level for the vast majority of use cases, more complex ones should be done by terminal-based or graphical applications rather than terminal emulators), and (impression based on the fresh comments at DomTerm 54) the way it does them is not really properly thought through and mature. These are just my feelings without having tried it or closely studied it, not a solid opinion.

Of course, only time can tell if a feature is going to be successful or not.

albertz commented 5 years ago

Yes, I agree with most of what you said (except the nesting: I think that this is useful and also simple, once you have that feature in any form).

This is why I created this issue: To discuss the possible options, and to ultimately get this feature (widely available in common terminal emulators).
I'm actually not too happy with DomTerm's implementation. I just linked it as a reference that there is already a terminal emulator which implemented this. I was not suggesting that xterm.js should adopt the same escape code. I'm actually a bit confused about the exact definition of it myself (DomTerm 54). I tried to make use of it in my own app, but either I misunderstand how to use it, or it's buggy, or both (by trial and error, I have some working solution now here).
If there are people who want to have such a feature, someone has to come up with a suggestion for an escape code for this, and some terminal emulator has to implement this (and hopefully others will follow).
- And there are definitely people who are interested in such a feature: @albertz, @egmontkob, Travis, DomTerm, Hyper, Final Term, and probably many more.
- Existing solutions are probably not optimal:
  - DomTerm has one suggestion. But as discussed, probably not ideal.
  - Not sure about Final Term.
  - Travis solution is probably not the best solution for terminal emulators (no escape code, just a custom string).
- I.e. someone should come up with a clean definition. I was suggesting (implicitly by my issue here) that xterm.js could lead this. Because only if a widely used terminal emulator introduces such a definition and feature, it has a chance to get adopted.
I.e. closing this issue here seems a bit strange to me. Does that mean that xterm.js does not want to lead such a role? I would suggest to reopen this.

egmontkob commented 5 years ago

Nice screenshots of FinalTerm, but you know the project is discontinued?

albertz commented 5 years ago

Yes I know. But this was also only for reference. I actually found it in the code now (search for collapse_button and is_prompt_line), and it seems that there is no special escape code for collapsing/folding, just for marking the prompt line. I definitely do not suggest to implement it like that.

Tyriar commented 5 years ago

"I'm also wondering why there's a need for this, and whether it would be better to let the terminal emulator pick its own preferred graphical representation (not necessarily something that could be printed by the app, perhaps some stock UI gadget)."

To me, generic folding doesn't seem that useful; that's why many programs have a verbose output option. What would be more useful imo is ways of flagging ranges, for example like iTerm's shell integration, which lets the terminal know where the commands are, allowing a terminal emulator to fold the output if it wishes. Something I wished existed earlier is the ability to flag a section of output with an alt text, so that screen readers read that instead of, for example, a graphical progress bar.

"I.e. closing this issue here seems a bit strange to me. Does that mean that xterm.js does not want to lead such a role? I would suggest to reopen this."

Terminals supporting whatever they want seems like it's just going to lead to a more fragmented mess. I would hope in the future some form of standards body would arise to move terminals forward, as right now they seem somewhat stuck in time. I'm certainly open to being involved in a more coordinated effort though.

@egmontkob am I right in my assessment of the world here? Any insights here as I'm still relatively new to the scene 😄

albertz commented 5 years ago

I uploaded my use case for folding here (scroll down to the screenshot in the Readme), which is a Python package to print a Python stack trace, with extended information. You see an example there for the standard MacOSX Terminal, and then a screencast with DomTerm, which uses folding. I think this is pretty useful, at least for me (it provides a much greater comprehensive overview of the stack trace, while still providing the details if you want to see them). And this is also an example that nesting can be useful (again, at least for me).

Btw, also Travis supports nested folding.

egmontkob commented 5 years ago

"I would hope in the future some form of standards body would arise to move terminals forward"

Some of us are working right now on creating such a collaborative platform, expect an announcement/invitation soon :)

Tyriar commented 5 years ago

@egmontkob 👌

jerch commented 5 years ago

Intriguing idea, here are my first 2cents/questions:

Where to put the folding sign in the view representation? If an emulator wants to support this it needs some way to lay things out so the user understands what's going on. A typical position for folding signs is another bar on the left side, as seen in almost every GUI code editor. Since a terminal is all about text interfaces this raises the question whether it should be part of the text view itself or an outer GUI driven element (much like a scroll bar in turbo vision vs. a scrollbar on the terminal view). Being part of the text itself (seems DomTerm does that) has several issues - it will mess with the pty line editor, thus the slave prog will have to unset ICANON and control the output itself or the terminal would have to resize and indent the wraps. In fact nothing is really gained here as this is already perfectly doable, just use a curses driven lib which deals with foldable paragraphs and does all the low level stuff for you (like insert/remove paragraphs and registering mouse/key actions). Doing the fold bar on a higher GUI level has issues as well - first, it's a waste of space if always on (and hardly being used). Second, it introduces "outer world" into a primarily text driven env - how should a fullscreen text only emulator render this? Linux console? They would have to sacrifice the first text column space and put some signs there. This again raises the question why this is not done by a slave side lib, if a prog really needs this. Nested folds (besides the question whether they are needed at all) raise the question how to represent the level. A simple solution would not deal with that and just put the fold signs in one column. Showing the level in a tree like thingy seems to be a complete waste of space to me.
How to deal with folds on incoming data or during other typical actions like copy & paste? Should the folded state be copyable? Is incoming data always folded, always unfolded, or is yet another sequence needed to define this?
How to deal with the scrollbuffer here? What happens to folds truncated by scrollbuffer limits? Shift the start marker downwards and remove it when it hits the corresponding end marker? This gets really funny for nested folds.
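
To make the scrollbuffer question a bit more concrete, here is a minimal Python sketch of one possible policy: when lines are dropped from the top of the buffer, every fold's start is shifted (and clamped at the new first line), and a fold is discarded once all of its lines are gone. The `Fold` type and `drop_top_lines` function are hypothetical names invented for this illustration; no emulator discussed here is known to work this way.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Fold:
    start: int  # buffer index of the first line of the fold
    end: int    # buffer index of the last line of the fold (inclusive)


def drop_top_lines(folds: List[Fold], dropped: int) -> List[Fold]:
    """Adjust fold ranges after `dropped` lines left the top of the buffer."""
    kept = []
    for fold in folds:
        new_end = fold.end - dropped
        if new_end < 0:
            continue  # the whole fold scrolled out, forget it
        # A partially truncated fold keeps its remaining lines;
        # its start marker is effectively shifted downwards.
        new_start = max(fold.start - dropped, 0)
        kept.append(Fold(new_start, new_end))
    return kept
```

Because each fold is adjusted independently, nested folds need no special handling under this policy, although whether a half-truncated fold should stay foldable at all is exactly the kind of thing a spec would have to decide.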

Those are only first surfacing thoughts/questions regarding a possible integration into what terminals do. Yet this "5m distance perspective" alone leads to cumbersome constellations, I think folding will only work with good user experience when properly spec'ed beforehand and implemented in similar ways across different emulators.

PerBothner commented 5 years ago

A "live" demo is here. This is actually a "Saved as HTML" snapshot, but wrapped in JavaScript so text folding and dynamic resize (pretty-printing) work on the snapshot, just as they would in a live terminal.

jerch commented 5 years ago

A bit more detailed this time - still a list of wild thoughts from my side with some proposals/ideas:

need for nestable folds: This is likely to interfere with all other aspects and would be good to clarify upfront.
visual representation: My vote would go to a single column thingy (fold bar); the way it shows up should be left to the emulator's needs (either as text column or as a GUI sidebar). Imho it should not be part of the active terminal view as it will mess way too much with the pty. Another question is whether a folded part should show some collapse info in the terminal besides the fold sign in the bar, and what should be shown there. Things that come to my mind here:
1. default to truncated first line of the folded content, maybe with some additional folded "markup"
2. make it customizable through the start marker
Imho text attributes should be preserved on folded content. Same goes for more complicated actions during input (like OSC/DCS commands) - their results should stay in place if they are bound to parts of that terminal buffer.
accessibility: We're gonna need at least 3 new key combinations:
- jump to next fold
- jump to prev fold
- toggle fold
This is not an easy task, as the combinations should not already be used by any other popular prog, and will prolly be occupied for the time being once introduced. Mouse support should only be optional, as pure text driven emulators might not have mouse support at all. The mouse should not interfere with registered mouse protocols in the terminal view, thus it's basically limited to the fold bar (maybe extend this optionally to the terminal view if no mouse protocols were registered by the slave prog).
behavior/integration with other terminal parts:
- copy&paste behavior: I would favour a WYSIWYG behaviour here, meaning that copying from a folded part of the terminal buffer would not include the folded content. I am pretty sure there are more use cases for an "always unfolded" copy behavior; imho this needs to be discussed with users' expectations in mind. The start/end marks should not be part of the copied content as they are just style hints for a terminal.
- scrollbuffer interaction: No clue yet myself how to deal with folds here; it basically boils down to the question whether to remove a fold at once or line by line from the scrollbuffer if the limit is hit. For this to work the scrollbuffer will have to be more clever than it used to be for most emulators. Not sure yet if a fold that spans the "scrollbuffer - active view" border will introduce issues; being able to jump with the cursor into a fold region might screw things up (just a wild guess atm).
- cursor state / position in terminal buffer (fold in the active terminal view): Folds in the active terminal view need some additional definitions, as they might mess up the buffer state when used with common cursor sequences. A question that arises is where to allow setting fold marks, or how to deal with them while the cursor is in the middle of a line. From the motivation of the fold idea it seems logical to only support folds on line level, thus a spec would have to cover this "faulty" input and propose some default action (like autowrap to next line on any fold marks). Furthermore, after the start mark was set the cursor could jump and place the end mark above that line; this also needs some state recovery (like swapping the closest marks). Additionally the cursor could span the marks over empty lines; the spec needs to tell whether those lines should be treated as always collapsed or "realized" (maybe with line feeds).

Last but not least a halfway failsafe escape sequence should be found. This would have to deal with all sorts of faulty states like dangling marks and such. If going with the start/end mark thing this also raises the question how to deal with the open start mark while the end mark is not set yet.
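
As an illustration of the state recovery mentioned above, the following Python sketch pairs line-level start/end marks and applies a few assumed default actions: an end mark placed above its start mark gets swapped, an end mark without an open start is ignored, and a dangling start mark never becomes a fold. The function name `pair_marks` and these specific rules are assumptions made for the example (it also deliberately ignores nesting); they are not taken from any existing spec.

```python
from typing import List, Optional, Tuple


def pair_marks(marks: List[Tuple[str, int]]) -> List[Tuple[int, int]]:
    """Turn a stream of ("start" | "end", line) marks into (first, last) folds."""
    folds: List[Tuple[int, int]] = []
    open_start: Optional[int] = None
    for kind, line in marks:
        if kind == "start":
            # A second start replaces a dangling one (no nesting in this sketch).
            open_start = line
        elif kind == "end" and open_start is not None:
            # The cursor may have moved upwards: swap so that first <= last.
            first, last = sorted((open_start, line))
            folds.append((first, last))
            open_start = None
        # An "end" without an open start is silently ignored.
    return folds
```

With these rules a still-open start mark simply has no effect until its end mark arrives, which is one possible answer to the "open start mark" question, at the cost of not being able to fold output that is still streaming in.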

jerch commented 5 years ago

@PerBothner Thx for the demo, looks pretty nice. Main question from my side - how do you deal with terminal size and the pty here? Modifying and formatting the data (inserting chars + autoindentation) without it being explicitly requested by the prog is way too much alteration of the original data for my taste.

PerBothner commented 5 years ago

"Main question from my side - how do you deal with terminal size and the pty here?"

The demo uses no pty and no server, except to serve static html, js, and css files.

The resizing/reflow all happens in the browser. It's similar to the reflow that some terminals (e.g. Gnome Terminal) do for wrapped lines - see issue #622. However, the application outputs markers into the output stream to mark structural elements. These are used to guide the line-breaking/re-flow - basically Lisp-style pretty-printing on-the-fly. These work even when the application (or the pty) is dead.

"Modifying and formatting the data (inserting chars + autoindentation) without explicitly requested by the prog is way to much alteration of the original data for my taste."

It is explicitly requested by the program, using a special escape sequence.

jerch commented 5 years ago

"The demo uses no pty and no server, except to serve static html, js, and css files."

So this is not meant to run with a real pty and a shell?

PerBothner commented 5 years ago

"So this is not meant to run with a real pty and a shell?"

The folding/pretty-printing feature is definitely meant to be used with a pty and a shell. However, the demo does not use a real pty and shell. Think of it like an animated gif recording of an actual pty+shell session - but it's interactive.

albertz commented 5 years ago

My comments:

I would vote for having the folding buttons in the text itself. That makes it easier on every side (the terminal emulator does not have to introduce any specific UI area for this), and also gives more control/freedom for the shell or the tool which wants to make use of this.
I think that nesting is very useful, and also not really problematic to support, in any of the possible cases.
I'm actually fine with mouse-only support. How is this with other features like hyperlinks? But introducing a keyboard shortcut should also not be too problematic (maybe just like some kind of focus for all kind of buttons, including also hyperlinks).
Copy & paste behavior: This was already discussed a bit here. I think both cases (copy only visible text, or copy all the text) can make sense under certain circumstances. That is why I think that there should be two separate escape codes, so that the app developer can choose what makes more sense for a particular use case.
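
The two copy behaviors discussed here (only what is visible vs. everything) can be sketched in a few lines of Python. This is only an illustration under assumptions: folds are line ranges, a collapsed fold keeps its first line visible as a summary, and `copy_text` with its `include_folded` flag is a hypothetical helper, not an API of any terminal.

```python
from typing import List, Set, Tuple


def copy_text(lines: List[str],
              folds: List[Tuple[int, int]],
              collapsed: Set[int],
              include_folded: bool) -> str:
    """Return the text a copy of the whole buffer would yield.

    folds:          (first, last) line ranges, inclusive
    collapsed:      indices into `folds` that are currently folded
    include_folded: True = "always unfolded" copy, False = WYSIWYG copy
    """
    hidden: Set[int] = set()
    if not include_folded:
        for index in collapsed:
            first, last = folds[index]
            # The first line stays visible; the rest of the fold is hidden.
            hidden.update(range(first + 1, last + 1))
    return "\n".join(line for i, line in enumerate(lines) if i not in hidden)
```

In such a design the flag could be chosen by the application (via the fold's escape code) or by the user at copy time; the thread leaves that question open.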

jerch commented 5 years ago

"I would vote for having the folding buttons in the text itself. That makes it easier on every side (the terminal emulator does not have to introduce any specific UI area for this), and also gives more control/freedom for the shell or the tool which wants to make use of this."

It would not. In default ICANON mode this would mess with the lines sent from the pty - the pty has a notion of the actual terminal size and might send extra control chars like '\r' when the end is reached; if the terminal decides to add chars on its own behalf for wotever reason this will fail badly. This is not a voting thing, it's more about being technically feasible. Since you mentioned the shell - from the perspective of how the responsibilities are currently shared between terminal and shell, this feature would suit the shell better than the terminal. Since we have no widespread shell doing it atm, I wonder if this is needed/highly requested at all.

PerBothner commented 5 years ago

An admitted problem with DomTerm's folding is that it isn't as clearly specified as I'd like. There are two main use-cases I'm focusing on, and the constraints are different:

Folding the output of a "command" in a shell or other REPL (along with input lines after the first). In this case we usually don't have the option of changing the shell internals, but we usually have the option of setting a prompt string that can be used to delimit commands, as well as distinguishing prompt, input, and output from each other. This is similar to some other terminals' "shell integration". In this case the fold button is in the prompt string, and the terminal uses the command delimiters to figure out what to hide.
A REPL that prints out some non-trivial data structure. For example the console of JavaScript debuggers in Chrome or Firefox. In this case nesting is obviously useful. When a fold button is pressed, DomTerm looks for "foldable sections" that are at the "same level" as the button. A foldable section can be delimited by an 83 escape code (see DomTerm spec) or a "logical pretty-printing block". The definition of "foldable section" needs better specification and documentation.

An enhancement of the latter is "lazy show". Some part of the output is hidden - and the application just sends a placeholder button, rather than the actual data. When the output is made visible, the terminal sends an escape sequence to the application, which responds with commands to update the newly-visible section of the output. This would be very useful for very large or "infinite" (cyclic) data structures. This is not implemented in DomTerm, but I have the outline of a protocol I can explain on request.

Not implemented, but @albertz has suggested/requested an option to specify a string name for both buttons and foldable sections: clicking on a named button flips all sections and buttons that have the same name. This enables one button to fold multiple related sections, even ones produced by no-longer-running applications. This is very general, but a bit more complicated for applications, so there should also be the simpler commands.
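
The named-section idea can be pictured with a small Python sketch. Since the feature is described as not implemented, everything here (the `Section` type, the `toggle` function, the state model) is a hypothetical illustration of the behavior "clicking a named button flips all sections with that name", not DomTerm code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Section:
    name: str                # string name attached by the application
    collapsed: bool = False  # current fold state kept by the terminal


def toggle(sections: List[Section], name: str) -> int:
    """Flip every section carrying `name`; return how many were flipped."""
    flipped = 0
    for section in sections:
        if section.name == name:
            section.collapsed = not section.collapsed
            flipped += 1
    return flipped
```

Because the state lives entirely in the terminal, this keeps working after the application that printed the sections has exited, which matches the motivation given above.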

PerBothner commented 5 years ago

"It would not. In default ICANON mode this would mess with the lines sent from the pty - the pty has a notion of the actual terminal size and might send extra control chars like '\r' when the end is reached, if the terminal decides to add chars on own behalf for wotever reason this will fail badly. This not a voting thing, its more about being technically feasible."

I think you misunderstand, at least how DomTerm does it. The application explicitly requests where to put the fold buttons. When it comes to input lines, a properly crafted prompt string includes space for the prompt button, so the readline (or similar) library can calculate the correct spacing, without knowing that it's a hide button - it's just a random Unicode character. (This assumes that the prompt string syntax has a way to specify non-printing characters, of course.) When it comes to folding of output, ICANON is irrelevant.

You could even use a double-width Unicode character for the fold button in a prompt string, as long as the input-editing library uses a suitable wcwidth implementation. In this case you need to make sure both the hide and the show character are the same width, but only include one of them in the "printing" part of the prompt string. (Or you can cheat: put two dummy single-column characters in the "printing" part of the prompt, and override them with escape codes or a styling option.)
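
An application could sanity-check the "same width" requirement before emitting a prompt. The Python sketch below approximates wcwidth using the standard library's `unicodedata.east_asian_width`; this is a simplification (it ignores combining and zero-width characters and treats ambiguous-width characters as narrow), and the helper names are made up for the example.

```python
import unicodedata


def cell_width(ch: str) -> int:
    """Rough column width of one character: 2 for Wide/Fullwidth, else 1."""
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def same_width(hide: str, show: str) -> bool:
    """True if the hide and show button characters occupy the same columns."""
    return cell_width(hide) == cell_width(show)
```

If the check fails, the line editor's idea of the cursor column would drift by one cell whenever the button flips, which is the situation the comment above warns about.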

jerch commented 5 years ago

Yes, my bad, ICANON will not affect this; it's libs like libreadline that are affected by this in their line calculations. And the space trick would make the needed room. Thanks for the clarification. Now what happens if that fold sign room ends up in one of the last line cells - how does the autoindentation deal with this?

PerBothner commented 5 years ago

"Now what happens if that fold sign room ends up in one if the last line cells - how does the autoindentation deal with this?"

Not sure what the concern is. The fold sign is just like any other printing character (perhaps styled differently) until you click on it. At which point it flips between the show/hide versions - and may show/hide subsequent text, which will force a re-flow for the affected lines.

There is no "autoindentation" per se. There are markers in the buffer, placed there by explicit escape sequences. The line-breaking algorithm may add or remove indentation (whitespace or special characters, as requested), depending on text folding and line width.

The line-breaking algorithm is a bit complicated (partly for performance reasons), so there could of course be bugs or under-specified corner cases.

PerBothner commented 5 years ago

In case it isn't clear: The demo isn't a pure text folding demo. It also makes use of an orthogonal feature: pretty-printing, which allows the application to specify line-breaking and indentation based on logical (application-specified) structure. All the auto-indentation is part of the pretty-printing feature set. While text folding is conceptually distinct from pretty-printing, they are of course designed to work well together.

sedwards2009 commented 5 years ago

I'm also the developer of a terminal emulator. Its goals are to extend and modernise the terminal environment with the kinds of features that albertz is talking about while at the same time remaining compatible with the applications in the terminal ecosystem.

Extraterm implements the "capturing command output" use case with its own shell integration which works via the shell prompts and/or pre-exec and post-exec hooks in the shell, (fish, bash and zsh). Command output is shown in a "frame" and separated from the surrounding text. You can also perform actions on the whole frame of output. Making it possible to fold it is on my TODO list.

I find this discussion quite interesting because I see the value in terminal applications being able to mark (nested) logical sections in their output as a hint to the terminal emulator.

Some quick thoughts:

Showing/hiding blocks of text should be done purely on the terminal side. If you need the remote application to be running to serve blocks of text via some protocol then you might as well just write an outline viewer using ncurses and implement all of the folding on the remote side.
This feature should not modify the terminal grid model by introducing extra characters or prompts or buttons which appear inside the character grid. This just makes it more complicated than it needs to be.
Although it is smart to consider how a terminal may implement the UI for this feature, any kind of spec should concentrate on semantics and not on whether an arrow is shown in the left side of the window.
It must be possible for the chosen escape sequence(s) to be ignored by terminals which don't understand it. The result is then the whole text in expanded form (i.e. nothing folded/hidden).
I find nesting or different levels to be useful. It is useful in text documents (i.e. heading level 1, heading level 2, heading level 3 etc), and I think it makes sense in other text output like software build tools.
Some way of marking a section as folded/hidden by default may be useful too. For example, the logging output of an application may hide log lines at info and debug severity levels by default.

I can see this feature working on a protocol level in a similar fashion as markdown headings. Each escape sequence marks the start of a level with a specified "depth". For example, imagine that the angle brackets represent the escape sequence in the VT stream:

This is text as the default level 0. This is text as the default level 0. This is text as the default level 0.

<level 1>The title of the level 1 block goes here
Text inside a level 1 block. Text inside a level 1 block. Text inside a level 1 block. 

<level 2>The title of the level 2 block goes here
Text inside a level 2 block. Text inside a level 2 block. Text inside a level 2 block. 

<level 2>Another block of text at level 2 with a title.
Text inside a level 2 block. Text inside a level 2 block. Text inside a level 2 block. 

<level 0>Back to the default level.
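
A rough Python sketch of how a terminal might interpret such depth markers follows. It keeps the angle-bracket notation from the example above as a stand-in for the real (still undefined) escape sequence; `tag_lines`, `render`, and the `fold_from` parameter are hypothetical names for this illustration only.

```python
import re
from typing import List, Tuple

# Stand-in for the escape sequence: "<level N>" at the start of a line.
MARKER = re.compile(r"^<level (\d+)>(.*)$")


def tag_lines(text: str) -> List[Tuple[int, bool, str]]:
    """Return (depth, is_title, content) for every line of the stream."""
    depth = 0
    tagged = []
    for line in text.splitlines():
        match = MARKER.match(line)
        if match:
            depth = int(match.group(1))
            tagged.append((depth, True, match.group(2)))
        else:
            tagged.append((depth, False, line))
    return tagged


def render(text: str, fold_from: int) -> List[str]:
    """Collapse all blocks at depth >= fold_from.

    A collapsed block at exactly fold_from still shows its title line;
    deeper blocks are hidden completely.
    """
    visible = []
    for depth, is_title, content in tag_lines(text):
        if depth < fold_from or (is_title and depth == fold_from):
            visible.append(content)
    return visible
```

A terminal that does not know the markers would simply never run this logic and show every line, which fits the requirement that unsupported terminals display the whole text in expanded form.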

PerBothner commented 5 years ago

@sedwards2009: "This feature should not modify the terminal grid model by introducing extra characters or prompts or buttons which appear inside the character grid. This just makes it more complicated than it needs to be."

I disagree:

If buttons cannot be in the character grid, that means the terminal must allocate space for a "gutter".
It may be desirable to have fold buttons indented from the left column, like in the JavaScript consoles for Chrome or Firefox.
It may be desirable to fold sections of text that are not complete lines. For example you might print a nested list that fits all on one line when sections are hidden, but will require multiple lines when everything is visible. In that case it is better to have the fold buttons after some initial text. Try my demo and adjust the window to both very wide and very narrow.

jerch commented 5 years ago

"If buttons cannot be in the character grid, that means the terminal must allocate space for a "gutter"."

That's imho a good thing as it does not mess with the terminal view content directly. I see tons of problems with having those "outer things" in an active terminal cell (c&p, cursor jump, ECH?). Most code editors do pretty well with line-limited collapsing (even nested) and still only need "one column" on the left side to offer/draw that functionality. Imho that's a pretty common interaction model, so it would meet people's expectations. (Btw since you have the fold sign on every collapsed line it's pretty much the same waste of space)

PerBothner commented 5 years ago

@sedwards2009: "It must be possible for the chosen escape sequence(s) to be ignored by terminals which don't understand it." I don't think this is a requirement, but it is of course desirable. It means that all text that should be visible on terminals that don't understand the feature should be outside the new escape sequences; all text or commands that should (only) be handled by a conforming terminal should be in an OSC or similar command.

The main issue is for fold buttons that appear inline (in the "character grid") and appear in a context where the application is tracking column widths, such as a prompt for an input line editor. In that case the prompt must include in the non-OSC text something as wide as the fold button.
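
To illustrate the "visible text outside, commands inside an OSC" rule, the Python sketch below removes OSC sequences from a stream, approximating what a terminal that ignores unknown OSC commands would display. The OSC number 7777 and its `start`/`end` payloads are invented for this example and are not an existing or proposed standard; it is also an assumption that a given terminal silently drops unknown OSC sequences, since not all of them do.

```python
import re

# An OSC sequence is ESC ] ... terminated by BEL or by ST (ESC \).
OSC = re.compile(r"\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)")


def fallback_view(stream: str) -> str:
    """Text left over when every OSC sequence in the stream is ignored."""
    return OSC.sub("", stream)


# Hypothetical fold markers wrapped around ordinary, always-visible text.
stream = (
    "\x1b]7777;start\x07"   # fold start, BEL-terminated
    "Build log\ncc a.c\n"
    "\x1b]7777;end\x1b\\"   # fold end, ST-terminated
    "done\n"
)
```

Here `fallback_view(stream)` yields the plain text with nothing missing and nothing extra. As noted above, the remaining difficulty is an inline fold button in a width-tracked context such as a prompt, where the non-OSC text must still reserve the button's columns.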

PerBothner commented 5 years ago

@jerch: "Btw since you have the fold sign on every collapsed line its pretty much the same waste of space" Not sure I understand what you mean by the "fold sign". For the most common use case of folding shell command output, DomTerm only requires a character cell for the fold button in the prompt text before the input line. All of the output lines are full width, with no special marking, and no cell reserved for folding. An application can add indentation in addition to the folding, but that is 100% up to the application.

jerch commented 5 years ago

"fold sign" --> "fold button", however one might call it...

I'm talking about this column: (screenshot)

Another reason why I don't like the inline fold signs is readability; the output gets somewhat scrambled by those triangles. That's not the case with a separation into its own dedicated column.

PerBothner commented 5 years ago

@jerch: That "column" isn't collapsed lines - those are all single-line shell input lines (comment lines). Most of the time a shell command will have some output lines - and those are full-width. For example this (old) screenshot.

sedwards2009 commented 5 years ago

PerBothner identified two main use cases for this kind of functionality, 1) folding command output, and 2) output of foldable tree data structures, i.e. from a REPL.

I think there is a 3rd distinct use case here and that is the example of Travis output as mentioned by albertz at the top of this issue. This is basically support for foldable "documents" where complete blocks of text are shown/hidden. Somewhat like the "outline view" in some document/text editors works. This use case is less demanding than 2) and wouldn't require "fold signs" to appear in the character grid, nor would it need to support fold areas inside lines.

Use case 1) is basically what iTerm2's and Extraterm's shell integration do. Its requirements are rather different from those of 2) and 3) and it should be handled separately from folding. (For example, Extraterm's shell integration sends extra data like the command being started and also things like the return code of the last command.)

Use case 3) "Travis", is what interests me the most because it would be very useful and has clear and obvious applications. Also if the escape codes are designed carefully then they could be safely used in the wild without causing havoc for people who use a terminal emulator which doesn't understand the new codes. This would increase the chances of terminal applications actually supporting and using the new fold codes.

I consider the solutions proposed so far for use case 2) to be too invasive and complicated. It extends the VT character grid model too far. I do like the idea of viewing different types of data inside a command line environment, such as a tree data structure. But in Extraterm at least I would support that by letting the remote end transmit a file with mimetype "text/json" etc and using a custom viewer at the terminal end to show it. I wouldn't try to shoehorn this into the VT character grid model.

Before we get bogged down in technical details we should at least agree on which use case we want to support and which ones are out-of-scope.

jerch commented 5 years ago

We should not forget that the tty/terminal is primarily for streamlined things, and does this pretty well with carefully crafted enhancements like color support. It's the whole success story of this oldish interface. Anything that goes into a document-driven direction imho belongs to a dedicated prog that handles this particular format. And for complicated data representation there is still the alt buffer and the curses lib.

Therefore I think any folding support should primarily focus on the streamline side and keep things simple:

no nested folds - easy to comprehend for application devs / easy to comprehend and interact with by users later on
folds only on line level - no mess with the grid/terminal view, still capable to do things like command collapsing if supported by the shell/prog currently used

I agree that a spec should not focus on how to represent things to the user, still I think it should be considered during spec'ing and maybe give some recommendations to get a concordant behavior across the emulators that want to implement it. And ofc the whole folding story must be optional, thus the escape sequences must be crafted in a way, that the text flow/readability does not suffer if an emulator does not support it.
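
A rough sketch of what "optional" could mean at the escape sequence level, assuming fold markers are sent as OSC sequences and that a non-supporting emulator silently discards OSC sequences it does not recognize. The OSC number 9999 and the `fold-start`/`fold-end` payloads are made up for illustration and are not part of any existing proposal. `strip_osc` only approximates what such an emulator would display.

```python
import re

ESC = "\x1b"
BEL = "\x07"
HYPOTHETICAL_OSC = 9999  # made-up number, not an allocated OSC code


def fold_block(title, lines):
    """Wrap whole lines in hypothetical fold-start/fold-end markers (no nesting)."""
    start = f"{ESC}]{HYPOTHETICAL_OSC};fold-start{BEL}"
    end = f"{ESC}]{HYPOTHETICAL_OSC};fold-end{BEL}"
    body = "".join(line + "\n" for line in lines)
    return f"{start}{title}\n{body}{end}"


# Matches ESC ] ... terminated either by BEL or by ST (ESC \).
OSC_PATTERN = re.compile(r"\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)")


def strip_osc(stream):
    """Approximate the visible text on an emulator that discards unknown OSC sequences."""
    return OSC_PATTERN.sub("", stream)


out = fold_block("$ make", ["cc -c a.c", "cc -c b.c"])
print(strip_osc(out), end="")
```

Because the markers sit on line boundaries and carry no visible payload, the text that remains after stripping is exactly the unfolded output.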

egmontkob commented 5 years ago

I've been thinking about it for a few days now (sorry, I've been too busy to comment), and while at first I was enthusiastic and the live demo indeed looks cool, having thought about it I'm much more on @jerch's and @sedwards2009's side.

While the demo is indeed cool, is it equally useful? Will users actually bother to click on those arrows to fold/unfold, will it save them a noticeable amount of time?

Both the functionality and the visual look (e.g. with the vertical, horizontal, angled and T-shaped line drawing characters) are quite arbitrary. Someone might come up with a somewhat different foldable treeview look and behavior, and question why it's not that one the terminals go for.

It's limited to a one-time view, whereas with a dedicated ncurses app one could, let's say, save and restore the folded state.

It's limited to a treeview, but there can be other similar things to consider, e.g. to place a long line of text inside a horizontally scrollable area, or have a slider for a number that specifies how much to fold, etc.

The terminal emulator is not a graphical UI toolkit, not something that offers widgets like a treeview, and IMO it shouldn't. This is the wrong level to solve this issue, the right level would be a dedicated application using ncurses or whatever similar (or a graphical app). It's orders of magnitude easier to solve it there (e.g. you get treeviews out of the box in real graphical toolkits), and the work doesn't have to be repeated across dozens of terminal emulators out there. (Mind you, being familiar with VTE's internals, I can tell for sure that it's a freaking huge amount of work to implement anything like this there.)

The current proposal tries to push interactive behavior from the app to the terminal, and this isn't a direction we should be taking. Defining and implementing the behavior belongs to the app; displaying the outcome of that behavior is the terminal's job.

With traditional terminal emulators, you always have to think: what happens if the cursor is repositioned to a given absolute location and data is overwritten there. What would happen here with the treeview feature? Would the absolute coordinate jump to a different logical place depending on whether the user has folded a section? Or would folding remain a view-only, and the terminal emulation always work on the unfolded layout?
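
To make the second option concrete, here is a minimal sketch of folding as a pure view layer, assuming line-level, non-nested folds where the first line of a folded range stays visible as a summary. The emulation keeps writing to the unfolded buffer, so absolute cursor addressing is unaffected; only the mapping to displayed rows changes. All names here are hypothetical.

```python
def visible_rows(total_rows, folded):
    """Return the buffer row indices that are displayed.

    `folded` is a list of (first, last) inclusive buffer-row ranges that the
    user has collapsed; the first row of each range stays visible as a summary.
    """
    hidden = set()
    for first, last in folded:
        hidden.update(range(first + 1, last + 1))
    return [row for row in range(total_rows) if row not in hidden]


def write_at(buffer, row, text):
    """Absolute write into the unfolded buffer; the folding state is not consulted."""
    buffer[row] = text


buffer = [f"line {i}" for i in range(6)]
folds = [(1, 3)]  # rows 1..3 collapsed, row 1 shown as the summary
write_at(buffer, 2, "overwritten while hidden")
shown = [buffer[r] for r in visible_rows(len(buffer), folds)]
print(shown)
```

In this model an application overwriting row 2 never needs to know whether the user folded it away; whether the emulator should then auto-unfold the block is a separate UI decision.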

Will terminal emulators and text viewers likely implement this feature? Will it work inside less? Inside screen or tmux?

I'm trying to compare this feature to explicit hyperlink support, regarding its estimated usefulness, cleanliness, amount of work required to support it, likely adoption etc. And it doesn't look good. Mind you, the explicit hyperlink feature isn't as popular as I deeply in my heart hoped it would be. less's author tracks it as a feature request, and my proof-of-concept demo no longer applies to the newest versions. tmux's author has rejected it for the time being. The explicit hyperlink feature, in my opinion, serves a clean goal, is useful in quite some use cases, and there's nothing arbitrary in its user facing side. Compared to that, from this foldable thingy I'd expect an even lower enthusiasm from people to adopt it, both from the terminal and from the application side, and I see a smaller number of use cases where an app could actually use this feature. Plus, this is combined with any decision being arbitrary, debatable, and if someone doesn't agree then they'll likely go with ncurses or a graphical app anyway, or propose further extensions to the already complicated and arbitrary terminal feature. I don't see the required efforts being worth it.

IMO let's not turn terminal emulators into more powerful UI toolkits.

What I believe could be a better approach, in multiple aspects, is to invent escape sequences that add semantics to the data stream. (This is I believe what e.g. shell integration in iTerm2 already does.) Such semantics could be the nested level. And then various terminal emulators could each experiment with ways of offering folding/unfolding, or quick jumping to the next/previous, or such, resulting in a healthy competition among terminals for the users' benefits.

About whether nesting is required or not: I'm still not sure. Let's assume that the default nesting level is the level of a single command (as e.g. seen in the top 5 lines of the DomTerm demo). It's a nice extension to tell the terminal emulator where a command's output starts and ends, and a terminal might make them foldable. But there could be a desire to have other indentation levels, both outside and inside. Inside a single command, one could define further indentation levels as seen in the DomTerm demo, or the output of any recursive command (like make, find, ls -R) could place such semantic marks. Outside, there could be levels for entire shells, ssh sessions etc.

It's unclear to me at this point whether escape sequences could define such levels in a safe, robust way, that is, allow all kinds of nesting combinations (e.g. java printNested inside ssh inside ssh inside zsh inside bash inside ssh...), in a way that recovers after an app crashes.

Tyriar commented 5 years ago

What I believe could be a better approach, in multiple aspects, is to invent escape sequences that add semantics to the data stream.

+💯, this sums up my feelings from this discussion. I implemented "command tracking" in vscode but right now it's just guessing what a command is and is very hit or miss. The lack of this semantic information is imo the main blocking factor that prevents taking terminals to the next level.

PerBothner commented 5 years ago

"What I believe could be a better approach, in multiple aspects, is to invent escape sequences that add semantics to the data stream."

DomTerm has escape sequences to delimit prompt, input and error text. (Output is presumed to be the rest.) See the CSI 12u, 11u, 14u, 13u, 18u, 15u in the linked-to documentation.

There are escape sequences to delimit command groups, which may be nested. See the OSC 119, 120, 121 commands. See here for suggestions on how to use these escapes. The basic usage is that each prompt specifies a 119 escape to indicate start of new group, and implicitly the end of an existing group with the same id. To support nested command groups, specify a group-id. This could be a process id for the shell. Optionally send enter/exit group commands when you start/end a shell, but just using the 119 escape with group-id seems to handle nesting fairly well, without explicit enter/exit commands.
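
The implicit-close rule can be sketched as follows. This models only the bookkeeping described above as one possible reading of it, not DomTerm's actual implementation or wire format: `start_group` stands in for receiving a 119 escape with a group-id, and the class and method names are invented for illustration.

```python
class GroupTracker:
    """Tracks open command groups as a stack of group-ids (innermost last)."""

    def __init__(self):
        self.open_groups = []
        self.closed = []  # ids in the order their groups were closed

    def start_group(self, group_id):
        # A start with an id that is already open implicitly ends that group,
        # together with anything still nested inside it.
        if group_id in self.open_groups:
            while self.open_groups:
                top = self.open_groups.pop()
                self.closed.append(top)
                if top == group_id:
                    break
        self.open_groups.append(group_id)


tracker = GroupTracker()
tracker.start_group("outer-shell")  # prompt of the outer shell
tracker.start_group("inner-shell")  # prompt of a shell started inside it
tracker.start_group("inner-shell")  # next inner prompt closes the previous inner group
tracker.start_group("outer-shell")  # back at the outer prompt: closes inner, then outer
print(tracker.open_groups, tracker.closed)
```

Using something like the shell's process id as the group-id is what lets the outer prompt close a leftover inner group even if the inner shell crashed without sending anything.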

It would be great if we could standardize these or similar commands.

sedwards2009 commented 5 years ago

Extraterm also has some custom escape sequences. I've designed the sequences with a degree of security paranoia in mind. Each sequence contains a cryptographically secure random "cookie" parameter. Each terminal tab in Extraterm requires a different cookie before it will accept the escape sequence. The cookie is available via an environment variable.

A remote application can use the escape sequences because it can read the cookie from the environment variable. But using cat to show the contents of a log file, for example, is still safe because it can't use or trigger the escape sequences. The log file contents won't have the cookie.

We trust applications, but we distrust any "data" which has found its way into the VT stream.
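
A minimal sketch of the cookie idea, not Extraterm's actual code. The environment variable name `TERM_COOKIE` and the function names are assumptions made for illustration; how the cookie is embedded in the escape sequence itself is left out.

```python
import hmac
import secrets


def new_tab_cookie():
    """Generate a fresh random cookie for one terminal tab."""
    return secrets.token_hex(16)


def child_environment(cookie):
    """Environment handed to the shell started in that tab (variable name is hypothetical)."""
    return {"TERM_COOKIE": cookie}


def accept_sequence(tab_cookie, presented_cookie):
    """Terminal side: honour a privileged sequence only if its cookie matches."""
    return hmac.compare_digest(tab_cookie, presented_cookie)


cookie = new_tab_cookie()
env = child_environment(cookie)

# An application reads the cookie from its environment and includes it.
print(accept_sequence(cookie, env["TERM_COOKIE"]))
# Bytes from a log file shown with cat cannot know the cookie.
print(accept_sequence(cookie, "guessed-value"))
```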

egmontkob commented 5 years ago

with a degree of security paranoia in mind

This sounds like pure and pointless paranoia to me. There are tons of other escape sequences that can occur in a text file and can mess with your terminal in various ways, e.g. trigger some responses as if they were typed from the keyboard, wipe the scrollback buffer, switch to weird character encoding, switch to invisible letters, invisible cursor and an unreadable color palette etc. These can all mess up the terminal, but are otherwise harmless. Incorrectly defining semantical blocks, e.g. the location of the prompt, isn't any more serious than these. (Some others, not implemented by many terminal emulators due to concerns, can even do more risky things like initiate a resize or move of the terminal window, set the clipboard contents etc.) It's a user error to cat (and not cat -v) an untrusted file and expect it never to mess up the terminal. In the mean time, if it's a log file, I'd argue that it's also an error in the producer of the log file if it can contain escape sequences; a log file shouldn't be untrusted data.
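
For illustration, a rough approximation of what `cat -v` does for ASCII control characters, so that escape sequences in an untrusted file are displayed rather than interpreted. This sketch leaves non-ASCII characters untouched, which the real tool does not; the function name is made up.

```python
def show_controls(text):
    """Render ASCII control characters visibly, keeping newline and tab."""
    out = []
    for ch in text:
        code = ord(ch)
        if ch in "\n\t":
            out.append(ch)
        elif code < 32:
            out.append("^" + chr(code + 64))  # ESC (27) becomes ^[
        elif code == 127:
            out.append("^?")
        else:
            out.append(ch)
    return "".join(out)


# SGR 8 ("conceal") is one of the "invisible letters" tricks mentioned above.
untrusted = "normal text \x1b[8mhidden\x1b[0m\n"
print(show_controls(untrusted), end="")
```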

sedwards2009 commented 5 years ago

Yes, I am aware that there are heaps of ways of messing up a terminal and making it unusable. I'm not really worried about things getting "messed up". I'm concerned about more serious security problems which could creep in as I add more features and escape sequences.

It's a user error

Allowing data dumped straight to the terminal to have access to potentially dangerous escape sequences is stupid when it can be avoided. There is no up-side to allowing this. There is no use case here. Building traps for the user and then blaming them when it goes wrong by calling it "a user error", is a horrid approach to security.

Err on the side of security first, and not the other way round is my advice.

egmontkob commented 5 years ago

There is no up-side to allowing this.

To make them work seamlessly across ssh, across su/sudo, inside detached and reattached screen/tmux. To make them recordable and replayable using script/scriptreplay. To make them redirectable to another terminal...

Security is important, but if it unconditionally trumps everything then I doubt you can end up with anything usable. You can't password-protect every command that let's say changes the color, or defines a foldable section, or whatever.

New escape sequences, new features should be added with care, and if in doubt with its security or privacy implications then rejected.

My advice is to stick to existing practice, treat security with common sense and not confuse it with paranoia.

Anyway, it's getting pretty off-topic...

PerBothner commented 5 years ago

@egmontkob: "While the demo is indeed cool, it is equally useful? Will users actually bother to click on those arrows to fold/unfold, will it save them a noticeable amount of time?"

Well, this feature appears to be common and used in debuggers - at least JavaScript debuggers. A goal for DomTerm is not just a solid terminal emulator, but also a toolkit for REPLs. Like some of the kinds of things people use Jupyter for. Output with images, mathematics, rich text. I'd love to be able to seamlessly switch between that kind of "REPL toolkit" and an xterm-compatible terminal, and mix-and-match their features.

"I consider the solutions proposed so far for [foldable tree data structures] to be too invasive and complicated."

I disagree - but it may be too complicated and esoteric to attempt to standardize it at this point.

However, it seems worthwhile to standardize a way to delimit prompts, inputs, and commands (possibly with nesting), in a way that can be put in a prompt string. Some terminals add an indication in a gutter area. Other terminals could add appropriate actions, such as folding, to the context (right-click) menu when the mouse is in the prompt area.

jerch commented 5 years ago

Yeah the semantic idea looks most promising to me, it would make it possible to deal with a bunch of purposes at once, yet let the emulators deal with the "how", which could range from ignoring the data to presenting it in a super special fancy way. Also competition in this field is good as emulators might try different solutions. Btw we already have some semantic escape sequences in OSC like the title thing, I would also count egmont's URL proposal as part of this field. This might already give some hints on how to lay out things on escape sequence level, discussing it here might be beyond this thread.

About security (slightly offtopic) - I don't see a higher security risk per se from a "semantic terminal", as it still works on the common unix rights system/privileges and as long as it is only dealing with data representation. Problems might arise though by tricking the user into doing unwanted things (see our first and hopefully last CVE, or search for malicious escape sequences). For semantic additions that deal with hiding content (like folding), this might be a problem:

#> echo "Hi <hide>" && maliciousCommandB && echo "</hide> there!"

If <hide> is an OSC sequence the casual terminal user can hardly tell anymore, if there is something fishy going on. Are we entering this world of security issues with it? Imho we already have this problem, many webpages propagate their super-dooper cmdline toolset via:

#> wget .... | bash
or really frightening:
#> wget ... | sudo bash

and no one really inspects that downloaded stuff beforehand, lol.

Edit: Btw the URL proposal has the same tricking peeps issue, but it does not harm anyone until the URL might get called (leaving the data representation only field). Thus a semantic addition that carries actions beyond the visual things might need extra security countermeasures.

egmontkob commented 5 years ago

> echo "Hi <hide>" && maliciousCommandB && echo "</hide> there!"

I don't think a semantical addition should be able to instruct the terminal to hide the contents. A semantical addition could inform the terminal: this is a prompt, this is the command entered after the prompt, etc. IMO they should all show up by default as before.

If the terminal offers a way to collapse a utility's output, and the user does so, there are still plenty of possibilities. Highlighting could copy the shown parts only. The emulator could auto-show a block if the contents within changes. Terminals could present a popup if the user copies hidden text. And so on. It's of course wise to think about these, and if we have a proposal, document the possible security issues we can foresee with various UI ideas that terminals might implement based on these sequences.

wget .... | bash [...] no one really inspects that downloaded stuff beforehand, lol

Same goes for when you download or git clone a repository and compile/install/run it.

URL proposal [...] does not harm anyone until the URL might get called

In the comment section of that proposal I argue that they shouldn't even cause harm if the URL is clicked, but IMO let's not derail this thread, this should be discussed over there if there are any remaining concerns.

jerch commented 5 years ago

I don't think a semantical addition should be able to instruct the terminal to hide the contents

Yeah it's orthogonal to the semantic thing, it still belongs to the folding idea this thread was started with.

Should we move the semantic discussion to another thread?

albertz commented 5 years ago

Independent of whether this is just about semantics or specifically about folding: also keep in mind that there are both use cases where you want to hide the text by default (see e.g. Travis, or my better exchook use case), and ones where you want to show it by default (e.g. any command output). You could have two separate escape codes for the two cases.

sedwards2009 commented 5 years ago

@egmontkob

Security

I think you are overestimating the impact of the scheme I described. I use this daily in Extraterm for a number of features and it works across sudo and ssh.

But it won't work through screen/tmux without changes to those apps, though that applies to almost every new escape sequence we could come up with.

I grant you that most escape sequence additions are likely to be benign and not require this kind of security approach. But at the same time I've already got some extra sequences in my terminal project which I wouldn't feel remotely comfortable having available to untrusted data.

sedwards2009 commented 5 years ago

@PerBothner

REPLs... Output with images, mathematics, rich text

I'm also interested in these use cases. The approach I'm leaning towards is allowing applications to send data in common formats (jpg, png, svg, etc) and display them in between VT output, effectively stopping the grid, inserting the image/whatever, and then creating a new empty grid with position (1,1) immediately under the outputted image.

sedwards2009 commented 5 years ago

Should we move the semantic discussion to another thread?

Yes please. Some kind of general mechanism for associating data and metadata (URLs, file paths, hostnames, git branches, etc) with things in the VT stream could be very useful. It would also be a huge discussion too.

egmontkob commented 5 years ago

@albertz

where you want to hide the text by default

I'd argue that this goes beyond the scope of defining semantics into defining the desired behavior, and as per my previous comments, I'd rather see this responsibility and functionality remaining at custom applications and not being pushed to terminal emulators.

PerBothner commented 5 years ago

@egmontkob : "I'd rather see [data structure folding] remaining at custom applications and not being pushed to terminal emulators."

A custom application running in a terminal using an ncurses-style library can't fold text except within the visible (non-scrolled) part of the output. I don't believe there are any xterm escape sequences to manage scrolling or navigate above the "home" line. Of course an application can repaint scrolled-out output, but it's not integrated with the scrollbar. It might be interesting to design a protocol to deal with scrollbars and large scroll regions, but I haven't seen anything like that.

@egmontkob : "The terminal emulator is not a graphical UI toolkit, not something that offers widgets like a treeview, and IMO it shouldn't. This is the wrong level to solve this issue, the right level would be a dedicated application using ncurses or whatever similar (or a graphical app)."

One way to use DomTerm is as a high-level GUI toolkit for writing rich REPL consoles. You call the toolkit using escape sequences rather than procedure calls. This is much easier and more powerful than using an ncurses-style library (if you're implementing a rich REPL console, not necessarily for other things you might use ncurses for). Plus you have the same programming interface for rich and plain terminals, just with a downgraded UI for the latter.

xtermjs / xterm.js

support for text folding #1875
