xtermjs / xterm.js

A terminal for the web
https://xtermjs.org/
MIT License

support for text folding #1875

Closed albertz closed 4 years ago

albertz commented 5 years ago

It would be nice to support text folding, i.e. some escape codes surrounding a block of text so that you can fold it away, similar to how Travis does it (example).

Some related links:

Tyriar commented 5 years ago

I don't think we'd want to support this unless the sequence is widely used by programs and/or somewhat standardized.

albertz commented 5 years ago

Well, unless there is a terminal emulator which introduces such a feature, why would a program make use of it? So this is why a project like xterm.js would have to introduce such a feature. This is how it works for all kinds of extended escape codes (see e.g. my links here).

I thought that xterm.js wants to be a modern terminal emulator. By your logic, first some other (widely used) emulator would have to introduce such a feature, then it has to become widely used by programs, and only then xterm.js would adopt it. Which means that xterm.js would become kind of unmodern by then.

I'm quite sure that there are apps depending on xterm.js which would like to have such a feature, e.g. Hyper.

egmontkob commented 5 years ago

There are a gazillion terminal extensions out there. E.g. the linked DomTerm page lists maybe about 50 custom escape sequences. I think it's fair to leave it to each terminal emulator's developers to decide which ones they adopt and which ones they don't.

As a VTE developer lurking around here, I love the idea, but I really wish there had been some cooperation between some popular terminal emulators for the design.

It's unclear to me how the \e[16u and \e[17u escape sequences defining the arrows are coupled together with the \e[83;...u (*) ones, and what happens in cases when they don't arrive in the "expected" order (e.g. the 17 just doesn't come), or the cursor is moved in between (e.g. moved backwards between the opening and closing 83).

(*) The attached Java program doesn't print 83, only 16 and 17, so I probably don't understand something correctly. It's unclear to me what sequence specifies exactly the toggleable block.

I don't understand the 1 or 2 characters story between \e[16u and \e[17u, e.g. how is an emitting application supposed to know whether the terminal emulators knows the "show" counterpart of the specified "hide" character? Also, doesn't this construct introduce the first ever case in terminal emulation when a character outside of escape sequences does something else than prints itself? Wouldn't it lead to unforeseen troubles? I'm also wondering why there's a need for this, and whether it would be better to let the terminal emulator pick its own preferred graphical representation (not necessarily something that could be printed by the app, perhaps some stock UI gadget).

The choice of the \e[...u framework is a pretty unfortunate one due to its potential clash with SCORC, in a similar way that DECSLRM kinda-sorta conflicts with SCOSC (see e.g. VTE 48).
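
A minimal sketch of the potential clash described above, assuming only what the thread states (DomTerm uses parameterised `\e[...u` sequences such as `\e[16u` and `\e[17u`) plus the fact that SCORC, the SCO "restore cursor" sequence, is a bare `\e[u`. The function name `classify_csi_u` is hypothetical. Both kinds share the final byte `u`, so a parser that dispatches on the final byte alone cannot tell them apart.

```python
import re

# CSI is ESC [ followed by parameter bytes (0x30-0x3F),
# intermediate bytes (0x20-0x2F) and one final byte (0x40-0x7E).
CSI_RE = re.compile(r"\x1b\[([0-?]*)([ -/]*)([@-~])")


def classify_csi_u(data: str) -> list[tuple[str, str]]:
    """Return (kind, params) for every CSI sequence whose final byte is 'u'."""
    found = []
    for match in CSI_RE.finditer(data):
        params, _intermediates, final = match.groups()
        if final != "u":
            continue
        if params == "":
            # Bare CSI u is SCORC (restore cursor).
            found.append(("SCORC", params))
        else:
            # Parameterised CSI ... u is the space the extension uses.
            found.append(("extension", params))
    return found


# Save cursor, text, restore cursor, then a fold-button pair as in the thread.
stream = "\x1b[s" + "hello" + "\x1b[u" + "\x1b[16u" + "▶" + "\x1b[17u"
print(classify_csi_u(stream))
# → [('SCORC', ''), ('extension', '16'), ('extension', '17')]
```

The distinction rests entirely on whether parameters are present, which is why the thread calls the choice of the `u` final byte unfortunate.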

My overall feeling is that DomTerm suddenly wants to do a whole lot of things at once that no terminal emulator did before (I, for one, would argue that nesting isn't necessary; it's more than enough to implement collapsing on the outermost level for the vast majority of use cases, and more complex ones should be done by terminal-based or graphical applications rather than terminal emulators), and (an impression based on the fresh comments at DomTerm 54) the way it does them is not really properly thought through and mature. These are just my feelings without having tried it or closely studied it, not a solid opinion.

Of course, only time can tell if a feature is going to be successful or not.

albertz commented 5 years ago

Yes, I agree with most of what you said (except the nesting: I think that this is useful and also simple, once you have that feature in any form).

egmontkob commented 5 years ago

Nice screenshots of FinalTerm, but you know the project is discontinued?

albertz commented 5 years ago

Yes I know. But this was also only for reference. I actually found it in the code now (search for collapse_button and is_prompt_line), and it seems that there is no special escape code for collapsing/folding, just for marking the prompt line. I definitely do not suggest to implement it like that.

Tyriar commented 5 years ago

"I'm also wondering why there's a need for this, and whether it would be better to let the terminal emulator pick its own preferred graphical representation (not necessarily something that could be printed by the app, perhaps some stock UI gadget)."

To me, generic folding doesn't seem to be that useful; that's why many programs have a verbose output option. What would be more useful imo is a way of flagging ranges, for example like iTerm's shell integration, which lets the terminal know where the commands are, allowing a terminal emulator to fold the output if it wishes. Something I wished for earlier was the ability to flag a section of output with an alt text, so that screen readers read that instead of, for example, a graphical progress bar.
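
A rough sketch of what "flagging ranges" can look like on the receiving side. It assumes the FinalTerm-style `OSC 133 ; A/B/C/D` marks that iTerm2's shell integration is generally documented as emitting (prompt start, command start, output start, command finished with exit code); the exact letters and arguments are an assumption to verify against the iTerm2 documentation, and `split_commands` is a hypothetical helper.

```python
import re

# OSC 133 ; <letter> [; argument], terminated by BEL or by ST (ESC \).
MARK_RE = re.compile(r"\x1b\]133;([A-D])(?:;([^\x07\x1b]*))?(?:\x07|\x1b\\)")


def split_commands(stream: str) -> list[dict]:
    """Split a stream into {'command', 'output', 'exit'} records using the marks."""
    records = []
    current = None   # record being filled in
    state = None     # letter of the most recent mark
    pos = 0
    for match in MARK_RE.finditer(stream):
        text = stream[pos:match.start()]
        pos = match.end()
        if current is not None:
            if state == "B":        # text typed by the user
                current["command"] += text
            elif state == "C":      # text printed by the command
                current["output"] += text
        kind, arg = match.group(1), match.group(2)
        if kind == "A":             # a new prompt starts a new record
            current = {"command": "", "output": "", "exit": None}
            records.append(current)
        elif kind == "D" and current is not None:
            current["exit"] = int(arg) if arg and arg.isdigit() else None
        state = kind
    return records


stream = (
    "\x1b]133;A\x07$ \x1b]133;B\x07ls\n"
    "\x1b]133;C\x07a.txt\nb.txt\n\x1b]133;D;0\x07"
)
print(split_commands(stream))
# → [{'command': 'ls\n', 'output': 'a.txt\nb.txt\n', 'exit': 0}]
```

Once the terminal knows each output range, folding becomes a purely local presentation decision; the application never has to ask for it.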

"I.e. closing this issue here seems a bit strange to me. Does that mean that xterm.js does not want to lead such a role? I would suggest to reopen this."

Terminals supporting whatever they want seems like it's just going to lead to a more fragmented mess. I would hope in the future some form of standards body would arise to move terminals forward, as right now they seem somewhat stuck in time. I'm certainly open to being involved in a more coordinated effort though.

@egmontkob am I right in my assessment of the world here? Any insights here as I'm still relatively new to the scene 😄

albertz commented 5 years ago

I uploaded my use case for folding here (scroll down to the screenshot in the Readme), which is a Python package to print a Python stack trace, with extended information. You see an example there for the standard MacOSX Terminal, and then a screencast with DomTerm, which uses folding. I think this is pretty useful, at least for me (it provides a much greater comprehensive overview of the stack trace, while still providing the details if you want to see them). And this is also an example that nesting can be useful (again, at least for me).

Btw, also Travis supports nested folding.

egmontkob commented 5 years ago

"I would hope in the future some form of standards body would arise to move terminals forward"

Some of us are working right now on creating such a collaborative platform, expect an announcement/invitation soon :)

Tyriar commented 5 years ago

@egmontkob 👌

jerch commented 5 years ago

Intriguing idea, here are my first 2 cents/questions:

Those are only the first thoughts/questions that surface regarding a possible integration into what terminals do. Yet this "5m distance perspective" alone leads to cumbersome constellations. I think folding will only work with a good user experience when properly spec'ed beforehand and implemented in similar ways across different emulators.

PerBothner commented 5 years ago

A "live" demo is here. This is actually a "Saved as HTML" snapshot, but wrapped in JavaScript so text folding and dynamic resize (pretty-printing) work on the snapshot, just as they would in a live terminal.

jerch commented 5 years ago

A bit more detailed this time - still a list of wild thoughts from my side with some proposals/ideas:

Last but not least a halfway failsafe escape sequence should be found. This would have to deal with all sorts of faulty states like dangling marks and such. If going with the start/end mark thing this also raises the question how to deal with the open start mark while the end mark is not set yet.

jerch commented 5 years ago

@PerBothner Thx for the demo, looks pretty nice. Main question from my side - how do you deal with terminal size and the pty here? Modifying and formatting the data (inserting chars + autoindentation) without it being explicitly requested by the program is way too much alteration of the original data for my taste.

PerBothner commented 5 years ago

"Main question from my side - how do you deal with terminal size and the pty here?"

The demo uses no pty and no server, except to serve static html, js, and css files.

The resizing/reflow all happens in the browser. It's similar to the reflow that some terminals (e.g. Gnome Terminal) do for wrapped lines - see issue #622. However, the application outputs markers into the output stream to mark structural elements. These are used to guide the line-breaking/re-flow - basically Lisp-style pretty-printing on-the-fly. These work even when the application (or the pty) is dead.

"Modifying and formatting the data (inserting chars + autoindentation) without it being explicitly requested by the program is way too much alteration of the original data for my taste."

It is explicitly requested by the program, using a special escape sequence.

jerch commented 5 years ago

"The demo uses no pty and no server, except to serve static html, js, and css files."

So this is not meant to run with a real pty and a shell?

PerBothner commented 5 years ago

"So this is not meant to run with a real pty and a shell?"

The folding/pretty-printing feature is definitely meant to be used with a pty and a shell. However, the demo does not use a real pty and shell. Think of it like an animated gif recording of an actual pty+shell session - but it's interactive.

albertz commented 5 years ago

My comments:

jerch commented 5 years ago

"I would vote for having the folding buttons in the text itself. That makes it easier on every side (the terminal emulator does not have to introduce any specific UI area for this), and also gives more control/freedom for the shell or the tool which wants to make use of this."

It would not. In default ICANON mode this would mess with the lines sent from the pty - the pty has a notion of the actual terminal size and might send extra control chars like '\r' when the end is reached; if the terminal decides to add chars on its own behalf for whatever reason, this will fail badly. This is not a voting thing, it's more about being technically feasible. Since you mentioned the shell - from the perspective of how the responsibilities are currently shared between terminal and shell, this feature would suit the shell better than the terminal. Since we have no widespread shell doing it atm, I wonder if this is needed/highly requested at all.

PerBothner commented 5 years ago

An admitted problem with DomTerm's folding is that it isn't as clearly specified as I'd like. There are two main use-cases I'm focusing on, and the constraints are different:

An enhancement of the latter is "lazy show". Some part of the output is hidden - and the application just sends a placeholder button, rather than the actual data. When the output is made visible, the terminal sends an escape sequence to the application, which responds with commands to update the newly-visible section of the output. This would be very useful for very large or "infinite" (cyclic) data structures. This is not implemented in DomTerm, but I have the outline of a protocol I can explain on request.

PerBothner commented 5 years ago

"It would not. In default ICANON mode this would mess with the lines sent from the pty - the pty has a notion of the actual terminal size and might send extra control chars like '\r' when the end is reached; if the terminal decides to add chars on its own behalf for whatever reason, this will fail badly. This is not a voting thing, it's more about being technically feasible."

I think you misunderstand, at least how DomTerm does it. The application explicitly requests where to put the fold buttons. When it comes to input lines, a properly crafted prompt string includes space for the prompt button, so the readline (or similar) library can calculate the correct spacing, without knowing that it's a hide button - it's just a random Unicode character. (This assumes that the prompt string syntax has a way to specify non-printing characters, of course.) When it comes to folding of output, ICANON is irrelevant.

You could even use a double-width Unicode character for the fold button in a prompt string, as long as the input-editing library uses a suitable wcwidth implementation. In this case you need to make sure both hide and show characters are the same width, but only include one of them in the "printing" part of the prompt string. (Or you can cheat: put two dummy single-column characters in the "printing" part of the prompt, and override them with escape codes or a styling option.)
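
A small sketch of the width bookkeeping described above. It relies on readline's convention that prompt text between `\001` and `\002` is treated as non-printing; the placement of `\e[16u` and `\e[17u` around the button glyph is an assumption taken from the earlier description in this thread, and `char_width` is a rough stand-in for a real wcwidth implementation.

```python
import unicodedata

# readline treats prompt text between \001 and \002 as zero-width.
START_IGNORE, END_IGNORE = "\001", "\002"


def char_width(ch: str) -> int:
    """Rough wcwidth: 0 for combining marks, 2 for wide/fullwidth, else 1."""
    if unicodedata.combining(ch):
        return 0
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def visible_width(prompt: str) -> int:
    """Columns the line editor believes the prompt occupies."""
    width, ignoring = 0, False
    for ch in prompt:
        if ch == START_IGNORE:
            ignoring = True
        elif ch == END_IGNORE:
            ignoring = False
        elif not ignoring:
            width += char_width(ch)
    return width


def ignored(sequence: str) -> str:
    """Wrap an escape sequence so the line editor does not count it."""
    return START_IGNORE + sequence + END_IGNORE


# Hypothetical prompt: the escapes are hidden from the editor,
# the button glyph itself is counted like any other character.
prompt = ignored("\x1b[16u") + "▼" + ignored("\x1b[17u") + " $ "
print(visible_width(prompt))
# → 4
```

The line editor never needs to know that the triangle is a button; it only needs the counted width to match what the terminal draws.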

jerch commented 5 years ago

Yes, my bad, ICANON will not affect this; it's libs like libreadline that are affected by this in their line calculations. And the space trick would make the needed room. Thanks for the clarification. Now what happens if that fold sign room ends up in one of the last line cells - how does the autoindentation deal with this?

PerBothner commented 5 years ago

"Now what happens if that fold sign room ends up in one of the last line cells - how does the autoindentation deal with this?"

Not sure what the concern is. The fold sign is just like any other printing character (perhaps styled differently) until you click on it. At which point it flips between the show/hide versions - and may show/hide subsequent text, which will force a re-flow for the affected lines.

There is no "autoindentation" per se. There are markers in the buffer, placed there by explicit escape sequences. The line-breaking algorithm may add or remove indentation (whitespace or special characters, as requested), depending on text folding and line width.

The line-breaking algorithm is a bit complicated (partly for performance reasons), so there could of course be bugs or under-specified corner cases.

PerBothner commented 5 years ago

In case it isn't clear: The demo isn't a pure text folding demo. It also makes use of an orthogonal feature: pretty-printing, which allows the application to specify line-breaking and indentation based on logical (application-specified) structure. All the auto-indentation is part of the pretty-printing feature set. While text folding is conceptually distinct from pretty-printing, they are of course designed to work well together.

sedwards2009 commented 5 years ago

I'm also the developer of a terminal emulator. Its goals are to extend and modernise the terminal environment with the kinds of features that albertz is talking about, while at the same time remaining compatible with the applications in the terminal ecosystem.

Extraterm implements the "capturing command output" use case with its own shell integration which works via the shell prompts and/or pre-exec and post-exec hooks in the shell, (fish, bash and zsh). Command output is shown in a "frame" and separated from the surrounding text. You can also perform actions on the whole frame of output. Making it possible to fold it is on my TODO list.

I find this discussion quite interesting because I see the value in terminal applications being able to mark (nested) logical sections in their output as a hint to the terminal emulator.

Some quick thoughts:

I can see this feature working on a protocol level in a similar fashion as markdown headings. Each escape sequence marks the start of a level with a specified "depth". For example, imagine that the angle brackets represent the escape sequence in the VT stream:

This is text as the default level 0. This is text as the default level 0. This is text as the default level 0.

<level 1>The title of the level 1 block goes here
Text inside a level 1 block. Text inside a level 1 block. Text inside a level 1 block. 

<level 2>The title of the level 2 block goes here
Text inside a level 2 block. Text inside a level 2 block. Text inside a level 2 block. 

<level 2>Another block of text at level 2 with a title.
Text inside a level 2 block. Text inside a level 2 block. Text inside a level 2 block. 

<level 0>Back to the default level.
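
The heading-style scheme above could be consumed roughly as follows. This is only a sketch: the literal `<level N>` text stands in for whatever escape sequence would actually be chosen, and `Block` and `parse_levels` are hypothetical names.

```python
import re
from dataclasses import dataclass, field

# "<level N>" is a stand-in for the real (still undefined) escape sequence.
LEVEL_RE = re.compile(r"^<level (\d+)>(.*)$")


@dataclass
class Block:
    level: int
    title: str
    lines: list = field(default_factory=list)      # plain text lines
    children: list = field(default_factory=list)   # nested Block objects


def parse_levels(text: str) -> Block:
    """Build a tree of foldable blocks from level markers."""
    root = Block(level=0, title="")
    stack = [root]
    for line in text.splitlines():
        match = LEVEL_RE.match(line)
        if not match:
            stack[-1].lines.append(line)
            continue
        level, title = int(match.group(1)), match.group(2)
        # A marker at depth N closes every open block at depth >= N.
        while len(stack) > 1 and stack[-1].level >= level:
            stack.pop()
        if level == 0:
            root.lines.append(title)    # back to the default level
        else:
            block = Block(level, title)
            stack[-1].children.append(block)
            stack.append(block)
    return root


sample = "intro\n<level 1>Build\nstep\n<level 2>Compile\ncc\n<level 2>Link\nld\n<level 0>done"
tree = parse_levels(sample)
print([child.title for child in tree.children[0].children])
# → ['Compile', 'Link']
```

Because a marker only ever opens a level, there is no closing marker to go missing: a block ends when the next marker of the same or a shallower depth arrives.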

PerBothner commented 5 years ago

@sedwards2009: "This feature should not modify the terminal grid model by introducing extra characters or prompts or buttons which appear inside the character grid. This just makes it more complicated than it needs to be."

I disagree:

jerch commented 5 years ago

"If buttons cannot be in the character grid, that means the terminal must allocate space for a "gutter"."

That's imho a good thing as it does not mess with the terminal view content directly. I see tons of problems with having those "outer things" in an active terminal cell (c&p, cursor jump, ECH?). Most code editors do pretty well with line-limited collapsing (even nested) and still only need "one column" on the left side to offer/draw that functionality. Imho that's a pretty common interaction model, so it would meet people's expectations. (Btw since you have the fold sign on every collapsed line it's pretty much the same waste of space)

PerBothner commented 5 years ago

@sedwards2009: "It must be possible for the chosen escape sequence(s) to be ignored by terminals which don't understand it." I don't think this is a requirement, but it is of course desirable. It means that all text that should be visible on terminals that don't understand the feature should be outside new escape sequences; all text or commands that should (only) be handled by a conforming terminal should be in an OSC or similar command.

The main issue is for fold buttons that appear inline (in the "character grid") and appear in a context where the application is tracking column widths, such as a prompt for an input line editor. In that case the prompt must include in the non-OSC text something as wide as the fold button.
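
A sketch of the fallback behaviour discussed above, assuming a terminal that parses and silently discards OSC strings it does not recognise. The OSC number `9999` and the `fold-start`/`fold-end` payloads are invented for illustration and do not correspond to any real terminal.

```python
import re

# An OSC string runs from ESC ] up to BEL or ST (ESC \).
OSC_RE = re.compile(r"\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)")


def as_seen_by_legacy_terminal(stream: str) -> str:
    """Text left on screen when every OSC string is parsed and dropped."""
    return OSC_RE.sub("", stream)


# Hypothetical fold markers; the button glyph and its trailing space are
# ordinary text, so they take up the same columns on every terminal.
stream = (
    "\x1b]9999;fold-start\x07"
    + "▶ " + "build log\n"
    + "line 1\nline 2\n"
    + "\x1b]9999;fold-end\x07"
)
print(repr(as_seen_by_legacy_terminal(stream)))
# → '▶ build log\nline 1\nline 2\n'
```

Everything that must remain readable sits outside the OSC strings, so an application tracking column widths sees the same layout whether or not the terminal understands the markers.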

PerBothner commented 5 years ago

@jerch: "Btw since you have the fold sign on every collapsed line it's pretty much the same waste of space" Not sure I understand what you mean by the "fold sign". For the most common use case of folding shell command output, DomTerm only requires a character cell for the fold button in the prompt text before the input line. All of the output lines are full width, with no special marking, and no cell reserved for folding. An application can add indentation in addition to the folding, but that is 100% up to the application.

jerch commented 5 years ago

"fold sign" --> "fold button", however one might call it...

I'm talking about this column: (image)

Another reason why I don't like the inline fold signs is readability; the output gets somewhat scrambled by those triangles. That's not the case with a separation into its own dedicated column.

PerBothner commented 5 years ago

@jerch: That "column" isn't collapsed lines - those are all single-line shell input lines (comment lines). Most of the time a shell command will have some output lines - and those are full-width. For example this (old) screenshot.

sedwards2009 commented 5 years ago

PerBothner identified two main use cases for this kind of functionality, 1) folding command output, and 2) output of foldable tree data structures, i.e. from a REPL.

I think there is a 3rd distinct use case here and that is the example of Travis output as mentioned by albertz at the top of this issue. This is basically support for foldable "documents" where complete blocks of text are shown/hidden. Somewhat like the "outline view" in some document/text editors works. This use case is less demanding than 2) and wouldn't require "fold signs" to appear in the character grid, nor would it need to support fold areas inside lines.

Use case 1) is basically what iTerm2's and Extraterm's shell integration does. Its requirements are rather different from 2) and 3) and it should be handled separately from folding. (For example, Extraterm's shell integration sends extra data like the command being started and also things like the return code of the last command.)

Use case 3) "Travis", is what interests me the most because it would be very useful and has clear and obvious applications. Also if the escape codes are designed carefully then they could be safely used in the wild without causing havoc for people who use a terminal emulator which doesn't understand the new codes. This would increase the chances of terminal applications actually supporting and using the new fold codes.

I consider the solutions proposed so far for use case 2) to be too invasive and complicated. It extends the VT character grid model too far. I do like the idea of viewing different types of data inside a command line environment, such as a tree data structure. But in Extraterm at least I would support that by letting the remote end transmit a file with mimetype "text/json" etc and using a custom viewer at the terminal end to show it. I wouldn't try to shoehorn this into the VT character grid model.

Before we get bogged down in technical details we should at least agree on which use case we want to support and which ones are out-of-scope.

jerch commented 5 years ago

We should not forget that the tty/terminal is primarily for streamlined things, and does this pretty well with carefully crafted enhancements like color support. That's the whole success story of this oldish interface. Anything that goes in a document-driven direction imho belongs to a dedicated prog that handles this particular format. And for complicated data representation there is still the alt buffer and the curses lib.

Therefore I think any folding support should primarily focus on the streamline side and keep things simple:

I agree that a spec should not focus on how to represent things to the user, still I think it should be considered during spec'ing and maybe give some recommendations to get a concordant behavior across the emulators that want to implement it. And ofc the whole folding story must be optional, thus the escape sequences must be crafted in a way, that the text flow/readability does not suffer if an emulator does not support it.

egmontkob commented 5 years ago

I've been thinking on it for a few days now (sorry, I've been busy to comment), and while first I was enthusiastic and the live demo indeed looks cool, having thought about it I'm much more on @jerch's and @sedwards2009's side.

While the demo is indeed cool, is it equally useful? Will users actually bother to click on those arrows to fold/unfold, will it save them a noticeable amount of time?

Both the functionality and the visual look (e.g. with the vertical, horizontal, angled and T-shaped line drawing characters) is quite arbitrary. Someone might come up with a somewhat different foldable treeview look and behavior, and question why it's not that one the terminals go for.

It's limited to a one-time view, whereas with a dedicated ncurses app one could, let's say, save and restore the folded state.

It's limited to a treeview, but there can be other similar things to consider, e.g. to place a long line of text inside a horizontally scrollable area, or have a slider for a number that specifies how much to fold, etc.

The terminal emulator is not a graphical UI toolkit, not something that offers widgets like a treeview, and IMO it shouldn't. This is the wrong level to solve this issue, the right level would be a dedicated application using ncurses or whatever similar (or a graphical app). It's magnitudes easier to solve it there (e.g. you get treeviews out of the box in real graphical toolkits), and the work doesn't have to be repeated across dozens of terminal emulators out there. (Mind you, being familiar with VTE's internals, I can tell for sure that it's a freaking huge amount of work to implement anything like this there.)

The current proposal tries to push interactive behavior from the app to the terminal, and this isn't a direction we should be taking. Defining and implementing the behavior belongs to the app; displaying the outcome of that behavior is the terminal's job.

With traditional terminal emulators, you always have to think: what happens if the cursor is repositioned to a given absolute location and data is overwritten there. What would happen here with the treeview feature? Would the absolute coordinate jump to a different logical place depending on whether the user has folded a section? Or would folding remain a view-only, and the terminal emulation always work on the unfolded layout?
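
One of the two options raised above, folding as a view-only layer, can be sketched as follows. This illustrates that single option under simple assumptions (whole rows are folded, a fold keeps its first row visible) and is not a design proposed in the thread; the function names are hypothetical.

```python
def visible_rows(total_rows: int, folded: list[tuple[int, int]]) -> list[int]:
    """Buffer rows that stay on screen; a (start, end) fold hides rows start+1..end."""
    hidden = set()
    for start, end in folded:
        hidden.update(range(start + 1, end + 1))
    return [row for row in range(total_rows) if row not in hidden]


def write_at(buffer: list[str], row: int, text: str) -> None:
    """Absolute addressing always targets the unfolded buffer row."""
    buffer[row] = text


buffer = ["$ make", "compiling", "  a.c", "  b.c", "done", "$ "]
folds = [(1, 3)]                    # the user collapsed "compiling"

write_at(buffer, 3, "  b.c [ok]")   # lands on a row that is currently hidden
print([buffer[row] for row in visible_rows(len(buffer), folds)])
# → ['$ make', 'compiling', 'done', '$ ']
```

Under this model the application's coordinates never shift when the user folds something; the cost is that an application can overwrite text the user cannot currently see.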

Will terminal emulators and text viewers likely implement this feature? Will it work inside less? Inside screen or tmux?

I'm trying to compare this feature to explicit hyperlink support, regarding its estimated usefulness, cleanliness, amount of work required to support it, likely adoption etc. And it doesn't look good. Mind you, the explicit hyperlink feature isn't as popular as I deeply in my heart hoped it would be. less's author tracks it as a feature request, and my proof-of-concept demo no longer applies to the newest versions. tmux's author has rejected it for the time being. The explicit hyperlink feature, in my opinion, serves a clean goal, is useful in quite a few use cases, and there's nothing arbitrary in its user-facing side. Compared to that, from this foldable thingy I'd expect an even lower enthusiasm from people to adopt it, both from the terminal and from the application side, and I see a smaller number of use cases where an app could actually use this feature. Plus, this is combined with any decision being arbitrary, debatable, and if someone doesn't agree then they'll likely go with ncurses or a graphical app anyway, or propose further extensions to the already complicated and arbitrary terminal feature. I don't see the required efforts being worth it.

IMO let's not turn terminal emulators into more powerful UI toolkits.

What I believe could be a better approach, in multiple aspects, is to invent escape sequences that add semantics to the data stream. (This is I believe what e.g. shell integration in iTerm2 already does.) Such semantics could be the nested level. And then various terminal emulators could each experiment with ways of offering folding/unfolding, or quick jumping to the next/previous, or such, resulting in a healthy competition among terminals for the users' benefits.

About whether nesting is required or not: I'm still not sure. Let's assume that the default nesting level is the level of a single command (as e.g. seen in the top 5 lines of the DomTerm demo). It's a nice extension to tell the terminal emulator where a command's output starts and ends, and a terminal might make them foldable. But there could be a desire to have other indentation levels, both outside and inside. Inside a single command, one could define further indentation levels as seen in the DomTerm demo, or the output of any recursive command (like make, find, ls -R) could place such semantic marks. Outside, there could be levels for entire shells, ssh sessions etc.

It's unclear to me at this point whether escape sequences could define such levels in a safe, robust way, that is, allow all kinds of nesting combinations (e.g. java printNested inside ssh inside ssh inside zsh inside bash inside ssh...), in a way that recovers after an app crashes.

Tyriar commented 5 years ago

"What I believe could be a better approach, in multiple aspects, is to invent escape sequences that add semantics to the data stream."

+💯, this sums up my feelings from this discussion. I implemented "command tracking" in vscode but right now it's just guessing what a command is and is very hit or miss. The lack of this semantic information is imo the main blocking factor that prevents taking terminals to the next level.

PerBothner commented 5 years ago

"What I believe could be a better approach, in multiple aspects, is to invent escape sequences that add semantics to the data stream."

DomTerm has escape sequences to delimit prompt, input and error text. (Output is presumed to be the rest.) See the CSI 12u, 11u, 14u, 13u, 18u, 15u in the linked-to documentation.

There are escape sequences to delimit command groups, which may be nested. See the OSC 119, 120, 121 commands. See here for suggestions on how to use these escapes. The basic usage is that each prompt specifies a 119 escape to indicate the start of a new group, and implicitly the end of an existing group with the same id. To support nested command groups, specify a group-id. This could be a process id for the shell. Optionally send enter/exit group commands when you start/end a shell, but just using the 119 escape with group-id seems to handle nesting fairly well, without explicit enter/exit commands.
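
The group-id behaviour described above can be modelled with a small stack. This sketch covers only the semantics as stated in the comment (a start mark with a known id implicitly ends that group and anything nested inside it); it does not reproduce DomTerm's actual escape syntax, and `GroupTracker` is a hypothetical name.

```python
class GroupTracker:
    """Tracks nested command groups from 'start of group' marks alone."""

    def __init__(self):
        self.open = []   # stack of open group ids, outermost first
        self.log = []    # (event, group_id) pairs, for inspection

    def start_group(self, group_id: str) -> None:
        """Model of the mark each prompt sends, carrying e.g. the shell's pid."""
        if group_id in self.open:
            # Same id again: end that group and everything nested in it.
            while self.open:
                closed = self.open.pop()
                self.log.append(("end", closed))
                if closed == group_id:
                    break
        self.open.append(group_id)
        self.log.append(("start", group_id))


tracker = GroupTracker()
tracker.start_group("100")   # prompt of the outer shell
tracker.start_group("200")   # prompt of a nested shell
tracker.start_group("200")   # next prompt in the nested shell
tracker.start_group("100")   # nested shell is gone, outer prompt returns
print(tracker.open)
# → ['100']
```

Because the outer shell's next prompt closes whatever was left open inside it, a nested program that exits or crashes without sending an explicit exit mark does not leave a dangling group.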

It would be great if we could standardize these or similar commands.

sedwards2009 commented 5 years ago

Extraterm has some custom escape sequences too. I've designed the sequences with a degree of security paranoia in mind. Each sequence contains a cryptographically secure random "cookie" parameter. Each terminal tab in Extraterm requires a different cookie before it will accept the escape sequence. The cookie is available via an environment variable.

A remote application can use the escape sequences because it can read the cookie from the environment variable. But using cat to show the contents of a log file, for example, is still safe because it can't use or trigger the escape sequences. The log file contents won't have the cookie.

We trust applications, but we distrust any "data" which has found its way into the VT stream.
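
A minimal sketch of the cookie idea, under stated assumptions: the environment variable name `HYPOTHETICAL_TERM_COOKIE`, the `Tab` class and the way a sequence carries its cookie are all invented for illustration and are not Extraterm's actual implementation.

```python
import hmac
import os
import secrets


class Tab:
    """One terminal tab with its own secret cookie."""

    def __init__(self):
        self.cookie = secrets.token_hex(16)   # cryptographically secure random value
        self.accepted = []

    def child_environment(self) -> dict:
        """Environment for processes started in this tab."""
        env = dict(os.environ)
        env["HYPOTHETICAL_TERM_COOKIE"] = self.cookie
        return env

    def handle_sequence(self, cookie: str, command: str) -> bool:
        """Act on a custom sequence only if it carries this tab's cookie."""
        if not hmac.compare_digest(cookie.encode(), self.cookie.encode()):
            return False          # e.g. bytes that came from cat-ing a file
        self.accepted.append(command)
        return True


tab = Tab()
env = tab.child_environment()
# An application reads the cookie from its environment and includes it.
print(tab.handle_sequence(env["HYPOTHETICAL_TERM_COOKIE"], "frame-start"))
# → True
# File contents dumped to the terminal cannot know the cookie.
print(tab.handle_sequence("0" * 32, "frame-start"))
```

The design choice is that possession of the cookie, rather than the byte stream itself, is what distinguishes a trusted application from data that merely passed through the terminal.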

egmontkob commented 5 years ago

"with a degree of security paranoia in mind"

This sounds like pure and pointless paranoia to me. There are tons of other escape sequences that can occur in a text file and can mess with your terminal in various ways, e.g. trigger some responses as if they were typed from the keyboard, wipe the scrollback buffer, switch to weird character encoding, switch to invisible letters, invisible cursor and an unreadable color palette etc. These can all mess up the terminal, but are otherwise harmless. Incorrectly defining semantical blocks, e.g. the location of the prompt, isn't any more serious than these. (Some others, not implemented by many terminal emulators due to concerns, can even do more risky things like initiate a resize or move of the terminal window, set the clipboard contents etc.) It's a user error to cat (and not cat -v) an untrusted file and expect it never to mess up the terminal. In the mean time, if it's a log file, I'd argue that it's also an error in the producer of the log file if it can contain escape sequences; a log file shouldn't be untrusted data.

sedwards2009 commented 5 years ago

Yes, I am aware that there are heaps of ways of messing up a terminal and making it unusable. I'm not really worried about things getting "messed up". I'm concerned about more serious security problems which could creep in as I add more features and escape sequences.

"It's a user error"

Allowing data dumped straight to the terminal to have access to potentially dangerous escape sequences is stupid when it can be avoided. There is no up-side to allowing this. There is no use case here. Building traps for the user and then blaming them when it goes wrong by calling it "a user error", is a horrid approach to security.

Err on the side of security first, and not the other way round is my advice.

egmontkob commented 5 years ago

"There is no up-side to allowing this."

To make them work seamlessly across ssh, across su/sudo, inside detached and reattached screen/tmux. To make them recordable and replayable using script/scriptreplay. To make them redirectable to another terminal...

Security is important, but if it unconditionally trumps everything then I doubt you can end up with anything usable. You can't password-protect every command that, let's say, changes the color, or defines a foldable section, or whatever.

New escape sequences, new features should be added with care, and if in doubt with its security or privacy implications then rejected.

My advice is to stick to existing practice, treat security with common sense and not confuse it with paranoia.

Anyway, it's getting pretty off-topic...

PerBothner commented 5 years ago

@egmontkob: "While the demo is indeed cool, is it equally useful? Will users actually bother to click on those arrows to fold/unfold, will it save them a noticeable amount of time?"

Well, this feature appears to be common and used in debuggers, at least JavaScript debuggers. A goal for DomTerm is to be not just a solid terminal emulator, but also a toolkit for REPLs. Like some of the kinds of things people use Jupyter for: output with images, mathematics, rich text. I'd love to be able to seamlessly switch between that kind of "REPL toolkit" and an xterm-compatible terminal, and mix-and-match their features.

"I consider the solutions proposed so far for [foldable tree data structures] to be too invasive and complicated."

I disagree - but it may be too complicated and esoteric to attempt to standardize it at this point.

However, it seems worthwhile to standardize a way to delimit prompts, inputs, and commands (possibly with nesting), in a way that can be put in a prompt string. Some terminals add an indication in a gutter area. Other terminals could add appropriate actions, such as folding, to a context (right-click) menu when the mouse is in the prompt area.

jerch commented 5 years ago

Yeah, the semantic idea looks most promising to me. It would make it possible to serve a bunch of purposes at once, yet leave the "how" to the emulators, which could range from ignoring the data to presenting it in a super special fancy way. Also, competition in this field is good, as emulators might try different solutions. Btw, we already have some semantic escape sequences in OSC, like the title thing; I would also count egmont's URL proposal in this field. This might already give some hints on how to lay things out at the escape sequence level, though discussing that here might be beyond this thread.

About security (slightly off-topic): I don't see a higher security risk per se from a "semantic terminal", as it still works on top of the common Unix rights/privileges system and as long as it only deals with data representation. Problems might arise, though, from tricking the user into doing unwanted things (see our first and hopefully last CVE, or search for malicious escape sequences). For semantic additions that deal with hiding content (like folding), this might be a problem:

#> echo "Hi <hide>" && maliciousCommandB && echo "</hide> there!"

If <hide> is an OSC sequence, the casual terminal user can hardly tell anymore whether something fishy is going on. Are we entering this world of security issues with it? Imho we already have this problem; many webpages promote their super-duper command-line toolset via:

#> wget .... | bash
or really frightening:
#> wget ... | sudo bash

and no one really inspects that downloaded stuff beforehand, lol.

Edit: Btw the URL proposal has the same tricking peeps issue, but it does not harm anyone until the URL might get called (leaving the data representation only field). Thus a semantic addition that carries actions beyond the visual things might need extra security countermeasures.

egmontkob commented 5 years ago

> echo "Hi <hide>" && maliciousCommandB && echo "</hide> there!"

I don't think a semantical addition should be able to instruct the terminal to hide the contents. A semantical addition could inform the terminal: this is a prompt, this is the command entered after the prompt, etc. IMO they should all show up by default as before.

If the terminal offers a way to collapse a utility's output, and the user does so, there are still plenty of possibilities. Highlighting could copy the shown parts only. The emulator could auto-show a block if the contents within changes. Terminals could present a popup if the user copies hidden text. And so on. It's of course wise to think about these, and if we have a proposal, document the possible security issues we can foresee with various UI ideas that terminals might implement based on these sequences.
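
Two of the UI ideas above (copying only the shown parts, and auto-showing a block when its contents change) can be sketched as a small data model. This is a hypothetical illustration in Python, not how any existing terminal implements folding; `Block`, `visible_lines` and `copy_visible` are invented names, and a collapsed block is assumed to show only its first line.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Block:
    """One semantic block of output, e.g. a command line plus its output."""
    lines: List[str] = field(default_factory=list)
    collapsed: bool = False

    def append(self, line: str) -> None:
        # Auto-show: new content arriving in a collapsed block expands it,
        # so nothing can be written into a hidden region unnoticed.
        self.lines.append(line)
        self.collapsed = False

    def visible_lines(self) -> List[str]:
        # Assumption: a collapsed block shows only its first line.
        if self.collapsed:
            return self.lines[:1]
        return list(self.lines)


def copy_visible(blocks: List[Block]) -> str:
    """Text for the clipboard: only what is currently shown on screen."""
    return "\n".join(line for b in blocks for line in b.visible_lines())


if __name__ == "__main__":
    build = Block(["$ make", "cc a.c", "cc b.c"], collapsed=True)
    print(copy_visible([build, Block(["$ ls"])]))
```

The alternative mentioned above, a popup when hidden text would be copied, would replace `copy_visible` with a check for collapsed blocks inside the selection.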

wget .... | bash [...] no one really inspects that downloaded stuff beforehand, lol

Same goes for when you download or git clone a repository and compile/install/run it.

URL proposal [...] does not harm anyone until the URL might get called

In the comment section of that proposal I argue that they shouldn't even cause harm if the URL is clicked, but IMO let's not derail this thread, this should be discussed over there if there are any remaining concerns.

jerch commented 5 years ago

I don't think a semantical addition should be able to instruct the terminal to hide the contents

Yeah, it's orthogonal to the semantic thing, but it still belongs to the folding idea this thread was started with.

Should we move the semantic discussion to another thread?

albertz commented 5 years ago

Independent of whether this is just about semantics or specifically about folding: also keep in mind that there are both use cases where you want to hide the text by default (see e.g. Travis, or my better exchook use case), and ones where you want to show it by default (e.g. any command output). You could have two separate escape codes for the two cases.
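
As a rough sketch of the "two separate escape codes" idea, the Python snippet below splits a stream into regions that start hidden or start shown. The three marker strings are invented purely for illustration and are not taken from any terminal's documentation; nesting is deliberately not handled.

```python
import re
from typing import List, Tuple

# Made-up markers for illustration only; not a real or proposed sequence.
START_HIDDEN = "\x1b]9999;fold;hidden\x07"
START_SHOWN = "\x1b]9999;fold;shown\x07"
END = "\x1b]9999;fold;end\x07"

_STATE_AFTER = {START_HIDDEN: "hidden", START_SHOWN: "shown", END: "plain"}
_MARKER = re.compile("|".join(re.escape(m) for m in _STATE_AFTER))


def split_regions(text: str) -> List[Tuple[str, str]]:
    """Return (state, text) pairs; state is 'plain', 'hidden' or 'shown'.

    'hidden' and 'shown' are foldable regions that differ only in their
    default state. A start marker inside a region just switches state.
    """
    regions = []
    state = "plain"
    pos = 0
    for m in _MARKER.finditer(text):
        if m.start() > pos:
            regions.append((state, text[pos:m.start()]))
        state = _STATE_AFTER[m.group(0)]
        pos = m.end()
    if pos < len(text):
        regions.append((state, text[pos:]))
    return regions
```

A terminal that does not want folding at all could treat both foldable states as plain text, which keeps the default-shown behaviour discussed earlier in the thread.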

sedwards2009 commented 5 years ago

@egmontkob

Security

I think you are overestimating the impact of the scheme I described. I use this daily in Extraterm for a number of features and it works across sudo and ssh.

It won't work through screen/tmux without changes to those apps, but that applies to almost every new escape sequence we could come up with.

I grant you that most escape sequence additions are likely to be benign and not require this kind of security approach. But at the same time I've already got some extra sequences in my terminal project which I wouldn't feel remotely comfortable having available to untrusted data.

sedwards2009 commented 5 years ago

@PerBothner

REPLs... Output with images, mathematics, rich text

I'm also interested in these use cases. The approach I'm leaning towards is allowing applications to send data in common formats (jpg, png, svg, etc.) and displaying it in between VT output, effectively stopping the grid, inserting the image/whatever, and then creating a new empty grid with position (1,1) immediately under the outputted image.
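
The "stop the grid, insert the image, start a new grid" approach could be modelled roughly as a list of blocks. The Python sketch below is only an illustration of that description, not Extraterm's actual implementation; `Grid`, `Image` and `Session` are hypothetical names, and a grid is simplified to a list of text rows.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union


@dataclass
class Grid:
    """A character grid; the cursor starts at row 1, column 1 (1-based)."""
    rows: List[str] = field(default_factory=list)
    cursor: Tuple[int, int] = (1, 1)


@dataclass
class Image:
    mime_type: str
    data: bytes


class Session:
    def __init__(self) -> None:
        # Invariant: the last block is always a Grid that receives VT output.
        self.blocks: List[Union[Grid, Image]] = [Grid()]

    @property
    def active_grid(self) -> Grid:
        return self.blocks[-1]

    def write_line(self, text: str) -> None:
        grid = self.active_grid
        grid.rows.append(text)
        grid.cursor = (len(grid.rows) + 1, 1)

    def insert_image(self, mime_type: str, data: bytes) -> None:
        # The current grid stops here; the image goes below it, and a fresh
        # empty grid with the cursor at (1, 1) starts under the image.
        self.blocks.append(Image(mime_type, data))
        self.blocks.append(Grid())
```

One consequence of this layout is that cursor addressing after the image can no longer reach the rows above it, since those belong to an earlier grid.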

sedwards2009 commented 5 years ago

Should we move the semantic discussion to another thread?

Yes please. Some kind of general mechanism for associating data and metadata (URLs, file paths, hostnames, git branches, etc.) with things in the VT stream could be very useful. It would be a huge discussion too.
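
One existing example of this kind of association is the OSC 8 hyperlink sequence, which appears to be the URL proposal mentioned earlier in the thread: `ESC ] 8 ; params ; URI ST` opens a link and the same sequence with an empty URI closes it. The Python sketch below splits a stream into text runs with their attached URI. It is a simplified illustration; `extract_links` is a made-up helper, the `params` field is ignored, and other escape sequences in the stream are left untouched.

```python
import re
from typing import List, Optional, Tuple

# OSC 8 ; params ; URI, terminated by BEL or by ESC \ (ST).
_OSC8 = re.compile(r"\x1b\]8;([^;\x07\x1b]*);([^\x07\x1b]*)(?:\x07|\x1b\\)")


def extract_links(stream: str) -> List[Tuple[str, Optional[str]]]:
    """Split a stream into (text, uri) runs; uri is None outside a link."""
    runs = []
    uri = None
    pos = 0
    for m in _OSC8.finditer(stream):
        if m.start() > pos:
            runs.append((stream[pos:m.start()], uri))
        uri = m.group(2) or None   # an empty URI closes the current link
        pos = m.end()
    if pos < len(stream):
        runs.append((stream[pos:], uri))
    return runs


if __name__ == "__main__":
    demo = "see \x1b]8;;https://example.com\x1b\\docs\x1b]8;;\x1b\\ now"
    for text, link in extract_links(demo):
        print(repr(text), link)
```

A more general metadata mechanism would presumably carry a key alongside the value (file path, hostname, branch) instead of being limited to URIs.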

egmontkob commented 5 years ago

@albertz

where you want to hide the text by default

I'd argue that this goes beyond the scope of defining semantics into defining the desired behavior, and as per my previous comments, I'd rather see this responsibility and functionality remaining at custom applications and not being pushed to terminal emulators.

PerBothner commented 5 years ago

@egmontkob : "I'd rather see [data structure folding] remaining at custom applications and not being pushed to terminal emulators."

A custom application running in a terminal using an ncurses-style library can't fold text except within the visible (non-scrolled) part of the output. I don't believe there are any xterm escape sequences to manage scrolling or navigate above the "home" line. Of course an application can repaint scrolled-out output, but it's not integrated with the scrollbar. It might be interesting to design a protocol to deal with scrollbars and large scroll regions, but I haven't seen anything like that.

@egmontkob : "The terminal emulator is not a graphical UI toolkit, not something that offers widgets like a treeview, and IMO it shouldn't. This is the wrong level to solve this issue, the right level would be a dedicated application using ncurses or whatever similar (or a graphical app)."

One way to use DomTerm is as a high-level GUI toolkit for writing rich REPL consoles. You call the toolkit using escape sequences rather than procedure calls. This is much easier and more powerful than using an ncurses-style library (if you're implementing a rich REPL console, not necessarily for other things you might use ncurses for). Plus you have the same programming interface for rich and plain terminals, just with a downgraded UI for the latter.