buildkite / terminal-to-html

Converts arbitrary shell output (with ANSI) into beautifully rendered HTML
http://buildkite.github.io/terminal-to-html
MIT License

Add iTerm-style hyperlinks #163

Open DrJosh9000 opened 3 days ago

Fixes #58 and #116 by adding support for OSC 8 sequences.

Background

terminal-to-html emulates a terminal screen consisting of screenLines, each of which holds a sequence of nodes. Each node contains the rune at that location and its style. terminal-to-html also has an element model, where some nodes can be elements instead of runes. This was used to implement iTerm-style inline images, and a different link sequence (OSC 1339 ...) where the URL and link content are specified with key=value pairs inside a single escape sequence.

OSC 8 links work a bit differently to OSC 1339 links. OSC 8 links function a bit more like anchor tags in HTML: ESC ]8;;url ST starts a link like <a href="url">, and ESC ]8;; ST ends a link like </a>. This suggests an implementation where we directly translate OSC 8 sequences to HTML.
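To make the open/close pairing concrete, here is a minimal sketch that emits an OSC 8 link to a terminal. The helper name osc8Link and the example URL are made up for illustration; the sketch uses ESC \ as the string terminator (ST).

```go
package main

import "fmt"

const (
	esc = "\x1b"
	st  = esc + `\` // String Terminator, written as ESC \
)

// osc8Link (hypothetical helper) wraps text in an OSC 8 open/close pair.
// The opening sequence carries the URL; the closing sequence has an empty URL.
func osc8Link(url, text string) string {
	open := esc + "]8;;" + url + st // behaves like <a href="url">
	end := esc + "]8;;" + st        // behaves like </a>
	return open + text + end
}

func main() {
	// In a terminal that supports OSC 8, "Buildkite" is shown as a link.
	fmt.Println("See " + osc8Link("https://buildkite.com", "Buildkite") + " for details.")
}
```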

Unfortunately, converting OSC 8 sequences directly to <a> ... </a> doesn't map well onto our model.

Naïvely, if we store an <a> opening tag as an element, it would occupy an entire node, and so would the closing tag. Each link would then be padded by one extra cell on either side.

Nodes could be made capable of storing both elements and runes. This would negate some of the work I did on memory usage, but that cost is probably acceptable after the streaming work.

Either way introduces some interesting edge cases with some of the other escape sequences, particularly cursor movement. If an <a> tag is opened on one line and the cursor is moved up one line before the link is closed, then the </a> would appear in the output before the <a>! If we want to prohibit cursor movement within OSC 8 pairs, then we would need another parser mode, and would have to expand the parser so that it can return to a previous mode (e.g. with a stack) rather than directly to "normal".

Research

What does iTerm2 do? Fortunately we don't need to dig around in its source; we can just cat a test file in an iTerm2 window and find out.

[Screenshot, 2024-07-04 1:45:40 PM]

So it appears iTerm2 treats OSC 8 like a style: ESC ]8;;url ST starts painting the following text as a hyperlink to the URL, and ESC ]8;; ST stops painting the text that way. The style then follows the text wherever it is written.

Implementation

Firstly, along with the cursor position and current style, the screen now also stores the "current URL" as urlBrush (like a "link paintbrush"). If it's not empty, that is the URL of the link that will be applied to the text being written, and if it's empty then the text being written won't be linked.

We don't have space in the style uint32 to store a URL, or even an index into a slice of URLs (with 7 bits left, we could index 128, which is probably plenty, but is an arbitrary-seeming limitation). But most text won't be a link. This suggests storing URLs for links in a sparse data structure based on the coordinate of the node. A map fits the bill.
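A rough sketch of the "paintbrush plus sparse map" idea follows. Only urlBrush is a name from this PR; the coord type, the screen fields, and the write method are simplified stand-ins invented for illustration, not the actual types in terminal-to-html.

```go
package main

import "fmt"

// coord is a hypothetical map key: the line and column of a node.
type coord struct{ y, x int }

// screen is a simplified stand-in for the real screen type.
type screen struct {
	x, y     int              // cursor position
	urlBrush string           // current link URL; "" means "not a link"
	lines    [][]rune         // the runes on each line
	urls     map[coord]string // sparse: only linked nodes have entries
}

// write puts r at the cursor, paints it with urlBrush, and advances.
func (s *screen) write(r rune) {
	for len(s.lines) <= s.y {
		s.lines = append(s.lines, nil)
	}
	line := s.lines[s.y]
	for len(line) <= s.x {
		line = append(line, ' ')
	}
	line[s.x] = r
	s.lines[s.y] = line

	c := coord{s.y, s.x}
	if s.urlBrush != "" {
		if s.urls == nil {
			s.urls = make(map[coord]string)
		}
		s.urls[c] = s.urlBrush
	} else {
		// Overwriting linked text with unlinked text clears the link.
		// delete on a nil map is a no-op.
		delete(s.urls, c)
	}
	s.x++
}

func main() {
	s := &screen{}
	s.urlBrush = "https://example.com"
	s.write('a')
	s.urlBrush = ""
	s.write('b')
	fmt.Println(string(s.lines[0]), len(s.urls))
}
```

Because the link is keyed by position rather than stored as a tag, cursor movement between the opening and closing sequences needs no special handling in this sketch: whatever is written while the brush is set is linked, wherever it lands.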

The most complex part is now generating <a>...</a> pairs while also generating style <span>...</span> pairs.

We can skip doing a lot of map lookups by using a single style bit to indicate the presence of a link. Since most lines won't have any links, this could be a premature optimisation, but I think it makes some of the code nicer.
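As an illustration only, the link-bit check might look something like the sketch below. The linkBit position, the node layout, the map key type, and renderLine are all assumptions made for this example, and the sketch ignores style <span> generation entirely, which is where the real complexity lies.

```go
package main

import (
	"fmt"
	"html"
	"strings"
)

// linkBit is a hypothetical style bit meaning "this node has a URL in the map".
const linkBit uint32 = 1 << 31

// node is a simplified stand-in: a rune plus its packed style.
type node struct {
	r     rune
	style uint32
}

// renderLine emits the runes of line y, wrapping runs of nodes that share a
// URL in a single <a>...</a> pair. The map is only consulted when linkBit is set.
func renderLine(y int, nodes []node, urls map[[2]int]string) string {
	var b strings.Builder
	open := "" // URL of the currently open <a>, or "" if none
	for x, n := range nodes {
		url := ""
		if n.style&linkBit != 0 {
			url = urls[[2]int{y, x}]
		}
		if url != open {
			if open != "" {
				b.WriteString("</a>")
			}
			if url != "" {
				b.WriteString(`<a href="` + html.EscapeString(url) + `">`)
			}
			open = url
		}
		b.WriteString(html.EscapeString(string(n.r)))
	}
	if open != "" {
		b.WriteString("</a>")
	}
	return b.String()
}

func main() {
	nodes := []node{{'a', 0}, {'b', linkBit}, {'c', linkBit}, {'d', 0}}
	urls := map[[2]int]string{{0, 1}: "https://example.com", {0, 2}: "https://example.com"}
	fmt.Println(renderLine(0, nodes, urls))
}
```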