PerBothner opened this issue 1 year ago
Unfortunately that's not possible atm and also not easily fixable, as there are several conceptual issues to overcome.
To quickly draft a possible solution:

`onSerialize((bufferLine: number) => [xStart, xEnd, content][])`

(well, something like that) - the serialize addon would then only call `onSerialize` listeners, ideally its own FG/BG/attribute content generator is also only a listener here.
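Spelled out a bit more in TypeScript - purely an illustration of the draft above, none of these types exist in the serialize addon:

```ts
// Hypothetical extension point: other addons register a callback that, for a
// given buffer line, returns the cell ranges they want to contribute output
// for. The serialize addon would walk all lines and merge the results.
type LineContribution = [xStart: number, xEnd: number, content: string];
type SerializeListener = (bufferLine: number) => LineContribution[];

interface ISerializeHost {
  onSerialize(listener: SerializeListener): void;
}
```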
I am not very fond of the current state of the serialize addon, and not sure if such an extension is wanted (I argued about it in the past, but kinda gave up). Also the proper image encoding protocol remains unclear atm.
@PerBothner Drafted a quick demo PR to test the idea above (see #48). Ofc this cannot be used until the serialize addon sees major changes. Also the image re-encoding is very rough atm (using IIP for every single cell, which is not how it should be done).
There is prolly a simpler solution - the image addon currently always draws on top of FG/BG content without erasing it. Thus it should be possible to keep most of the serialize addon as it is, but invoke a full-line image serializer for every line and just append its sequence to the current line. This has several advantages.
Will see if that works better...
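A rough sketch of that per-line idea (the two serializer callbacks are stand-ins, not the actual addon APIs):

```ts
// Per-line serialization: emit the normal text/attribute output for a row,
// then append whatever image sequence (e.g. an IIP sequence) covers that row.
type LineSerializer = (row: number) => string;

function serializeBuffer(rows: number, text: LineSerializer, images: LineSerializer): string {
  let out = '';
  for (let row = 0; row < rows; row++) {
    out += text(row) + images(row) + '\r\n';
  }
  return out;
}
```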
Update: I've implemented the line-based serialization with a23a3725fa16368a4e59d7cd9e0ac3e6b5fd75be, which works much better and is also reasonably fast.
Addendum regarding a proper protocol:
It turns out that IIP has all the basics to guarantee a stable cell coverage, if dimensions are set with plain numbers denoting row/col extent, e.g. width=80;height=1 for a single terminal line of 80 cols. It still needs the overprinting patch of 64244d8 to level out float rounding issues; in fact the floor/ceil logic must be applied all over the place to avoid stitching/striping artefacts. Furthermore, early-ending images need a proper cell tiling patch to fully cover a cell, to keep the aspect ratio stable when read back in (commit be72b06). This might be some food for a better protocol; most likely we need to be able to address fractions of a cell, as already suggested by @egmontkob years ago.
Nonetheless IIP with col/row metrics works well enough atm to get a working solution for serialization running. :smile_cat:
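For illustration, a single serialized line strip wrapped into an IIP sequence with cell-based dimensions could look roughly like this (a sketch only - payload handling and terminator details are simplified):

```ts
// Build an IIP (iTerm2 inline image) sequence for one terminal line:
// width is given in columns, height in rows, payload is base64 image data.
function iipForLine(payloadBase64: string, cols: number): string {
  return `\x1b]1337;File=inline=1;width=${cols};height=1;preserveAspectRatio=0:${payloadBase64}\x07`;
}
```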
Edit: The line serialization is basically the same issue any multiplexer would have when updating screen parts / certain cells - to keep the image behind "intact" and to avoid uploading data over and over, we are also going to need the "upload to image ref + draw from image ref" scheme as already discussed in several places.
The QOI image format might be a good choice for image serialization, as it provides very fast encoding/decoding with an okish compression rate.
Some early numbers testing QOI decoding speed with the sixel-bench test data:
```
qoi:   2.72s (48.06 MB/s, 251.76 fps)
png:   7.56s (18.88 MB/s,  90.62 fps)
jpg:   2.25s ( 7.18 MB/s, 304.94 fps)
sixel: 3.03s (39.16 MB/s, 226.2 fps)
```
(Note: The JPEG numbers are flawed, as I did a simple 75% conversion, which reduces the data from ~130 MB to ~17 MB for JPEG.)
I did not test the encoding path yet, but I expect it to be much faster than any PNG/JPEG encoding.
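For context, part of why QOI encodes so cheaply is its trivial container format; the fixed 14-byte header per the published QOI spec looks like this (just a sketch, not the addon's encoder):

```ts
// QOI header: magic "qoif", big-endian width/height, channel count
// (3 = RGB, 4 = RGBA) and a colorspace byte (0 = sRGB with linear alpha).
function qoiHeader(width: number, height: number, channels: 3 | 4): Uint8Array {
  const buf = new Uint8Array(14);
  const view = new DataView(buf.buffer);
  buf.set([0x71, 0x6f, 0x69, 0x66]);   // "qoif"
  view.setUint32(4, width, false);      // false = big-endian
  view.setUint32(8, height, false);
  buf[12] = channels;
  buf[13] = 0;                          // sRGB with linear alpha
  return buf;
}
```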
A simple way to serialize a canvas is to use `canvas.toDataURL()`. That seems simple, portable, and reasonably efficient. To deserialize, you can create a new `Image`, set the `src` attribute, and optionally `drawImage` into the canvas. There may be a way to avoid creating a temporary `Image` object.
You still have to decide on an escape sequence to wrap the data url in.
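A minimal sketch of that roundtrip with plain browser APIs (the canvas names are placeholders):

```ts
// Serialize: canvas -> data: URL (PNG-encoded by default).
function serializeCanvas(source: HTMLCanvasElement): string {
  return source.toDataURL();
}

// Deserialize: data: URL -> temporary Image -> draw back into a canvas.
function restoreCanvas(dataUrl: string, target: HTMLCanvasElement): void {
  const img = new Image();
  img.onload = () => target.getContext('2d')?.drawImage(img, 0, 0);
  img.src = dataUrl;
}
```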
@PerBothner Well, that's what my playground branch does atm: https://github.com/jerch/xterm-addon-image/blob/a23a3725fa16368a4e59d7cd9e0ac3e6b5fd75be/src/ImageAddon.ts#L322-L329 and the encoding speed is underwhelming - it takes ~70 ms for an image of 420x420 pixels. To me this does not seem to be a viable option to serialize a longer scrollbuffer with possibly multiple images in it, as it will add up to seconds of blocking time pretty fast.
That's where the promise of QOI comes in - it encodes slightly worse than PNG, but at 30-50x the speed, which would easily reduce the 70 ms runtime to well under 10 ms.
NB: The high runtime is partially an artifact of the line-wise handling; a single `toDataURL()` on the whole image takes only ~28 ms. To fix both, prolly QOI + ref painting might be needed. But as said above, I still have to see whether QOI can hold its promise when written in wasm. Furthermore it needs a base64 encoder + string conversion on top, thus I think we might not see the full 30-50x speedup in the end, maybe around 5-10x.
Update:
A quick test with a hacky, not yet optimized QOI encoder finishes the same image in ~10 ms, thus runs in a third of the time compared to `toDataURL()` with PNG and its 28-32 ms. The decrease is much less than I expected, and it turns out that the main limiting factor is the transfer of data back from the GPU. This becomes clear when looking at the profiler data.
Also tried to avoid `getImageData` and read pixel values back from the webgl texture, which was slightly better (~4x faster).
Still a way to go...
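For reference, the webgl readback path mentioned above boils down to a `readPixels` call - a sketch, not the actual addon code:

```ts
// Read RGBA pixels straight from a WebGL framebuffer instead of going through
// a 2D canvas + getImageData. Rows come back bottom-up and still need flipping
// before encoding.
function readBack(gl: WebGL2RenderingContext, width: number, height: number): Uint8Array {
  const pixels = new Uint8Array(width * height * 4);
  gl.readPixels(0, 0, width, height, gl.RGBA, gl.UNSIGNED_BYTE, pixels);
  return pixels;
}
```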
Haha - stumbled over the next hassle - fast bytearray to string conversion. I really don't get why JS has no fast `Uint8Array.toByteString` method, that's literally a malloc + memcpy away for the engine. But nope, we have to use slow JS looping with `String.fromCharCode` calls :see_no_evil:
Well, I did not investigate much yet beside testing various concat tricks; this is what I ended up with for now:
```ts
const data = ... // uint8 array from encoder
let p = 0;
let s = '';
while (p < data.length) {
  s += String.fromCharCode.apply(null, data.subarray(p, p += 65536) as any);
}
```
Also `TextDecoder` seems to be on par with this, but I am not yet sure whether it will run into issues for certain charcodes. Needs some more testing...
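For completeness, the `TextDecoder` variant looks like this - note that the `'latin1'` label resolves to windows-1252 in browsers, so bytes in the 0x80-0x9F range do not map 1:1 to the same code units, which is likely the kind of charcode issue to watch out for:

```ts
// Byte-to-string conversion via TextDecoder. Caveat: 'latin1' is an alias for
// windows-1252, so the 0x80-0x9F range gets remapped to other code units.
const latin1 = new TextDecoder('latin1');
function bytesToString(data: Uint8Array): string {
  return latin1.decode(data);
}
```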
Some interim results: `getImageData` gets much faster with a software canvas (`willReadFrequently` setting), and is much more reliable and stable in timings than any webgl hacks. It is also not hardware dependent. Clearly the better choice.

@PerBothner Imho QOI is the right decision, as it is able to serialize things 2.5 - 5x faster than with browser methods (see full roundtrip numbers and explanation here: https://github.com/jerch/xterm-addon-image/pull/48#issuecomment-1484145335).
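Regarding the software canvas mentioned above - that is just a hint at 2D context creation, e.g.:

```ts
// Ask the browser to keep the canvas backing store CPU-side, which makes
// repeated getImageData() calls much cheaper than GPU readbacks.
const canvas = document.createElement('canvas');
const ctx = canvas.getContext('2d', { willReadFrequently: true })!;
const pixels = ctx.getImageData(0, 0, canvas.width, canvas.height);
```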
@PerBothner Hmm, currently the serialize plugin is pretty slow - it takes ~170 ms to serialize a 10k scrollbuffer of 87 cols (tested with `ls -lR /usr/` generating ~500 kB of string data). That's ~8 times slower than a wasm version with almost the same functionality (which takes ~20 ms).
Therefore I might ditch the idea of extending the serialize addon; instead a rewrite seems more appropriate to me. It also makes it easier to rethink the serialization problem and come up with a proper interface that custom extensions can hook into.
Does it really matter if serialization is a bit slow? How often do you need to do it? Can you think of any use case for a terminal where even half a second would be a problem? Of course an extensible interface is valuable even if performance isn't the main motivator.
Where I think speed might be more valuable is detecting grapheme clusters, which can be combined with detecting wide characters - see my library. Though detecting clusters while rendering seems sub-optimal - perhaps cluster (and wide character) detection should be done lazily, the first time a line is rendered. Anyway, this is a whole different set of issues, which I have given some thought to and have experience with.
Well, the issue with the serialize addon is simple - it is convoluted for no good reason, so I don't like to alter things there and introduce an interface on top. I would basically end up rewriting it anyway, so a clean start makes things a lot easier. The perf bonus is just the brick to get me doing it.
Regarding grapheme clusters - I had a first attempt on those in xterm.js back in 2019, but it did not make it into the release due to code bloat (from storing codepoint attributes) and big runtime issues (it was multitudes slower). Imho it is not yet worth adding to xterm.js with its current input data route. In the long term I plan to replace critical hot paths with wasm implementations, but I don't know yet when, and if those ever will make it back into xterm.js. Thus I am more focused on doing things on the addon side.
For grapheme clusters, why do you need to store codepoint attributes, assuming you're using either dom or canvas renderers? The browser takes care of that. You may need to know cluster boundaries but you can detect those at the same time you're detecting double-width characters, using an efficient trie lookup (which would probably be a good candidate for wasm).
"If those ever will make it back into xterm.js" If I get to the point of switching DomTerm to use the xterm.js engine by default, I'm assuming it will have to be a fork, since the xterm.js maintainers don't seem to be interested in rich data or variable-width text.
> For grapheme clusters, why do you need to store codepoint attributes, assuming you're using either dom or canvas renderers? The browser takes care of that. You may need to know cluster boundaries but you can detect those at the same time you're detecting double-width characters, using an efficient trie lookup (which would probably be a good candidate for wasm).
Yes, I meant the lookup data for the unicode codepoint range used to evaluate incoming chars. I don't quite remember what I used back then, but it was quite big (I think it was a condensed bitmap for fast lookup). So no, I was not storing additional attributes in the buffer, it was just the segmentation logic that needed additional lookup data.
Btw I think other TEs implementing grapheme support put the segmentation burden early, right after the utf-8 decoding and before doing `wcwidth`, thus they keep things/graphemes cell-aligned. Technically that's a `grapheme_wcwidth(Grapheme*)` in the end, and imho the only way to keep things within that nasty grid metaphor.
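A toy sketch of that ordering in TypeScript, using `Intl.Segmenter` for the grapheme split and a dummy width lookup (a real implementation would consult a full Unicode width table or trie):

```ts
// Placeholder codepoint width lookup - real code would use a Unicode table;
// here only Hangul Jamo leading consonants count as wide, for illustration.
const wcwidth = (cp: number): number => (cp >= 0x1100 && cp <= 0x115f ? 2 : 1);

// Cluster width heuristic: wide if any constituent codepoint is wide.
function graphemeWcwidth(grapheme: string): number {
  let width = 1;
  for (const ch of grapheme) {
    width = Math.max(width, wcwidth(ch.codePointAt(0)!));
  }
  return width;
}

// Segment first, then assign widths - keeping graphemes cell-aligned.
function segmentLine(line: string): { grapheme: string; width: number }[] {
  const seg = new Intl.Segmenter(undefined, { granularity: 'grapheme' });
  return [...seg.segment(line)].map(s => ({ grapheme: s.segment, width: graphemeWcwidth(s.segment) }));
}
```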
The impl impact is quite remarkable though - unicode does not limit graphemes in the x direction, so it is theoretically possible to end up with cells wider than 2 half-widths. That's what most TEs will have issues with, and so does xterm.js in its renderer logic. This gets really horrible for heavily stacking language systems with tons of subscripted perceivable characters (Hebrew, Arabic, Hindi languages etc.), where widths may even be fractions of a cell. But unicode did not bother with formalizing that at all; they left it on purpose to font developers to get the best visual output. That's really a nightmare for proper grapheme handling on a grid system, and I have not seen anyone fully address this. So all TEs are currently broken in their unicode support. (And last but not least - there is still the RTL issue on top...)
Anyway, it's kinda off-topic here.
Off-topic - about grapheme clusters: The table for codepoint attributes used in DomTerm is a Uint32Array of length 12764, initialized from a 4032-character atob string. I consider that acceptable.
Basically DomTerm treats a cluster as wide if any of the constituents is wide, or it sees an emoji_presentation_selector or a regional_indicator pair.
> This gets really horrible for heavily stacking language systems with tons of subscripted perceivable characters
That is why terminals need to figure out how to handle variable-width fonts. On my roadmap is a wrapper for input editors (such as GNU readline) where the terminal uses a variable-width font (layout handled by the browser engine), but it pretends to the application that it's all regular fixed-width columns. This would be enabled by some extra shell integration flags without applications having to be otherwise modified. Applications that know they are running under DomTerm can request that a variable-width font be used for output.
I got xterm.js + xterm-addon-image working under DomTerm. However, it doesn't appear that a sixel-produced image is saved by the serialize addon.
Specifically, I printed some images (using lsix), and then did a "detach session". When I then did "attach" on the session, the images were gone.
Serialization is important not only for detach+attach, but also when dragging a window to a new top-level window - this also uses serialization on most platforms.
In "regular" DomTerm (not based on xterm.js), an image is represented as a
<img>
element with adata:
URL. When a buffer is serialized as HTML, this automatically handles such images. (FWIW, I'm actively working on improving the sixel handling in regular DomTerm.)
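Roughly, that representation amounts to the following (a sketch; DomTerm's actual serializer is of course more involved):

```ts
// Keep a decoded image as an <img> with a data: URL, so that serializing the
// surrounding buffer DOM as HTML carries the pixel data along for free.
function embedImage(container: HTMLElement, canvas: HTMLCanvasElement): void {
  const img = document.createElement('img');
  img.src = canvas.toDataURL();   // data:image/png;base64,...
  container.appendChild(img);
  // container.outerHTML now includes the image payload
}
```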