Open AnonymouX47 opened 4 months ago
First regarding your side question:
Are you aware of any means for a program to detect that it's running within xterm.js (and maybe its version)?
Sadly no, I don't. I tried to address the issue several times in the xterm.js repo, but it never led to a working solution. The main issue is that xterm.js is a TE lib that ppl build their own TE with. The lib is also highly customizable in its exposed features, up to additions from addons, so xterm.js != xterm.js, and there are real changes in VT support between versions. New attempt: https://github.com/xtermjs/xterm.js/issues/4982
About the cursor positioning with IIP: I did that to uniformly handle the cursor for different image sequences. In the beginning the addon supported all types of cursor modes (from xterm, mlterm etc), but after a detailed discussion with @j4james and @hackerb9 it became clear to me how flawed the whole situation is and that the left-bottom corner of a graphics printout is the only determined point across all formats, so I removed the madness altogether. Or to say it differently - yes I deviate here intentionally from what iTerm implements for its sequence, but since it's not documented behavior, it cannot be expected as such.
First I think a uniform behavior is better than having 10 different cursor modes working only for sequence XY, thus I am reluctant to change to "your expectation" here. Same goes for the wezterm addition, it is just yet another cursor mode working only with IIP.
If we really want to solve the cursor positioning after graphics output once and for all, I'd say we need a sane default (which is VT340 mode - left-bottom edge), and maybe introduce a terminal mode selecting one of the other graphics corners, where applicable, e.g. 0 - bottom-left (default, always for sixel level 1), 1 - top-left (same as wezterm addition), 2 - top-right, 3 - bottom-right (IIP in iTerm).
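To make the corner idea concrete, here is a minimal Python sketch of the cell arithmetic. The function name and the 0-based (row, col) convention are hypothetical, the mode numbers simply follow the suggestion above, and placing the right-side corners one column past the image is an assumption rather than anything specified.

```python
def cursor_after_image(row, col, width, height, mode=0):
    """Return the (row, col) cursor cell after drawing an image.

    row, col      -- 0-based cell of the image's top-left corner
    width, height -- image size in cells
    mode          -- 0 bottom-left (default), 1 top-left,
                     2 top-right, 3 bottom-right
    """
    if mode not in (0, 1, 2, 3):
        raise ValueError(f"unknown corner mode: {mode}")
    bottom = row + height - 1  # last row touched by the image
    right = col + width        # assumption: just past the image's last column
    new_row = bottom if mode in (0, 3) else row
    new_col = right if mode in (2, 3) else col
    return new_row, new_col
```

With a 4x3 cell image drawn at row 2, column 5, mode 0 leaves the cursor at the start column on the image's last row, while mode 3 leaves it just right of the bottom-right cell.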
First regarding your side question: ...
Hmm... True. I'll probably chime in to the new discussion if I have any ideas to contribute. Thanks
About the cursor positioning with IIP: ...
Hmm... interesting. :thinking:
First of all, were you, j4james and hackerb9 discussing sixels specifically or terminal graphics in general?
I sincerely don't think introducing a new sequence is a viable solution as it'll only end up widening the problem we're trying to solve. Yes, uniformity across all protocols is desirable but we both know it'll never happen for various reasons. I think uniformity across implementations of the same protocol is what matters and it's really sufficient for any application i.e I know I'm emitting X sequence and Y is the expected cursor position/placement after graphics is drawn.
One graphics protocol/sequence really doesn't/shouldn't affect the other. On that note, I also wanted to mention earlier the fact that DEC{SET,RST} 80 (sixel scrolling) affects IIP but just decided to focus on the main thing.
Yes the discussion was about sixel, as that was the only widespread graphics sequence, where ppl already had implemented tons of alternative cursor mode ideas. Which are all faulty for sixel in particular, only bottom-left corner works here reliably.
Yes, uniformity across all protocols is desirable but we both know it'll never happen for various reasons.
Well at least xterm has changed its cursor positioning to vt340 mode after it became clear that the old mode had serious flaws. Graphics output in terminals is still very alpha in many regards, so I think most TE maintainers are willing to change it if there are good reasons to do so (minus kitty). IIP does not spec its cursor handling, so in the sense of uniformity between graphics sequences the lowest common denominator would be the vt340 mode here as well. I think this would just have to be discussed with @gnachman and @wez (and all other TEs implementing IIP) to see whether they would agree on a more uniform handling.
About a possible corner marking sequence for cursor positioning: I dont think that such a sequence is really needed, my suggestion is just to make life easier for app devs and not to repeat subspeccing it on sequence XY. The idea is pretty simple - place cursor after a rectangular graphics sequence at the selected corner, again defaulting to vt340. Implemented as terminal mode it is not important anymore, which transport sequence was used for the image/graphics.
About sixel scrolling: Thats implemented as a terminal mode, thus applies to any graphics sequence.
On a sidenote Imho IIP is a much better graphics protocol than sixel, I really dont get why ppl insist on using sixel (besides for historical reasons).
Imho IIP is a much better graphics protocol than sixel, I really dont get why ppl insist on using sixel (besides for historical reasons).
Interesting, what is exactly IIP ? And is it available on all os distributions as a portable (and signed ...) application and also xterm.js ?
windows : mintty provides sixel. linux : mlterm provides sixel. web: xterm.js provides sixel. osx/ios: safari + xterm.js of course !
@pmp-p
Interesting, what is exactly IIP ? And is it available on all os distributions as a portable (and signed ...) application and also xterm.js ?
IIP is "Inline Images Protocol" as invented by iTerm2. You dont need any application or special lib for that, it basically consists of a sequence with base64-encoded PNG/JPEG payload.
Docs: https://iterm2.com/documentation-images.html Example in Bash: https://iterm2.com/utilities/imgcat
Currently not many TEs support it; most opted for sixel instead. Which is a pity, since it has much better quality and can be handled by any semi-experienced developer with stdlibs.
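As a rough illustration of the "sequence with base64 payload" point, here is a minimal Python sketch that builds an IIP sequence in the `ESC ] 1337 ; File = args : payload BEL` form described in the iTerm2 docs linked above. The helper name is hypothetical, and extra keyword options are passed through unchecked, so which keys a given TE honours is left to the caller.

```python
import base64


def iip_sequence(data: bytes, name: str = "", **options) -> str:
    """Build an iTerm2 Inline Images Protocol sequence for raw image bytes.

    data    -- the encoded image file contents (e.g. PNG or JPEG bytes)
    name    -- optional file name; the protocol expects it base64-encoded
    options -- extra key=value arguments (e.g. width="20"), passed through as-is
    """
    args = {"inline": 1, "size": len(data)}
    if name:
        args["name"] = base64.b64encode(name.encode("utf-8")).decode("ascii")
    args.update(options)
    arg_str = ";".join(f"{key}={value}" for key, value in args.items())
    payload = base64.b64encode(data).decode("ascii")
    # OSC 1337 ; File=<args> : <base64 payload> BEL
    return f"\x1b]1337;File={arg_str}:{payload}\x07"
```

Writing the returned string to stdout (and flushing) in a TE that supports IIP should display the image; no image library is needed on the application side as long as the file is already in a format the TE can decode.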
@jerch
Without a doubt, I agree the bottom-left cursor placement is the most reasonable, straightforward and reliable for various reasons (I think documenting these reasons will be necessary for a pitch). Even if a protocol will have alternate modes, this should be the default IMO. My concern is mainly about the amount of time and effort it'll take to make this go round (assuming other TE maintainers would be willing to change), and just like you, there's the case of those for which I have low hopes.
On the happier side, changing the cursor placement for IIP to the bottom-left cell shouldn't have many adverse effects on applications, as I'm not aware of any project other than mine (and maybe Jexer, which has been archived for a while) that depends on the cursor's horizontal position after drawing graphics.
Like you said, I don't think the corner marking sequence is needed.
About sixel scrolling: Thats implemented as a terminal mode, thus applies to any graphics sequence.
I see.
On a sidenote Imho IIP is a much better graphics protocol than sixel, I really dont get why ppl insist on using sixel (besides for historical reasons)
Same thoughts...
Without a doubt, IIP is far easier, both for TE and app devs, and provides a richer feature set. Though, I guess some factors would be:
XTerm implemented sixels a long time ago and many TEs have followed in its steps.
A higher percentage of TE users are only aware of sixels (also kinda due to the previous reason) and that's mostly what gets mentioned in feature requests.
IIP would require external dependencies for decoding various image formats, unlike sixels.
Side note:
IIP is "Inline Images Protocol" as invented by iTerm2. You dont need any application or special lib for that, it basically consists of a sequence with base64-encoded PNG/JPEG payload.
That reminds me, I recently tried out chafa's IIP output format and images weren't displayed. I chased down the possible causes and it boils down to the fact that chafa currently uses uncompressed TIFF format for IIP. So, it's either:
The iTerm2 doc actually states:
Any image format that macOS supports will display inline, including PDF, PICT, or any number of bitmap data formats (PNG, GIF, etc.).
thereby leaving supported formats essentially unspecified/open-ended.
I discussed with @hpjansson (on the chafa matrix chat) about this and his major reason was PNG encoding would require an external dependency as it's non-trivial, unlike uncompressed TIFF... but now he's willing to allow an external dependency for PNG encoding.
If you would like me to open a new issue for TIFF support, please let me know.
EDIT: https://github.com/GuardKenzie/chafa.py/issues/54 was actually what triggered my investigation.
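Since the doc leaves the supported formats open-ended, one way either side could tell what an IIP payload actually contains is to sniff the leading magic bytes after base64 decoding. A minimal Python sketch, with a hypothetical helper name and only a handful of formats covered:

```python
# Leading byte signatures for a few common image formats.
MAGIC = (
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"II*\x00", "tiff"),  # little-endian TIFF
    (b"MM\x00*", "tiff"),  # big-endian TIFF
)


def sniff_format(data: bytes):
    """Return a short format name for the decoded payload, or None if unknown."""
    for magic, name in MAGIC:
        if data.startswith(magic):
            return name
    return None
```

A TE that only decodes PNG/JPEG could use something like this to reject a TIFF payload explicitly instead of silently showing nothing.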
- IIP would require external dependencies for decoding various image formats, unlike sixels
Yeah that is indeed a downside up to security issue in foreign libs for more complicated formats. I am really a fan of QOI in this regard, such a small code size makes it almost impossible for bugs to slip in. And it is still reasonably fast with good enough compression, at least for local delivery.
about TIFF support: Yes I already figured that out several weeks ago, when I was chatting with @hpjansson. Browsers dont provide builtin TIFF support, so this has to be built in JS/wasm. Not sure yet about its complexity, but I have that on my longer TODO list, so yes - plz feel free to create a feature request for it. (Sorry for not showing up in the matrix channel for a while, have pretty limited time atm...)
An open-ended TIFF decoder would be complex to implement from scratch. @AnonymouX47 mentioned this places an unfair burden on the TE, and I agree. I'll move Chafa to emit PNG in the future. Haven't decided whether to do so with an external dep or embedded. The CLI application embeds a PNG decoder already.
(Sorry for not showing up in the matrix channel for a while, have pretty limited time atm...)
Figured :-) Similar thing going on here, it's that time of the year I think.
As far as cursor positioning goes, I should specify it. The cursor should be after the last cell in the image. Future bidi support could affect the definition of "last", but for now that is the bottom right corner. That could leave the cursor in the position after the last column.
For an image taking width x height cells, you would move the cursor to its new position by following these steps:
- Execute height linefeeds
- Move the cursor right by width cells.
For image decoding, I strongly recommend doing that out-of-process. iTerm2 uses a sandbox like Chrome and there has never been a security issue.
Regarding formats, I think it makes sense to aim for parity with web browsers; at a minimum that's JPEG, PNG, GIF, SVG, WebP, BMP, TIFF, ICO.
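The two cursor steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name is hypothetical, coordinates are 0-based, scrolling at the bottom of the screen is ignored, and it uses height - 1 linefeeds, per the correction further down the thread. A column equal to the screen width stands for "the position after the last column".

```python
def apply_iip_cursor_steps(row, col, width, height, columns):
    """Simulate the cursor movement after an image of width x height cells.

    row, col -- 0-based cursor cell before the image is drawn
    columns  -- screen width in cells
    Returns the new (row, col); col == columns means the cursor sits
    just past the last column, without wrapping to the next row.
    """
    # Step 1: linefeeds (height - 1, so the cursor stays on the image's last row).
    row += height - 1
    # Step 2: move right by the image width, without wrapping (assumption).
    col = min(col + width, columns)
    return row, col
```

For an image that ends exactly at the right edge of an 80-column screen, this yields column 80, which matches the "after the last column" case mentioned above.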
@jerch
- IIP would require external dependencies for decoding various image formats, unlike sixels
Yeah that is indeed a downside up to security issue in foreign libs for more complicated formats. I am really a fan of QOI in this regard, such a small code size makes it almost impossible for bugs to slip in. And it is still reasonably fast with good enough compression, at least for local delivery.
If only it were as widely adopted already. I see it's propagating fast but still not sufficiently widely supported.
about TIFF support: Yes I already figured that out several weeks ago, when I was chatting with @hpjansson. Browsers dont provide builtin TIFF support, so this has to be built in JS/wasm. Not sure yet about its complexity, but I have that on my longer TODO list, so yes - plz feel free to create a feature request for it.
I see... I'm really not that keen on it due to my understanding of the required complexity, and since it's already on your TODO list, it's okay.
(Sorry for not showing up in the matrix channel for a while, have pretty limited time atm...)
That's totally understandable.
@gnachman
That could leave the cursor in the position after the last column.
@jerch, I guess this was what I was referring to by "just as the cursor behaves when text reaches the rightmost column of a row" in the original post.
- Execute height linefeeds
Umm... shouldn't this be height - 1? :thinking:
For image decoding, I strongly recommend doing that out-of-process. iTerm2 uses a sandbox like Chrome and there has never been a security issue.
Not sure how feasible that is in a web browser though :thinking:... @jerch, any ideas?
- Execute height linefeeds
Umm... shouldn't this be height - 1? 🤔
Uh, yeah. Code reading fail.
On a sidenote Imho IIP is a much better graphics protocol than sixel, I really dont get why ppl insist on using sixel (besides for historical reasons).
@jerch, in view of the last couple comments, does your stance:
Or to say it differently - yes I deviate here intentionally from what iTerm implements for its sequence, but since it's not documented behavior, it cannot be expected as such.
remain the same?
Hello!
I recently realised the same cursor placement policies for sixels are used for IIP, by this implementation. This causes the behaviour for IIP to differ significantly (IMHO) from the reference implementation and every other implementation I'm aware of.
I'm aware cursor positioning is not specified in the document but I believe in such cases, the most sane choice is to comply with the reference implementation (particularly when the behaviour makes sense and is also followed by other implementations).
Deviations such as this cause application developers to introduce workarounds or special cases for such TEs, which isn't the best. See https://github.com/AnonymouX47/termvisage/issues/9.
It'd be really good if cursor placement worked as expected i.e. immediately to the right of the bottom-right-most cell touched by the image, without advancing the cursor to the next row when the image reaches the rightmost column of the screen (i.e. just as the cursor behaves when text reaches the rightmost column of a row).
In addition, I think the doNotMoveCursor extension (implemented by Wezterm and Konsole, if I recall correctly) would be a worthy addition, if feasible. See https://github.com/wez/wezterm/pull/1433 (by Autumn Lamonte, author of Jexer). The Wezterm docs state:
Thank you. :pray:
For better context as to why I said "significantly" above:
In Wezterm:
https://github.com/jerch/xterm-addon-image/assets/61663146/e0277a75-dc79-47a3-95a9-d2618bf244b2
In xterm.js (via ttyd) (portions of (and in other cases, entire) images get overwritten by the TUI framework due to the misaligned cursor placements):
https://github.com/jerch/xterm-addon-image/assets/61663146/2c1faae7-0e3a-4a2f-9b8c-069240d2e3f1
For the record, this is the same application using unicode blocks to display images (to show that the issue is not an incompatibility between the framework and xterm.js):
https://github.com/jerch/xterm-addon-image/assets/61663146/4a0a9585-0eaa-4d2d-9424-d065ab4bb556
Side question:
Are you aware of any means for a program to detect that it's running within xterm.js (and maybe its version)?
XTVERSION (CSI > q) doesn't seem to be implemented and I don't think env vars are viable as I don't think they can be set by xterm.js.
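For reference, this is roughly what the XTVERSION round trip looks like on the application side in TEs that do implement it: the query is `CSI > q` and the reply arrives as `DCS > | text ST`. A minimal Python sketch with hypothetical names; reading the reply from the tty (with a timeout, since a TE that doesn't implement the query sends nothing back) is left out.

```python
import re

# XTVERSION query: CSI > q
XTVERSION_QUERY = "\x1b[>q"

# Reply form: DCS > | <text> ST, i.e. ESC P > | ... ESC \
_XTVERSION_REPLY = re.compile(r"\x1bP>\|(.*?)\x1b\\", re.S)


def parse_xtversion(reply: str):
    """Extract the name/version text from an XTVERSION reply, or None."""
    match = _XTVERSION_REPLY.search(reply)
    return match.group(1) if match else None
```

A None result here only means no well-formed reply was found in what was read, so on its own it can't positively identify xterm.js.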