wez / wezterm

A GPU-accelerated cross-platform terminal emulator and multiplexer written by @wez and implemented in Rust
https://wezfurlong.org/wezterm/

Kitty Image Protocol Support #986

Open wez opened 3 years ago

wez commented 3 years ago

This issue is tracking the status of supporting Kitty's image protocol.

Spec at: https://sw.kovidgoyal.net/kitty/graphics-protocol/

Using it:

Enable the protocol by setting enable_kitty_graphics=true in your config.

Known conformance issues

kovidgoyal commented 3 years ago

I am happy to see this, please don't hesitate to ping me if you have any questions.

wez commented 3 years ago

> I am happy to see this, please don't hesitate to ping me if you have any questions.

Great to hear! Do you have a set of tests or similar that could be made to run against wezterm to sanity check conformance? The surface area of the spec is quite large and I'm sure I'm going to overlook something. My game plan is to try running @dankamongmen's notcurses demo when there's enough of an implementation working and see if anything looks egregiously bad.

kovidgoyal commented 3 years ago

I do have unit tests in kitty itself, for individual bits of functionality from the protocol, but I doubt they can be easily lifted for another terminal emulator. If you wish to have a look, please see kitty_tests/graphics.py

I would actually be happy to collaborate to make those (and other) tests defined in a terminal independent fashion so anyone implementing the protocol could use them. Perhaps a simple txt or json based format that specifies the input as escape codes and the output as a set of image placement data or similar.
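A rough illustration of what such a terminal-independent test case could look like, written in Python. The JSON schema (`name`, `input`, `expect.placements`) and the `feed` adapter are hypothetical and exist only for this sketch; no such format has been agreed on in this thread.

```python
import json

# Hypothetical test case: the escape codes to send, and the image
# placements a conforming terminal is expected to end up with.
CASE = json.loads(r'''
{
  "name": "transmit and display a 1x1 RGB image",
  "input": "\u001b_Ga=T,f=24,s=1,v=1,i=1;AAAA\u001b\\",
  "expect": {
    "placements": [
      {"image_id": 1, "col": 0, "row": 0, "cols": 1, "rows": 1}
    ]
  }
}
''')


def check(case, feed):
    """Run one case against a terminal adapter.

    `feed` is an assumed callable that sends the input string to the
    terminal under test and returns its placements as a list of dicts.
    """
    return feed(case["input"]) == case["expect"]["placements"]
```

Each terminal would only need to supply its own `feed` adapter; the cases themselves stay shared.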

dankamongmen commented 3 years ago

> I am happy to see this, please don't hesitate to ping me if you have any questions.

> Great to hear! Do you have a set of tests or similar that could be made to run against wezterm to sanity check conformance? The surface area of the spec is quite large and I'm sure I'm going to overlook something. My game plan is to try running @dankamongmen's notcurses demo when there's enough of an implementation working and see if anything looks egregiously bad.

first off, i enthusiastically support this move. the kitty protocol is (at least for my purposes) a tremendous improvement upon both sixel and iterm. it simplifies things (no more requirement that one draw in terms of 6 rows, sane deletion behavior), improves performance in several areas (fast moves, fast changes to drawn images), and makes certain things possible that otherwise are not (text drawn atop bitmaps without killing entire cells). relative to iterm2, it's much more flexible and powerful. at the same time, it currently shows worse local performance in Notcurses due to (a) greater bandwidth demands than sixel and (b) time spent in zlib. i expect to be able to hide the latter in most applications via threading. it would be possible to reduce the local bandwidth via using the filesystem as a side channel to load images, but i have not yet embraced this, and might never do so (but might, who knows).
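To make the bandwidth versus zlib trade-off concrete, here is a minimal Python sketch of a direct RGBA transmission with optional compression. It assumes the key meanings given in the Kitty graphics spec (`f=32` for RGBA, `s`/`v` for width and height in pixels, `o=z` for zlib-compressed payloads). The function name is hypothetical, chunking of large payloads is left out, and this is not how Notcurses itself is implemented.

```python
import base64
import zlib

ESC = "\x1b"
MAX_CHUNK = 4096  # larger payloads must be split into chunks (omitted here)


def rgba_transmit_and_display(pixels, width, height, compress=True):
    """Build one escape sequence that transmits and displays RGBA pixels."""
    if len(pixels) != width * height * 4:
        raise ValueError("expected 4 bytes per pixel")
    keys = {"a": "T", "f": 32, "s": width, "v": height}
    data = pixels
    if compress:
        data = zlib.compress(pixels)  # RFC 1950 zlib stream
        keys["o"] = "z"               # tell the terminal to inflate it
    payload = base64.standard_b64encode(data).decode("ascii")
    if len(payload) > MAX_CHUNK:
        raise ValueError("payload needs chunking, which this sketch omits")
    control = ",".join(f"{k}={v}" for k, v in keys.items())
    return f"{ESC}_G{control};{payload}{ESC}\\"
```

Compression trades CPU time in zlib for fewer bytes on the wire, which is the tension described above.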

right now Notcurses selects the Kitty protocol strictly based on heuristics, as opposed to doing the recommended query. I intend to add support for the latter (and really ought have by now), but hadn't bothered since no one else had implemented it (and also because said query frustratingly doesn't let you determine the version of the kitty protocol supported--as you note, it's a large protocol). you can ensure kitty graphics are being used by running notcurses-info:

2021-07-29-005228_1083x620_scrot

where it says "rgba pixel animation support", you want that. wezterm currently says "sixel graphics" or "iterm graphics", i forget which one. if it says "rgba ....", you're driving kitty graphics. i'll try to add this query soon.

Notcurses uses a pretty wide subset of the protocol, and indeed motivated/proposed some of it. Among the elements it exercises are:

i do not currently exercise: sideloading images via the filesystem, managed animations, scaling, or z-indices other than 0, though i expect to start using the last Really Soon Now.

notcurses-tester and ncplayer -bpixel will also effectively test portions of your implementation.

if you have any questions, don't hesitate to ask me; i reckon i know more about the kitty graphics protocol than anyone living save @kovidgoyal himself. happy hacking!

dankamongmen commented 3 years ago

> if you have any questions, don't hesitate to ask me; i reckon i know more about the kitty graphics protocol than anyone living save @kovidgoyal himself. happy hacking!

oh and let me add that i make use of kitty's honoring of transparency values in RGBA (at least at a bimodal level), just as I do in iterm, and as i do with unspecified pixels in sixel when P2=1.

kovidgoyal commented 3 years ago

@dankamongmen regarding using heuristics for detection because you don't have a version: you shouldn't need a version. You can create a dummy image, try to perform every operation you want on it, and see if it succeeds, i.e. the terminal does not respond with an error. Thus you can know exactly what the terminal you are running in supports.
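A minimal Python sketch of this probing idea, using the query form documented in the Kitty graphics spec: a 1x1 dummy image sent with `a=q`, followed by a device-attributes request so the caller can tell "no reply" apart from "no support". The function names are hypothetical, and reading the reply from the tty is left to the caller.

```python
import re

ESC = "\x1b"
# Reply shape: ESC _ G i=<id> ; <OK or error text> ESC \
_REPLY = re.compile(r"\x1b_Gi=(\d+)(?:,[^;]*)?;(.*?)\x1b\\", re.S)


def build_support_query(image_id=31):
    """Query-only transmission of a 1x1 RGB image, then a DA1 request."""
    graphics = f"{ESC}_Gi={image_id},s=1,v=1,a=q,t=d,f=24;AAAA{ESC}\\"
    device_attributes = f"{ESC}[c"
    return graphics + device_attributes


def parse_graphics_reply(reply, image_id=31):
    """Return True for OK, False for an error, None if there was no graphics reply."""
    match = _REPLY.search(reply)
    if match is None or int(match.group(1)) != image_id:
        return None
    return match.group(2) == "OK"
```

If only the device-attributes answer comes back, the terminal ignored the graphics query. The same pattern can be repeated for each operation an application depends on.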

And of course, as I'm sure you already know, if you really want a version, use XTVERSION or XTGETTCAP.
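For reference, a small Python sketch of the XTVERSION exchange as documented in the xterm control sequences: the query is `CSI > 0 q` and the reply arrives as `DCS > | text ST`. The helper name is hypothetical and the sample reply in the usage note is illustrative only.

```python
import re

ESC = "\x1b"
XTVERSION_QUERY = f"{ESC}[>0q"
# Reply shape: ESC P > | <name and version text> ESC \
_XTVERSION_REPLY = re.compile(r"\x1bP>\|(.*?)\x1b\\", re.S)


def parse_xtversion(reply):
    """Extract the version text from an XTVERSION reply, or None if absent."""
    match = _XTVERSION_REPLY.search(reply)
    return match.group(1) if match else None
```

For example, `parse_xtversion("\x1bP>|XTerm(370)\x1b\\")` yields the text between the `>|` introducer and the string terminator.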

dankamongmen commented 3 years ago

> And of course, as I'm sure you already know, if you really want a version, use XTVERSION or XTGETTCAP.

this is exactly what i'm doing, see https://github.com/dankamongmen/notcurses/blob/master/src/lib/termdesc.c#L501-L527

> @dankamongmen regarding using heuristics for detection because you don't have a version: you shouldn't need a version. You can create a dummy image, try to perform every operation you want on it, and see if it succeeds, i.e. the terminal does not respond with an error. Thus you can know exactly what the terminal you are running in supports.

yep, i could do that. what i've got now works for me pretty well, though.

and let it be known in all lands the sun touches: i wholeheartedly affirm the use of the Kitty graphics protocol over others. indeed, i sat down a few weeks ago to describe my ideal terminal graphics protocol, and ended up with something very similar to Kitty's.

i will be adding coarse kitty graphics support very shortly, probably tonight, in https://github.com/dankamongmen/notcurses/issues/1998.

stevenxxiu commented 2 years ago

@wez I'm trying to use Broot in WezTerm and opened an issue there. Mind taking a look at some problems in https://github.com/Canop/broot/issues/473#issuecomment-1005995882 with the protocol support?

Kitty's spec specifies that "The image will be scaled (enlarged/shrunk) as needed to fit the specified area" and Wezterm doesn't seem to respect that.

I've also another problem: the only way to correctly detect the support without having problems on some terminals seems to be to read an env var and there's no guarantee wezterm sets it in an adequate way.

Canop commented 2 years ago

Everything is fine now, Broot 1.9.1 displays high resolution images in WezTerm:

image

AnonymouX47 commented 2 years ago

Hello!

First of all, great work here and I do recognize that there's no claim that the terminal fully supports the kitty graphics protocol yet.

I'm currently working on a project that uses this protocol to display images (see https://github.com/AnonymouX47/term-image/issues/40). So far, everything I've implemented works fine on Kitty, so I decided to try it on other terminal emulators said to support the protocol, WezTerm being the first.

From my short try with WezTerm (nightly build, downloaded a few hours before sending this message), I've encountered a number of inconsistencies with the protocol's specification (or with kitty's behaviour) that are crippling for my use case. Here are a few:

1. Every image transmitted without an ID is replaced by the next image also without an ID.

According to the specification:

> You can either simultaneously transmit and display an image using the action a=T, or first transmit the image with an id, such as i=10 and then display it with a=p,i=10 which will display the previously transmitted image at the current cursor position.

Emphasis on the "or". From my understanding, this means a=T without an ID should simply place the image without attaching it to an ID, which doesn't seem to be what Wezterm does.

Trying a=T on kitty displayed as many images as I tried (a lot) without deleting any of them, but that was not the case with Wezterm.
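The two paths in the quoted passage can be sketched in Python as follows. This assumes PNG data (`f=100`), base64 payloads split into chunks of at most 4096 bytes with `m=1` on every chunk except the last, and hypothetical helper names; it only builds the escape sequences and does not write them to a terminal.

```python
import base64

ESC = "\x1b"
CHUNK = 4096  # maximum base64 payload size per escape sequence


def _apc(control, payload=""):
    """Wrap control data (and an optional payload) in a graphics escape."""
    body = f"{control};{payload}" if payload else control
    return f"{ESC}_G{body}{ESC}\\"


def transmit_png(png, action, image_id=None):
    """Build the sequences for a=T (transmit and display) or a=t (transmit only)."""
    payload = base64.standard_b64encode(png).decode("ascii")
    chunks = [payload[i:i + CHUNK] for i in range(0, len(payload), CHUNK)] or [""]
    first = f"a={action},f=100"
    if image_id is not None:
        first += f",i={image_id}"
    sequences = []
    for n, chunk in enumerate(chunks):
        more = 0 if n == len(chunks) - 1 else 1
        # Only the first chunk carries the full control data.
        control = f"{first},m={more}" if n == 0 else f"m={more}"
        sequences.append(_apc(control, chunk))
    return sequences


def place_image(image_id):
    """Display a previously transmitted image at the cursor position."""
    return _apc(f"a=p,i={image_id}")
```

Calling `transmit_png(data, "T")` repeatedly with no `image_id` is the pattern reported above as behaving differently in kitty and Wezterm; `transmit_png(data, "t", image_id=10)` followed by `place_image(10)` is the alternative path.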

2. Control codes c and r don't exactly work as specified.

If c and r are in relative proportion to the image, it'll scale the image... else it crops the image. (See the images below)

image Scaled

image Cropped (and the scale is even way off)

image Cropped
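Taking the sentence quoted earlier in this thread ("The image will be scaled (enlarged/shrunk) as needed to fit the specified area") at face value, the expected result for any `c` and `r` can be worked out as below. This is a Python sketch under stated assumptions: the cell size in pixels is a made-up parameter, the helper names are hypothetical, and the stretch-to-fit reading is an interpretation of the quoted sentence, not a description of either terminal's code.

```python
def stretch_factors(img_w, img_h, cols, rows, cell_w, cell_h):
    """Horizontal and vertical scale needed to fill a cols x rows cell area."""
    area_w = cols * cell_w
    area_h = rows * cell_h
    return area_w / img_w, area_h / img_h


def place_scaled(image_id, cols, rows):
    """Placement command asking for the image to cover cols x rows cells."""
    return f"\x1b_Ga=p,i={image_id},c={cols},r={rows}\x1b\\"
```

With a 100x50 pixel image, 10x20 pixel cells, and `c=10,r=5`, the two factors differ, so a terminal following this reading would stretch the image vertically instead of cropping it.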

3. Cursor advancement

According to the specs:

> After placing an image on the screen the cursor must be moved to the right by the number of cols in the image placement rectangle and down by the number of rows in the image placement rectangle.

Trying this without C=1 (which Wezterm doesn't support yet) and without the cursor reaching either edge of the window:

Can't readily provide images now but I'm pretty sure those were the results


Please, keep in mind that inconsistency in interpretation and implementation of protocols is one of the major reasons we're where we are with terminal emulators today.

Thanks for your audience and I do hope you look into these and more... Keep up the great work.

AnonymouX47 commented 2 years ago

By the way, here are the results of c and r with kitty (0.25.0):

image

image

image

All properly scaled.

AnonymouX47 commented 2 years ago

@wez 👆🏾

wez commented 2 years ago

I saw this, and I appreciate the detail in your comments, but haven't had time to look at it. If you're eager to see this improved more quickly then I would be happy to accept a PR!

Otherwise, I'd like to remind you that this is free software that I hack on in my spare time, and I have a lot of demands on my spare time.

AnonymouX47 commented 2 years ago

Oh! I totally understand that, I was only expecting some form of acknowledgement (of the comment).

It's not urgent for me as I personally don't use Wezterm, I was only testing out my project in different terminal emulators and decided to report bugs/issues I found (just like https://github.com/kovidgoyal/kitty/issues/5081).

Thanks for your response.

dholth commented 1 year ago

Do I remember correctly that wezterm used to have a "cat image to screen" utility? Where's a good one?

AnonymouX47 commented 1 year ago

The default tool is the imgcat subcommand. As described in the docs:

To render an image inline in your terminal:

$ wezterm imgcat /path/to/image.png

Also, you can use term-image, which provides an interactive user interface for browsing/viewing images. It's still under development, though.

Disclosure: I might be biased about the latter :) but feel free to judge for yourself.

There are a number of other tools out there that perform similar tasks, e.g. timg.

bew commented 1 year ago

The latest release of Kitty added an interesting feature that allows displaying images without explicit support from a program, using unicode chars, diacritics and fg color to select and position an image. Refs:

AnonymouX47 commented 1 year ago

A wezterm fork (by the author of the feature) that implements support for the feature: https://github.com/sergei-grechanik/wezterm/tree/unicode-placeholders

Might make sense to ask the author to open a PR.

AndydeCleyre commented 1 year ago

Assuming that adding support for Kitty's unicode placeholder model means Wezterm + tmux can support the Kitty image protocol, can we get a todo-checkbox for it up top here?

constantitus commented 9 months ago

I'm trying to find a good way to display images in neovim, but most plugins use the kitty protocol, which doesn't seem to work as intended on wezterm. What would cause this discrepancy between kitty and wezterm?

edluffy/hologram.nvim:

https://github.com/wez/wezterm/assets/118078873/0412c0be-ae5f-44fb-9d9d-122ac94b98c9

Edit: Another plugin, another issue

3rd/image.nvim:

https://github.com/wez/wezterm/assets/118078873/0a0b7d13-b603-4011-94fa-bb7ded44b107