Closed amano-kenji closed 1 year ago
I doubt it's possible to support images because this requires knowledge of how many pixels wide a column is. I don't know of a way to find that out since it depends on the terminal emulator's font characteristics which as far as I know are not available to the application. Without cropping, images cannot be displayed in Vty because the image format assumes cropping is possible.
I think sixel can obtain knowledge of how many pixels fit in a character cell horizontally and vertically.
Without cropping, images cannot be displayed in Vty because the image format assumes cropping is possible.
In theory, the Crop data constructor can carry the cropping information until the last moment, and the ImagePixels data constructor can render the actual visible portion of image pixels at the last moment and print it out to the terminal.
Is it possible for the Crop data constructor to carry cropping information until the last moment for the ImagePixels data constructor? I imagine this is a matter of introducing case statements for different data constructors.
With knowledge of how many pixels fit in a character cell and how an image fills the allocated character cells (fully visible, fit vertically, fit horizontally, fill all character cells, ...), it may be possible to crop image pixels by character cells, but it's still desirable to delay the actual cropping of image pixels until the last moment for performance and simplicity of implementation.
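To make the idea of delayed cropping concrete, here is a minimal sketch in Python (not Haskell, and not Vty's API). It assumes the cell size in pixels is known and that the image exactly fills a whole number of cells; `PixelImage`, `whole`, `crop`, and `materialize` are hypothetical names invented for illustration.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class PixelImage:
    """Hypothetical pixel image whose visible region is tracked in cells."""
    pixels: list   # rows of (r, g, b) tuples, never modified by crop()
    cell_w: int    # pixels per character cell, horizontally
    cell_h: int    # pixels per character cell, vertically
    col0: int      # left edge of the visible region, in cells
    row0: int      # top edge of the visible region, in cells
    cols: int      # visible width, in cells
    rows: int      # visible height, in cells

def whole(pixels, cell_w, cell_h):
    """Wrap a pixel grid that exactly fills a whole number of cells."""
    return PixelImage(pixels, cell_w, cell_h, 0, 0,
                      len(pixels[0]) // cell_w, len(pixels) // cell_h)

def crop(img, left, top, cols, rows):
    """Narrow the visible region; only bookkeeping, no pixels are copied."""
    cols = max(0, min(cols, img.cols - left))
    rows = max(0, min(rows, img.rows - top))
    return replace(img, col0=img.col0 + left, row0=img.row0 + top,
                   cols=cols, rows=rows)

def materialize(img):
    """Slice out the visible pixels once, just before encoding for output."""
    x0 = img.col0 * img.cell_w
    y0 = img.row0 * img.cell_h
    x1 = x0 + img.cols * img.cell_w
    y1 = y0 + img.rows * img.cell_h
    return [row[x0:x1] for row in img.pixels[y0:y1]]
```

Repeated crops only adjust four integers, so the cost of slicing pixel data is paid once, at the point where the visible region is finally known.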
I think sixel can obtain knowledge of how many pixels fit in a character cell horizontally and vertically.
I was going to ask you to substantiate this, but I am realizing that I am not really interested in digging into this further. I think that if you wanted to explore whether Vty could facilitate display of images in the terminal, then a better place to start is to survey the available protocols for doing so. For instance, I found this which is an example of an alternative; the question is, what alternatives exist, and how widely-supported are they?
Sixel is a very old protocol and surely better options exist now. Are you interested in investigating?
Also, I wanted to mention this since I have seen it happen several times: I've noticed that you've posted comments on tickets only to either heavily edit them or delete them. That ends up being confusing to me, since I get email notifications about your comments only to come to Github's web site and find that the comments have either vanished or changed considerably. Please consider slowing down and being sure about what comment you want to post. If you want to amend it, that's fine, but please do so by posting a new comment asking me to ignore a previous one rather than deleting it.
since I get email notifications about your comments only to come to Github's web site and find that the comments have either vanished or changed considerably.
Sorry about that. I didn't know that you were getting real-time email notifications. My mind works quite frantically once it kicks in. I have been bad at tying dirty loose ends in my mind once thinking starts. I have yet to develop a systematic thinking process that keeps things tidy in my mental landscape. I will try to hide my thinking process from you.
Sixel is a very old protocol and surely better options exist now. Are you interested in investigating?
Upon further consideration, I realized that if I know how many pixels fit in a character cell, then I can create an Image of image pixels that can be cropped cell by cell on the fly.
But, encoding image pixels into sixel escape codes, kitty escape codes, or osc1337(iTerm2) escape codes should happen when cropped image pixels are actually rendered at the last step.
Is there a function where cropped image pixels can be encoded into image protocol escape codes before they are sent to the terminal?
Is there also a way to know where image pixels are supposed to be rendered?
If sixel doesn't know where it should start rendering, it must enable scroll mode to start rendering from the current cursor. In scroll mode, the bottom 6 pixels of a terminal emulator cannot be rendered by sixel.
If sixel knows where to start rendering, it can render at the bottom 6 pixels of a terminal emulator.
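For reference, encoding already-cropped pixels as sixel at the last step could look roughly like the sketch below. It is a minimal illustration, not Vty code: it assumes a palette-indexed image, skips run-length compression and raster attributes, and does not address cursor positioning or scroll mode. `encode_sixel` is a hypothetical name.

```python
def encode_sixel(pixels, palette):
    """Encode rows of palette indices as a sixel escape sequence.

    pixels:  list of rows, each a list of indices into palette
    palette: list of (r, g, b) tuples with channels in 0..255
    """
    height = len(pixels)
    width = len(pixels[0]) if pixels else 0
    out = ["\x1bPq"]  # DCS introducer followed by 'q' starts sixel data
    for index, (r, g, b) in enumerate(palette):
        # Colour definitions use RGB percentages (0..100), not 0..255.
        out.append("#%d;2;%d;%d;%d" % (index, r * 100 // 255,
                                       g * 100 // 255, b * 100 // 255))
    for top in range(0, height, 6):
        band = pixels[top:top + 6]  # one sixel character covers 6 rows
        for color in range(len(palette)):
            chars = []
            for x in range(width):
                bits = 0
                for dy, row in enumerate(band):
                    if row[x] == color:
                        bits |= 1 << dy  # lowest bit is the topmost pixel
                chars.append(chr(63 + bits))
            if any(ch != "?" for ch in chars):
                # '$' returns to the start of the band for the next colour
                out.append("#%d%s$" % (color, "".join(chars)))
        out.append("-")  # move down to the next band of six rows
    out.append("\x1b\\")  # string terminator ends the sequence
    return "".join(out)
```

Because the encoder only sees the final pixel grid, it stays independent of how the visible region was computed.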
Upon further consideration, I realized that if I know how many pixels fit in a character cell, then I can create an Image of image pixels that can be cropped cell by cell on the fly.
Yes, that would ultimately be something we'd need to be able to accomplish. But this is still premature, because I asked whether you'd be willing to investigate other image formats. Perhaps if you found other formats, those formats might provide some idea of whether this is solvable.
But, encoding image pixels into sixel escape codes, kitty escape codes, or osc1337(iTerm2) escape codes should happen when cropped image pixels are actually rendered at the last step.
Yes, this is true, as is already the case in Vty when images get converted to a byte stream of output. The "pixel data" case would be done similarly, if we get that far.
Is there a function where cropped image pixels can be encoded into image protocol escape codes before they are sent to the terminal?
Well, yes and no: as we already established, Vty does not support pixel-based images so the current functionality only works on character cells. In addition, Vty applies a diffing process to the output to ensure that only changed terminal cells are updated in the output. That's something that would also ultimately need to be considered in the pixel image case.
Is there also a way to know where image pixels are supposed to be rendered?
If sixel doesn't know where it should start rendering, it must enable scroll mode to start rendering from the current cursor. In scroll mode, the bottom 6 pixels of a terminal emulator cannot be rendered by sixel.
If sixel knows where to start rendering, it can render at the bottom 6 pixels of a terminal emulator.
These seem like questions we ought to discuss later once we've determined whether it's feasible to have images in vty at all. I'm not yet convinced that it is possible. Here is what I would like you to investigate first before considering any others, if you are willing: please identify alternative image formats for displaying images in terminals, and present what you find here. I am happy to take a look at the resources you find in order to help evaluate whether any of them will be a fit for Vty.
I better start researching this topic again after publishing brick-tabular-list.
Before I go, I want to quickly let you know that it's possible to calculate pixels per cell width or cell height.
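#!/usr/bin/env python
import array, fcntl, sys, termios
# TIOCGWINSZ fills a struct winsize of four unsigned shorts:
# rows, columns, width in pixels, height in pixels.
buf = array.array('H', [0, 0, 0, 0])
fcntl.ioctl(sys.stdout, termios.TIOCGWINSZ, buf)
print((
    'number of rows: {} number of columns: {}'
    ' screen width: {} screen height: {}').format(*buf))
# Some terminals report 0 for the pixel fields; the ratios are then meaningless.
print("Pixels per cell width: {}".format(buf[2]/buf[1]))
print("Pixels per cell height: {}".format(buf[3]/buf[0]))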
Example output
number of rows: 57 number of columns: 273 screen width: 1911 screen height: 969
Pixels per cell width: 7.0
Pixels per cell height: 17.0
At least, cells contain whole numbers of pixels instead of fractional numbers. I would have been disappointed if pixels per width or height were something like 6.5 or 3.14.
I finished documenting brick-tabular-list. I'm just waiting for my hackage account to be accepted into uploaders group.
So, brick-tabular-list can be considered finalized.
I read about sixel, iTerm2 image protocol, and kitty image protocol.
After considering how vty works, I concluded that a 24-bit RGB vty image should fully fill character cells and should not allow alpha channel. With this assumption, all three image protocols can be trivially supported, and any future image protocol that makes the current terminal image protocols obsolete will be supported.
If a graphical vty image fills a character cell partially or has alpha channel, then the current design of vty cannot calculate which character cells should be updated. Think about partially transparent images above text. If a character cell is filled partially or a graphical image has alpha channel, then vty will have to calculate whether pixels have to be updated. vty was designed to calculate which character cells should be updated.
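As one illustration of how an opaque 24-bit RGB image sized in whole cells maps onto a protocol, here is a rough sketch of building kitty graphics protocol escape sequences. It is written from my reading of the kitty documentation (keys `a`, `f`, `s`, `v`, `c`, `r`, `m`), so please check it against the specification; `kitty_rgb_chunks` is a hypothetical name, not part of any library.

```python
import base64

def kitty_rgb_chunks(pixels, cols, rows, chunk_size=4096):
    """Return escape sequences that transmit and display an RGB image.

    pixels: list of rows of (r, g, b) tuples, channels in 0..255
    cols, rows: number of character cells the image should cover
    """
    height = len(pixels)
    width = len(pixels[0]) if pixels else 0
    raw = bytes(channel for row in pixels for px in row for channel in px)
    payload = base64.standard_b64encode(raw).decode("ascii")
    # The payload is sent in base64 chunks of at most chunk_size characters.
    pieces = [payload[i:i + chunk_size]
              for i in range(0, len(payload), chunk_size)] or [""]
    out = []
    for i, piece in enumerate(pieces):
        more = 1 if i < len(pieces) - 1 else 0  # m=1 means more chunks follow
        if i == 0:
            # a=T: transmit and display, f=24: raw RGB, s/v: size in pixels,
            # c/r: size in cells
            keys = "a=T,f=24,s=%d,v=%d,c=%d,r=%d,m=%d" % (
                width, height, cols, rows, more)
        else:
            keys = "m=%d" % more
        out.append("\x1b_G" + keys + ";" + piece + "\x1b\\")
    return out
```

The absence of an alpha channel is what allows `f=24` here; an RGBA image would need a different format value and would reintroduce the question of what lies underneath the image.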
I came up with the following imaginative data constructor for Image.
ImagePixels {
pixels :: 24bit-pixels
, hashes :: [[HashForEachCharacterCell]]
, width :: Int
, height :: Int
, renderingFn :: ... -> ... -> IO ()
}
Since each character cell has a fixed number of pixels, a hash can be calculated for the image content in a character cell.
When vty crops ImagePixels, it will crop hashes and pixels and subtract from width or height.
renderingFn will be provided by vty-sixel, vty-kitty, or vty-any-image-protocol.
After vty renders ImagePixels, it records pixel hashes for character cells. I got this idea from BitTorrent file chunk hashes. vty will compare hashes to calculate which character cells should be updated.
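A rough sketch of the per-cell hashing idea, in Python for brevity. It assumes the image dimensions are exact multiples of the cell size; `cell_hashes` and `changed_cells` are hypothetical names, and the choice of BLAKE2 with an 8-byte digest is arbitrary.

```python
import hashlib

def cell_hashes(pixels, cell_w, cell_h):
    """Return a grid of digests, one per character cell.

    pixels: list of rows of (r, g, b) tuples, channels in 0..255
    """
    rows = len(pixels) // cell_h
    cols = len(pixels[0]) // cell_w if pixels else 0
    grid = []
    for r in range(rows):
        line = []
        for c in range(cols):
            digest = hashlib.blake2b(digest_size=8)
            for y in range(r * cell_h, (r + 1) * cell_h):
                for px in pixels[y][c * cell_w:(c + 1) * cell_w]:
                    digest.update(bytes(px))
            line.append(digest.digest())
        grid.append(line)
    return grid

def changed_cells(old, new):
    """List (row, col) positions whose hash differs from the previous frame."""
    return [(r, c)
            for r, row in enumerate(new)
            for c, h in enumerate(row)
            if r >= len(old) or c >= len(old[r]) or old[r][c] != h]
```

Comparing two hash grids yields the set of cells to redraw without comparing pixel data directly, at the cost of hashing every cell of each new frame.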
What do you think about this?
Thanks for doing some more research and putting some thought into this.
a 24-bit RGB vty image should fully fill character cells and should not allow alpha channel
I think this makes sense.
ImagePixels {
pixels :: 24bit-pixels
, hashes :: [[HashForEachCharacterCell]]
, width :: Int
, height :: Int
, renderingFn :: ... -> ... -> IO ()
}
My thoughts on this:
Image results in unnecessary tight coupling between the image and the implementation.
Overall, before I'm ready to talk about how to implement this, I need to understand more about the protocols that we have to choose from. I can't evaluate an implementation until I know how the encoding(s) work. This continues to be my request of you: look at the existing encoding options, present what you find here, and then let's discuss what is feasible. It's a bit too early to discuss implementation.
swaybg defines the following scaling modes: stretch, fill, fit, center, and tile. sxiv supports 100% scale, fit large images into window, fit image to window, fit image to window width, and fit image to window height.
Vector (Vector Hash) in real code.
We can already determine how many pixels fit in a character cell.
How?
Is vty going to directly implement terminal image protocols?
Yes. Vty will be responsible for taking the abstract Image and rasterizing it into escape sequences and output data, the same way it already does so for colored text, style data, etc. So naturally it will need to take care of emitting the appropriate cursor movements and escape sequences to update the right cells on the screen.
ioctl.
I see, thanks for sharing that link. Assuming the window size in pixels is reported without the non-cell pixel area, those numbers should be okay. (Many terminal emulators are slightly larger than a perfect multiple of the cell pixel size.) The other consideration is that an image that is less than an exact pixel multiple of the cell size would use a fraction of a cell when rendered.
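The fractional-cell concern can be shown with a little arithmetic. This is only a sketch using the 7x17 cell size from the example output earlier in the thread; `cells_needed` is a hypothetical helper.

```python
def cells_needed(image_px, cell_px):
    """Return (cells covered, pixels used in the last partial cell).

    A remainder of 0 means the image ends exactly on a cell boundary.
    """
    full, rest = divmod(image_px, cell_px)
    return full + (1 if rest else 0), rest

# A 100x100 pixel image on a terminal with 7x17 pixel cells:
print(cells_needed(100, 7))   # columns covered, leftover pixels in the last one
print(cells_needed(100, 17))  # rows covered, leftover pixels in the last one
```

With these numbers the image covers 15 columns but uses only 2 of the 7 pixel columns in the last one, and covers 6 rows using 15 of the 17 pixel rows in the last one.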
I was envisioning a future where vty defines a framework for image protocols
I'm not that interested in making the rendering backend for Vty pluggable, which is essentially what this means. But I'm going to say this over and over again: this is an implementation consideration that it's too early to worry about because I still don't know which protocols you're recommending supporting. What I want to know is:
If vty had to support all image protocols internally
I don't want this, so this isn't a future I'd worry about. :) This is why I've asked you to provide info about the existing protocols. If we're going to support this in Vty, then I want to support whichever protocol is most widely-deployed in terminal emulators. If there is a clear winner, that's the one I'd rather support. (And "winner" here means widely-supported by existing emulators already and not baroque in its design.) If that turns out to be hard because there are too many protocols or they're all poorly-supported or fail badly in terminal emulators that don't know about them, then I would rather not add the feature at all. I don't think there's enough demand for images in Vty to justify the headache.
Judging by your comment above about standards, it doesn't sound like there really is any (and sixel is not looking like a good contender).
an image that is less than an exact pixel multiple of the cell size would use a fraction of a cell when rendered.
I think the only thing that makes sense is to just scale or shrink image pixels to the given width and height in character cells. You pick a size in character cells, and image pixels will be scaled or shrunk to fill the given size.
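A minimal sketch of that scaling step, using nearest-neighbour sampling as an assumed (and crude) resampling method. `scale_to_cells` is a hypothetical name; a real implementation would likely use a better filter.

```python
def scale_to_cells(pixels, cols, rows, cell_w, cell_h):
    """Resample pixels so they exactly fill cols x rows character cells.

    pixels: non-empty list of rows of (r, g, b) tuples
    cell_w, cell_h: size of one character cell in pixels
    """
    src_h = len(pixels)
    src_w = len(pixels[0])
    dst_w = cols * cell_w
    dst_h = rows * cell_h
    # For each destination pixel, pick the source pixel at the same
    # relative position (nearest-neighbour, integer arithmetic only).
    return [[pixels[y * src_h // dst_h][x * src_w // dst_w]
             for x in range(dst_w)]
            for y in range(dst_h)]
```

The result always has dimensions that are exact multiples of the cell size, which avoids partially filled cells regardless of the source image's size.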
Whatever image protocol vty ends up implementing without third party libraries, I think this issue is for determining whether vty can have any image protocol without being slow.
To determine performance characteristics, I need to ask a few questions.
I'm closing this issue since this discussion is not working for me. As the vty maintainer, it's important to me to evaluate all aspects of a request or proposal. While I understand that you want this feature, I am going to decline to do any further investigation of it for now. In the future, it would be very helpful when you engage with me on my libraries not to presume what the investigation should or shouldn't be about. I want to help, but it's very difficult to do that when you insist on a framing of an approach that I've repeatedly asked to change.
I thought that it might be helpful to outline how I will want to engage on any library feature if you end up wanting to open more tickets about new ideas.
With all that said, while I am happy to consider patches, when it comes to developing a design, abstraction, and implementation, I need that to be a collaborative process. As the person most familiar with the code and with a design vision for the library, I'm going to need to influence that. And if we can't reach a conclusion that I think will serve the library, then I cannot move forward, especially if it's code that I need to be willing to maintain.
I hope all that is helpful next time you would like to open an issue to discuss something like the topic we discussed here.
Is vty compatible with an image protocol like sixel and kitty and OSC1337?
Since image pixels are different from characters, they cannot be cropped on the fly.
I imagine that an Image for image pixels should contain its size in columns and rows, image pixels, and cropping information so that when the image is rendered, only the cropped portion of the Image size is rendered before being passed to the terminal.
Can a data constructor of Image contain size in character cells and image pixels? If it can, then any image protocol can be implemented as third party packages like vty-sixel, vty-kitty, and vty-osc1337.
I imagine an Image data constructor for images can look like this.
This can be wrapped in the Crop data constructor. For vty-sixel, the sixel renderer kicks in after all the cropping is done to the ImagePixels data constructor, before sending image data to the terminal.
I don't know the internal details of vty. Do you think image protocols can be implemented in vty?