directvt / vtm

Text-based desktop environment
MIT License

Pixel graphics in terminal #164

Open o-sdn-o opened 2 years ago

o-sdn-o commented 2 years ago

The terminal must operate only at the cell level and have access to any cell without harming neighboring cells. There should not be any inseparable blocks like wide characters.

The cell can contain anything:

To let an application output bitmap images to the terminal, and to make it convenient for the terminal to store them in the scrollback, reflow them, and render them, I propose using hashes of image fragments as the content of cells of the "image" type. When such a cell is displayed, the raster is always stretched to cover the entire cell. If the application knows the cell size in pixels, it can display images one-to-one; if not, it must assume that the cell aspect ratio is 1:2.
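As a rough illustration of the idea above, the application side could cut an image into cell-sized fragments and hash each one. This is only a sketch: the `image` type, the `slice_and_hash` helper, and the choice of FNV-1a as the hash are assumptions made for the example, since the proposal does not name a hash function or a pixel layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 32-bit RGBA pixel and image types, for illustration only.
using rgba = std::uint32_t;
struct image
{
    int width = 0;
    int height = 0;
    std::vector<rgba> pixels; // row-major, width * height entries
};

// FNV-1a over the pixel bytes. A stand-in: the proposal does not specify a hash.
std::uint64_t fnv1a(std::vector<rgba> const& data)
{
    std::uint64_t h = 14695981039346656037ull; // FNV offset basis
    for (rgba p : data)
    {
        for (int shift = 0; shift < 32; shift += 8)
        {
            h ^= (p >> shift) & 0xFFu;
            h *= 1099511628211ull; // FNV prime
        }
    }
    return h;
}

// Cut the image into fragments of cell_w x cell_h pixels and hash each one.
// Returns one ID per cell, row by row. Fragments on the right and bottom
// edges may be smaller than a full cell.
std::vector<std::uint64_t> slice_and_hash(image const& img, int cell_w, int cell_h)
{
    assert(cell_w > 0 && cell_h > 0);
    assert(img.pixels.size() == std::size_t(img.width) * img.height);
    std::vector<std::uint64_t> ids;
    for (int top = 0; top < img.height; top += cell_h)
    for (int left = 0; left < img.width; left += cell_w)
    {
        std::vector<rgba> fragment;
        for (int y = top; y < top + cell_h && y < img.height; ++y)
        for (int x = left; x < left + cell_w && x < img.width; ++x)
        {
            fragment.push_back(img.pixels[std::size_t(y) * img.width + x]);
        }
        ids.push_back(fnv1a(fragment));
    }
    return ids;
}
```

If the cell size in pixels is unknown, the application would call `slice_and_hash` with any `cell_w` and `cell_h` in a 1:2 ratio (for example 8 and 16). Identical fragments get identical IDs, so a repeated fragment only has to be transferred once.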

struct hint
{
  // Neighbor pixels
  //    p1  
  // p2 p0 p3
  //    p4
  //
  // interpolation type (four 2-bit values):
  // 0 -- Nearest neighbor
  // 1 -- Bilinear
  // 2 -- Bicubic
  // 3 -- Reserved
  //
  // 0 1 2 3 4 5 6 7
  // │ │ │ │ │ │ └─┴── interpolation type between `p0` and `p4`
  // │ │ │ │ └─┴────── interpolation type between `p0` and `p3`
  // │ │ └─┴────────── interpolation type between `p0` and `p2`
  // └─┴────────────── interpolation type between `p0` and `p1`

  byte interpolation_vector;
};

struct glyf
{
  // ... ordinary cell data

  // Using an interpolation_vector for text cells
  // will allow the text cells to more smoothly integrate
  // with each other using a gradient and continuously
  // flow into the adjacent image.
  byte interpolation_vector;
};

struct tile
{
  byte width;
  byte height;
  std::vector<std::pair<rgba, hint>> pixels;
};

struct cell
{
  type cell_data_type; // Normal text or image fragment
  union
  {
    glyf text_data;
    hash tile_id;
    // ... another cell type data
  } data;

};

std::unordered_map<hash, tile> tiles;
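The `interpolation_vector` layout in the comments above packs four 2-bit values into one byte. A minimal sketch of packing and unpacking it follows. It assumes that "bit 0" in the diagram is the least significant bit, which the diagram does not state; the `interp` and `neighbor` enums and both helper functions are hypothetical names introduced for the example.

```cpp
#include <cassert>
#include <cstdint>

// Interpolation types as listed in the struct comment.
enum class interp : std::uint8_t { nearest = 0, bilinear = 1, bicubic = 2, reserved = 3 };

// Neighbor index, ordered so that index * 2 is the bit offset of its field.
enum class neighbor : int { p1 = 0, p2 = 1, p3 = 2, p4 = 3 };

// Assumption: bit 0 is the least significant bit, so the p0-p1 field
// occupies bits 0..1 and the p0-p4 field occupies bits 6..7.
constexpr std::uint8_t set_interp(std::uint8_t vec, neighbor n, interp t)
{
    return static_cast<std::uint8_t>(
        (vec & ~(0x3 << (static_cast<int>(n) * 2)))        // clear the 2-bit field
        | (static_cast<int>(t) << (static_cast<int>(n) * 2))); // store the new value
}

constexpr interp get_interp(std::uint8_t vec, neighbor n)
{
    return static_cast<interp>((vec >> (static_cast<int>(n) * 2)) & 0x3);
}
```

If the diagram instead counts bits from the most significant end, only the shift computation changes; the 2-bit field encoding stays the same.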

This approach also covers the following use cases:

Usage

To display an image, the application outputs the following sequence at the current cursor position:

ESC_VT1 _hash_tile_id_ ST

If the terminal does not know such a _hash_tile_id_, it asynchronously asks the application about this ID:

ESC_VT1 ? _hash_tile_id_ ST

The application must respond with the content. If the application does not respond, the terminal displays a stub in this cell until it learns about this _hash_tile_id_.

ESC_VT2 _hash_tile_id_ ; <plain_text=0 | base64=1> ; _encoded_tile_data_ ST
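The three messages above could be assembled roughly as follows. This is a sketch under several assumptions: the `ESC_VT1` and `ESC_VT2` introducers are not defined in the proposal, so they are passed in as opaque strings; the spaces in the notation are taken to be separators in the description rather than literal bytes; the ID is passed as already-formatted text; and `ST` is taken to be the usual 7-bit String Terminator (`ESC \`). All function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

std::string const ST = "\x1b\\"; // String Terminator: ESC followed by a backslash

// Standard base64 with '=' padding.
std::string base64(std::vector<std::uint8_t> const& in)
{
    static char const table[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    for (std::size_t i = 0; i < in.size(); i += 3)
    {
        std::size_t left = in.size() - i; // bytes remaining, at least 1
        std::uint32_t n = std::uint32_t(in[i]) << 16;
        if (left > 1) n |= std::uint32_t(in[i + 1]) << 8;
        if (left > 2) n |= in[i + 2];
        out += table[(n >> 18) & 63];
        out += table[(n >> 12) & 63];
        out += left > 1 ? table[(n >> 6) & 63] : '=';
        out += left > 2 ? table[n & 63] : '=';
    }
    return out;
}

// Application -> terminal: show the tile at the current cursor position.
std::string show_tile(std::string const& vt1, std::string const& id)
{
    assert(!id.empty());
    return vt1 + id + ST;
}

// Terminal -> application: ask for the content of an unknown tile.
std::string query_tile(std::string const& vt1, std::string const& id)
{
    assert(!id.empty());
    return vt1 + "?" + id + ST;
}

// Application -> terminal: tile content, sent with the base64 flag (1).
std::string define_tile(std::string const& vt2, std::string const& id,
                        std::vector<std::uint8_t> const& data)
{
    assert(!id.empty());
    return vt2 + id + ";1;" + base64(data) + ST;
}
```

How `_encoded_tile_data_` is laid out before encoding (width, height, pixels, hints) is not specified in the proposal, so `data` is treated here as an opaque byte buffer.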

To avoid such queries from the terminal, the application can tell the terminal about each _hash_tile_id_ it is going to display in advance. The terminal will store them until the cells with these IDs leave the scrollback (or use some sort of ring buffer whose size limit the user can specify).
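One way the terminal side could keep tiles only while cells reference them is simple reference counting. This is a sketch, not the vtm implementation: the `tile_store` class, its method names, the 64-bit `hash` type, and the simplified `tile` (pixels without hints) are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

using hash = std::uint64_t; // assumed ID type
struct tile
{
    std::uint8_t width = 0;
    std::uint8_t height = 0;
    std::vector<std::uint32_t> pixels; // simplified: RGBA only, no hints
};

class tile_store
{
    struct entry
    {
        tile data;
        bool known = false;   // false until the application supplies content
        std::size_t refs = 0; // number of scrollback cells using this tile
    };
    std::unordered_map<hash, entry> tiles;

public:
    // A cell with this ID was written to the scrollback.
    // Returns true if the terminal has to query the application for the content.
    bool acquire(hash id)
    {
        auto& e = tiles[id];
        ++e.refs;
        return !e.known;
    }
    // The application announced the tile in advance or answered a query.
    void define(hash id, tile t)
    {
        assert(t.pixels.size() == std::size_t(t.width) * t.height);
        auto& e = tiles[id];
        e.data = std::move(t);
        e.known = true;
    }
    // Returns nullptr while the content is unknown; the caller draws a stub.
    tile const* find(hash id) const
    {
        auto it = tiles.find(id);
        return it != tiles.end() && it->second.known ? &it->second.data : nullptr;
    }
    // A cell with this ID left the scrollback; drop the tile with its last user.
    void release(hash id)
    {
        auto it = tiles.find(id);
        if (it == tiles.end()) return;
        if (it->second.refs > 0) --it->second.refs;
        if (it->second.refs == 0) tiles.erase(it);
    }
    std::size_t size() const { return tiles.size(); }
};
```

Tiles announced in advance but never displayed would stay in this map forever, which is where the ring buffer with a user-defined limit mentioned above would come in.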

ghost commented 2 years ago

I love that you are looking into this! :) It will be so exciting seeing what you come up with.

Some references that you DON'T have to care about, you do your own thing! :-)

If you choose to support sixel for the user-facing terminal, there are two very fast and interesting encoders:

* [Chafa](https://github.com/hpjansson/chafa/issues/27#issuecomment-647584817)
* [notcurses](https://github.com/dankamongmen/notcurses/issues/2503)

On mixing images and cells:

* [A text cell model that can multiplex images](https://gitlab.com/klamonte/jexer/-/wikis/high-level-design#text-cells)
* [Translucency with images](https://gitlab.com/klamonte/jexer/-/issues/88) - Which BTW was partially inspired by vtm. :-)
* [A trick to get sixel images on the bottom row](https://gitlab.com/klamonte/jexer/-/issues/91) - Not needed for non-sixel formats

💗

o-sdn-o commented 2 years ago

Thanks for the links! I need to read it and get into it. 🙂

o-sdn-o commented 2 years ago

I doubt the necessity of using multiple layers in terminals.

Offhand, I can't imagine a scenario where an image should be on top of text, except for two: drawing a graphical mouse cursor that moves with pixel precision, and drawing a stack of translucent UI windows with images on top of text.

Since pixel accuracy is not available to us (we operate at the cell level, and the cell size is unknown), the case with a graphic mouse cursor is not suitable.

The scenario with a stack of translucent UI windows with images on top of text does not suit us either: we do not draw glyphs on top of each other, we only draw the topmost glyph. So why should a glyph be visible through a space character, but not through the other glyphs?

In all the content I see in GUIs or in web browsers, the text is always on top of the images.

So far, the only useful cases I see are a solid background, or an image with text on top of it, where a solid background is just a single-pixel image.
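To illustrate why a solid background can be treated as a single-pixel image: if the tile raster is always stretched to cover the whole cell, a 1x1 tile fills the cell with one color. The sketch below uses nearest-neighbor stretching only; the `raster` type and the `stretch_nearest` function are hypothetical names, and the cell size in pixels is assumed to be known to the renderer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using rgba = std::uint32_t;
struct raster
{
    int width = 0;
    int height = 0;
    std::vector<rgba> pixels; // row-major, width * height entries
};

// Stretch a tile to exactly cell_w x cell_h pixels using nearest-neighbor sampling.
raster stretch_nearest(raster const& src, int cell_w, int cell_h)
{
    assert(src.width > 0 && src.height > 0 && cell_w > 0 && cell_h > 0);
    assert(src.pixels.size() == std::size_t(src.width) * src.height);
    raster dst{cell_w, cell_h, std::vector<rgba>(std::size_t(cell_w) * cell_h)};
    for (int y = 0; y < cell_h; ++y)
    for (int x = 0; x < cell_w; ++x)
    {
        int sx = x * src.width / cell_w;   // source column for this target pixel
        int sy = y * src.height / cell_h;  // source row for this target pixel
        dst.pixels[std::size_t(y) * cell_w + x] = src.pixels[std::size_t(sy) * src.width + sx];
    }
    return dst;
}
```

With a 1x1 source, every target pixel maps to the same source pixel, so the result is a uniformly colored cell.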

ghost commented 2 years ago

With the caveat that you should do your thing first, always, so I only explain some of my motivation but do not want to influence wherever you are going :) -- I am aiming to develop some gaming capabilities for the terminal.

o-sdn-o commented 2 years ago

Doom with a lower resolution will probably be in vtm, I have been dreaming about it for a long time 😁

[image: doom 03]

AutumnMeowMeow commented 2 years ago

"subscribe" (Hello from my new account. :) )