jerch / xterm-addon-image

Image addon for xterm.js
MIT License

Increase speed a thousand fold by caching #41

Open hackerb9 opened 1 year ago

hackerb9 commented 1 year ago

On my little laptop, the current script output rate is about 1.8 KBytes/s, which is nearly equal to the speed of the VT340's serial port (19.2 kbps). That is plenty fast for my VT340+ as the graphics processor is (probably) the bottleneck.

However, the script could be made much faster by caching the calculations in an array. Here is a version that increases the speed from 1.8 KBps to 1.8 MBps, as measured by ./endless.sh | pv -brat > /dev/null.

The script also includes commented-out code that would increase the speed to 88 MBps by caching into a string instead of an array.

If desired, startup time could be shrunk. It takes about three seconds to create the cache, during which the output rate is the same as before (1.8 KBps), but that could be done more swiftly by moving the for loop into bc instead of forking bc 600 times.
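As an illustration of the caching idea only (this is not the shell script from the patch, and it emits plain text rows rather than sixels), here is a minimal Python sketch. The names `build_cache` and `endless`, the period of 200, and the width of 80 are assumptions made for the example. The expensive sine calculation runs once per row of a single period; after that, output is pure replay.

```python
import math
from itertools import cycle, islice

def build_cache(period=200, width=80):
    """Compute one full period of the wave once; each entry is one text row."""
    rows = []
    for y in range(period):
        # Map sin() from [-1, 1] to a column in [0, width - 1].
        x = round((width - 1) / 2 * (1 + math.sin(2 * math.pi * y / period)))
        rows.append(" " * x + "*")
    return rows

def endless(rows):
    """Replay the cached rows forever without recomputing anything."""
    return cycle(rows)

cache = build_cache()
# Take three periods' worth of output to show that it simply repeats.
first = list(islice(endless(cache), 3 * len(cache)))
```

The startup cost is paid once in `build_cache`; the steady-state loop does no arithmetic at all, which is where the large speedup would come from.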

jerch commented 1 year ago

Oh wow - I knew that the process forking costs are still high, but that much? That's really impressive. :+1:

Sadly this addon does not yet have the endless mode implemented; I still need to reshape the underlying sixel lib. The lib was created with sixel images in mind and thus has a strong notion of a (finite) image, which comes back to bite me here. :smiley_cat:

hackerb9 commented 1 year ago

Okaydoke. The patch can wait for endless-mode. Maybe I'll look at improving the speed of the initialization in the meantime.

By the way, choosing a period size that is divisible by 6, such as 198 instead of 200, would greatly reduce the number of calculations required for the cache.
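One plausible reading of this remark: a sixel band is six pixel rows tall, so a cache of whole bands only repeats once both the wave phase and the band alignment line up again, which is after lcm(period, 6) rows. The sketch below just does that arithmetic; the function name is hypothetical, and the assumption that the cache stores whole bands is mine.

```python
import math

SIXEL_BAND_HEIGHT = 6  # each sixel character encodes a column of 6 pixels

def rows_until_repeat(period):
    """Least common multiple of the wave period and the sixel band height."""
    return period * SIXEL_BAND_HEIGHT // math.gcd(period, SIXEL_BAND_HEIGHT)

for period in (200, 198):
    rows = rows_until_repeat(period)
    print(period, rows, rows // SIXEL_BAND_HEIGHT)
# 200 -> 600 rows (100 bands), 198 -> 198 rows (33 bands)
```

Under that assumption, a period of 200 needs 600 rows computed before the pattern repeats, while 198 needs only 198.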

j4james commented 1 year ago

I've just been trying this out, and my first impression was that it didn't seem any faster than before, and I thought maybe it was because my implementation was rubbish. But then the cache kicked in... 😁

https://user-images.githubusercontent.com/4181424/208203325-b6f1b770-1d34-44e8-b272-d0a4e7db4b12.mp4

jerch commented 1 year ago

Okaydoke. The patch can wait for endless-mode. Maybe I'll look at improving the speed of the initialization in the meantime.

I'd add the enhanced script whenever you think it's done. Also feel free to alter the period size to a better fit; this was only a quick pick to get something to show.

But then the cache kicked in...

This indeed now resembles the idea of an oscilloscope (well xy transposed), and could easily be made into a full incarnation with knobs to tune the shape (maybe with drawing their state "statically" outside of scroll margins). I wonder if there were serious applications in the past using this endless mode for such things...

And while thinking about the sixel cursor progression mechs - it kinda resembles the zigzag beam movement on tube displays/TVs in the past. Would sixel graphics ingestion on a vt340 be fast enough to do something similar to "interlaced" half image screen updates? @hackerb9 You wrote something about double buffering with sixels - maybe that's even capable of playing tiny animations by blitting half images from those 2 buffers on the screen fast enough? Idk what the main limiting factor is on a vt340 here - if it's the overall input speed then half images would greatly lower the pixel data pressure. Ofc this only works if the update frequency comes close to human perception resolution (imho TV was here somewhere around 25-30 half images per second, depending on the standard used).

hackerb9 commented 1 year ago

I wonder if there were serious applications in the past using this endless mode for such things...

It's quite possible. Before there were dataloggers, "chart recorders" on continuous paper were standard. Usually that would mean plotters with pens, not dot matrix. But, if you already have a minicomputer processing the data and a printer which can speak sixel, it seems like using endless mode would be obvious, especially once terminals were able to display the same protocol: you get a chart recorder and a CRT oscillograph for free. Of course, I admit, I didn't think of using sixels that way until you pointed out the possibility, but there would have been many bright people back in the day.

If I recall correctly, the documentation for the older DEC printers said that sixels were printed as soon as they were received. That is certainly how the VT340 works. Those would have been perfect as chart recorders or oscillographs. [@j4james: Didn't DEC even market a specialized terminal just for drawing graphs before the VT241 shook everything up with sixels?]

However, later DEC printers, like the "256-color" LJ252 may not have been as useful. The LJ252 manual said -- and I paraphrase -- for efficiency, the printer head movement may not correspond with receiving a Graphic New Line (-). On a printer like that, if the data stream were to stop, I believe the final few lines would not be printed until the printer received either a String Terminator (ST) or an error. (The manual emphasizes that, after an error, "the printer prints all stored sixel data before entering text mode", so maybe it wouldn't have been such a big deal after all.)

hackerb9 commented 1 year ago

Would sixel graphics ingestion on a vt340 be fast enough to do something similar to "interlaced" half image screen updates?

Not in general. The serial port seems to be limited to 19,200 bps. Half of the screen is $\frac{800\times 480}{2} = 192,000$ pixels. So, even using sixel's default aspect ratio of 2:1 to get half-resolution, it could still take ten seconds just to transmit the data, presuming a complex image. Of course, we have run-length encoding, so for simple images there would be much less data to transmit, but even if RLE gave us 10:1 compression, that would still be only one frame per second, nowhere close to the 60Hz (50Hz in Europe) frame updates we saw in old analog TV.
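The estimate above can be reproduced with a few lines of arithmetic. This sketch assumes 10 bits on the wire per byte (start bit, 8 data bits, stop bit), one sixel byte per 6 pixels, and a single color pass; the function name and parameters are made up for the example.

```python
def transmit_seconds(width, height, bps=19200, bits_per_byte=10, compression=1.0):
    """Rough time to send one single-color sixel pass over a serial line."""
    pixels = width * height
    sixel_bytes = pixels / 6          # each sixel byte covers 6 vertical pixels
    return sixel_bytes * bits_per_byte / bps / compression

# Half of an 800x480 screen is 800x240; a 2:1 aspect ratio halves the rows sent.
uncompressed = transmit_seconds(800, 120)                 # about 8.3 s
with_rle = transmit_seconds(800, 120, compression=10.0)   # about 0.83 s
print(f"{uncompressed:.1f} s uncompressed, {with_rle:.2f} s at 10:1")
```

Every additional color adds another pass, so a complex image lands at or above the ten seconds quoted, and even 10:1 compression leaves roughly one frame per second.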

@hackerb9 You wrote something about double buffering with sixels - maybe that's even capable of playing tiny animations by blitting half images from those 2 buffers on the screen fast enough?

The Adder/Viper graphics chips in the VT340 could certainly handle blitting, but we don't have raw access to control those, as far as I know. Instead, the VT340 has six "Pages" in memory. Anything drawn on a page stays there. Sending Esc [ 2 Spc P, for example, switches to page 2. Note that the VT340 lacks the memory to store sixel data for more than the first two pages.

Idk what the main limiting factor is on a vt340 here - if it's the overall input speed then half images would greatly lower the pixel data pressure.

Yes, using half-images (or extreme aspect ratios) would cut some of the pixel pressure. Actually, reducing the number of colors makes a huge impact. First, each additional color requires another pass which could require an entire byte to be sent just to represent a single bit. Perhaps even more importantly, with fewer colors, the Run Length Encoding is able to find larger swaths of a single color.
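To make the run-length point concrete, here is a small sketch of sixel's repeat introducer, where `!<count><char>` stands for `count` copies of one sixel character. The function name is hypothetical, and the threshold of more than three repeats is just the point at which the encoded form becomes shorter than the raw run.

```python
from itertools import groupby

def rle_sixel_row(sixels):
    """Compress a string of sixel data characters ('?'..'~') with '!' repeats."""
    out = []
    for ch, group in groupby(sixels):
        n = len(list(group))
        # "!4~" is 3 bytes versus 4 raw; shorter runs gain nothing.
        out.append(f"!{n}{ch}" if n > 3 else ch * n)
    return "".join(out)

solid = rle_sixel_row("~" * 100 + "?")   # one long run: "!100~?"
noisy = rle_sixel_row("~?" * 50)         # alternating: no runs to compress
print(len(solid), len(noisy))
```

A row that is one solid color collapses to a handful of bytes, while a row that alternates on every pixel gets no benefit, which is why large single-color areas transmit so much faster.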

jerch commented 1 year ago

The serial port seems to be limited to 19,200 bps.

Oh right, that is way below what any serious data transfer for pixel animations would need. I always tend to forget that the vt340 is a product of the 80s, when serial IO was still in those lowish baud regions.

Sending Esc [ 2 Spc P, for example, switches to page 2.

I guess the memory page switch gets almost immediately applied to the screen output system as well, so no invisible preloading tricks are possible? Is the switch destructive for previous memory content? If not, at least a two frame "animation" could be loaded, and if there is a time gap between switch and screen refresh the content could even be replaced on the fly (well if the time gap is big enough and the graphics small enough to finish loading in that time, and the switch itself has no "hiccups" with a blank screen in between).

j4james commented 1 year ago

Didn't DEC even market a specialized terminal just for drawing graphs before the VT241 shook everything up with sixels?

@hackerb9 Well there was the VT105, which I understand was essentially a VT100 with an add-on waveform-generator module which could produce certain kinds of graphs. But I believe that capability was available even earlier with the VT55, although obviously not then ANSI-compatible.

And then with the VT125 you had Sixel and ReGIS, as well as the ability to emulate VT105 graph mode. Although I'm assuming in that case it wasn't actually using a waveform generator. I'm not sure of the details though.

I guess the memory page switch gets almost immediately applied to the screen output system as well, so no invisible preloading tricks are possible?

@jerch My understanding of the paging extension is that you can write to a background page without making it active, and then only switch the page into view when you're ready. I'm assuming that's the kind of preloading trick you had in mind?

Essentially you turn off the page cursor coupling mode (DECPCCM) before moving to the target page, and that leaves the original page focused. Then when you're ready for the target page to be shown, you turn DECPCCM back on again.
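A rough sketch of that sequence in Python follows. The page-switch sequence is the `Esc [ 2 Spc P` form quoted earlier in this thread; the DECPCCM mode number (`?64`) is recalled from DEC documentation rather than taken from this discussion, so treat it as an assumption to verify. The helper names are hypothetical.

```python
ESC = "\x1b"
CSI = ESC + "["

# Assumption: DECPCCM is DEC private mode 64 (set with 'h', reset with 'l').
DECPCCM_OFF = CSI + "?64l"
DECPCCM_ON = CSI + "?64h"

def goto_page(n):
    """Move the cursor to page n: CSI Pn SP P."""
    return f"{CSI}{n} P"

def preload_and_flip(page, payload):
    """Draw on a background page, then bring it into view."""
    return (
        DECPCCM_OFF        # keep the currently displayed page on screen
        + goto_page(page)  # cursor moves to the hidden page
        + payload          # text or sixel data lands off screen
        + DECPCCM_ON       # re-coupling shows the page the cursor is on
    )

sequence = preload_and_flip(2, "hello from page 2")
```

Writing `sequence` to the terminal would, if the terminal behaves as described above, show nothing until the final DECPCCM set.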

I'm not sure if that trick also works for Sixel, but I would hope so.

hackerb9 commented 1 year ago

Oh right, that is way below what any serious data transfer for pixel animations would need. I always tend to forget that the vt340 is a product of the 80s, when serial IO was still in those lowish baud regions.

Yup. I've been wrong before, but I believe the best bitmap animation you'll get out of a genuine VT340 would be using transparency to send just the diffs for each frame, as you did in your "overprint slim" test.

@j4james is correct about being able to write on alternate pages without showing them by turning off cursor coupling, but I can't remember if I even bothered to test it on the VT340 since transmission to the back buffer would be unsuitably slow. It is only on modern terminals, like xtermjs and contour, that page flipping would be useful for tear-free animation.

One thing that has been useful on the VT340 is the second session. When a slow sixel image is coming in, I can hit F3 to switch to the second session; I get another login prompt on a fresh screen and can keep on working. The VT340 has independent framebuffers for each session and I can switch back and forth instantly. The only issue is that only one colormap can be active on the screen at a time. That means if I show both sessions at once in split screen (Shift+F3), the background session may display with a wacky palette. (That's one of the reasons I started using the default palette in some of my tests instead of defining the colors.)

Of course, I often misremember things, so once I get my system set up properly, I'll have to run some tests. In particular, if my memory is correct, I should be able to split the screen vertically and show the endless sine wave in one half and simultaneously run a completely different sixel program in the other. I'd like to verify that.

jerch commented 1 year ago

I'm assuming that's the kind of preloading trick you had in mind?

Yes, that's at least something, although still quite limited due to only spanning over 2 pages.

The only issue is that only one colormap can be active on the screen at a time

Hmm then this is prolly a hardware restriction of the screen output system in general, I'd guess. Still split palettes should be possible with that (whether that's of any use with only 16 colors in total is a different question lol).

hackerb9 commented 1 year ago

I'm assuming that's the kind of preloading trick you had in mind?

Yes, that's at least something, although still quite limited due to only spanning over 2 pages.

Two is usually good enough for double-buffering to prevent screen tearing: you always display the buffer you're not writing to and flip every time you finish drawing. I just tested it and you are indeed able to view one sixel image while another is loading in the background on a different page.

The VT340 technically has more than two pages. I think by default it is configured to six pages, each of size 80x24. However, I believe it only came with enough RAM to hold two 800x480 sixel screens.

The only issue is that only one colormap can be active on the screen at a time

Hmm then this is prolly a hardware restriction of the screen output system in general, I'd guess. Still split palettes should be possible with that (whether that's of any use with only 16 colors in total is a different question lol).

Yes, definitely a restriction peculiar to the VT340 hardware. Splitting the palette hadn't occurred to me. It would be a little bit tricky since the VT340 doesn't let you actually assign to a specific sixel color register: no matter what number it is called, the VT340 assigns them in the order seen. And it starts over from 1 every time it sees a new sixel image.

My workaround has been to just use the default palette when possible.

jerch commented 1 year ago

However, I believe it only came with enough RAM to hold two 800x480 sixel screens.

Well it is still the 80s, where DEC placed square meters of memory boards in its high-end machines. Maybe Cray or IBM already had higher-density modules; if so, it prolly was not yet worth the deal in the "bread and butter" terminals. It just would have made them even more expensive and prolly cut into sales badly.

no matter what number it is called, the VT340 assigns them in the order seen. And it starts over from 1 every time it sees a new sixel image.

Hmm, so the "addressing" is not really respected? What would happen, if you change the sixel palette in the second page to totally different colors, while page one shows sixels? Would they immediately get re-colored? If not then it would give you effectively 2x16 colors. If it gets re-colored, well then a "conservative color change" in the pre-processing might help - change only color slots not currently in use, keep others stable. A 2x8 split would be the extreme variant of that, where color slots would only change in frame modulo 2.

j4james commented 1 year ago

It would be a little bit tricky since the VT340 doesn't let you actually assign to a specific sixel color register

Technically you could set the palette with DECCTR, and that would give you direct access to individual palette entries. It wouldn't work on most modern terminals, though, since they tend not to share the text and graphics palettes.

But if your use case is something like a two frame animation, you're probably better off just coming up with a custom 16-color palette that works best for both frames. That's got to be better than limiting them to 8 colors each, or having to make do with the default palette.

And for something like a slide show, where you're loading a completely different image in the background, I would just put the palette at the end of the image, so it only changes immediately before the page flip. There may be a brief flash with the wrong palette, but I'd still consider that better than halving the available colors.
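A minimal sketch of the "palette at the end" layout: the pixel data is sent first and the color definitions follow just before the string terminator. The function names are hypothetical, and whether a given terminal accepts color numbers used before they are defined is something to test rather than assume. Colors use the sixel RGB form `#Pc;2;R;G;B` with components in percent.

```python
ESC = "\x1b"

def color_def(register, r, g, b):
    """Sixel color definition in RGB; components are percentages 0-100."""
    return f"#{register};2;{r};{g};{b}"

def sixel_palette_last(pixel_data, palette):
    """Wrap pixel data in DCS q ... ST with the palette appended at the end."""
    defs = "".join(color_def(n, *rgb) for n, rgb in palette.items())
    return f"{ESC}Pq{pixel_data}{defs}{ESC}\\"

# Ten columns of full-height sixels in color 1, then define color 1 as red.
image = sixel_palette_last("#1!10~", {1: (100, 0, 0)})
```

Because the definitions arrive last, the visible palette would only change once the off-screen pixels are already in place, immediately before the page flip.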

hackerb9 commented 1 year ago

@jerch writes

no matter what number it is called, the VT340 assigns them in the order seen. And it starts over from 1 every time it sees a new sixel image.

Hmm, so the "addressing" is not really respected?

Correct. The color numbers in a sixel file have nothing to do with which color register each is actually assigned to. The only way I know to address a sixel color register directly is to use DECRSTS (Reset Terminal State). However, I don't like it because the manual explicitly says it is not portable between different terminals.

What would happen, if you change the sixel palette in the second page to totally different colors, while page one shows sixels? Would they immediately get re-colored?

Yes, it is recolored immediately, but as @j4james mentioned, you can set the palette after sending the off-screen pixels.

@j4james writes

It would be a little bit tricky since the VT340 doesn't let you actually assign to a specific sixel color register

Technically you could set the palette with DECCTR, and that would give you direct access to individual palette entries. It wouldn't work on most modern terminals, though, since they tend not to share the text and graphics palettes.

Oh, I don't think I knew about DECCTR. I clearly haven't read the Text Programming volume sufficiently. I remember DEC did some nifty things with modulo arithmetic, but would a sixel file made using this technique for, say, the VT241 display correctly on a VT340? Maybe vice-versa?

I would just put the palette at the end of the image, so it only changes immediately before the page flip. There may be a brief flash with the wrong palette, but I'd still consider that better than halving the available colors.

I believe that this insight is the correct solution.

The switch in colormap does take noticeable time, maybe a tenth of a second, so there will be a flash. It is very rapid, but I seem to recall that you can perceive the screen updating from top to bottom for each color and the more colors that are changed, the slower it is.

j4james commented 1 year ago

I don't like it because the manual explicitly says it is not portable between different terminals.

That bit about it not being portable doesn't apply to the color report though. It can be confusing because of the weird naming. I think that's because there was just the one "terminal state report" originally (and that's the bit that's not portable), and the "color table report" was only added later (that has a well-defined format). They both use the same underlying sequences.

So you've got DECRQTSR for requesting a report, and DECRSTS for restoring the report, but there are two supported report types (chosen by the first parameter). Originally we just had the terminal state report (DECTSR), which is the basis for the sequence names, and only later we got the color table report (DECCTR).
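For orientation, here is a sketch of how those two sequences could be built. The parameter layout is recalled from DEC documentation, not from this thread, so it should be checked against the manual: the request is assumed to be `CSI 2 ; Pu $ u` and the restore `DCS 2 $ p ... ST`, with entries of the form `Pc;Pu;Px;Py;Pz` separated by `/` and `Pu = 2` meaning RGB in percent. The function names are made up for the example.

```python
ESC = "\x1b"
CSI = ESC + "["
DCS = ESC + "P"
ST = ESC + "\\"

def request_color_table(color_space=2):
    """DECRQTSR asking for the color table report (assumed: 2 = RGB)."""
    return f"{CSI}2;{color_space}$u"

def restore_color_table(entries):
    """DECRSTS carrying a color table; entries are (index, r, g, b) in percent."""
    body = "/".join(f"{i};2;{r};{g};{b}" for i, r, g, b in entries)
    return f"{DCS}2$p{body}{ST}"

# Set palette entry 1 to red and entry 2 to green (values are illustrative).
restore = restore_color_table([(1, 100, 0, 0), (2, 0, 100, 0)])
```

The first parameter is what selects the report type, which matches the description above: the same pair of sequences serves both the terminal state report and the color table report.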

I remember DEC did some nifty things with modulo arithmetic, but would a sixel file made using this technique for, say, the VT241 display correctly on a VT340? Maybe vice-versa?

Not all VT240 images are fully forward compatible with the VT340, but with certain limitations they could be. You'd have to enable DECSDM to replicate the VT240 limit of always drawing in the top left corner, and you'd also need to restrict yourself to three colors. Alternatively you could pad out the color table to 16 colors and replicate the fourth color in the last slot (the background color).

I think there are a few other undocumented features of the VT240 that won't work on the VT340, but they're less likely to be used.

Working backwards from a VT340 is more problematic, but again you could make images reasonably backwards compatible. You'd have to limit yourself to a 2:1 aspect ratio, and repeat the background color in the fourth color slot. But any additional colors beyond the first four should map automatically with a closest match algorithm. Obviously there's going to be a loss of color resolution though.

Although now I'm reading your question again, I'm not sure if you were perhaps asking about DECCTR being supported on the VT240. If so, the answer is no (none of the DECRQTSR sequences are). The only terminals I'm aware of that supported DECCTR were the VT340 and the VT525.