Closed dankamongmen closed 3 years ago
Ahhh, I think the performance advantage is coming from damage tracking internal to the media. That makes total sense. Hrmmmmm. That will not be trivial to fit into our model, though...I think we needed something like this for #1360, right?
Alright, so if we brought the previous bitmap into `sixel_blit`, we'd leave out any pixels which matched in their sixelspace reduction. Makes sense.
but that can't be all. We'd expect our first frames to look roughly equivalent, but they are not. Hrmm, let's limit comparison to the first frame...
ahhh yes, i noticed this the other day -- we emit a band for every color, even if the color isn't yet used. of course. let's strip those and see how things improve.
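The fix amounts to a pre-emission check. A Python sketch over a hypothetical `{color_register: expanded_band_data}` map (my own representation for illustration, not the actual notcurses encoder, and assuming the data is already RLE-expanded):

```python
def emit_bands(bands):
    """Emit one 6-row sixel strip from a {color_register: data} map.

    A band that is all '?' (sixel value 0) paints no pixels, so writing
    it out only wastes bytes; skip it entirely.
    """
    out = []
    for color, data in sorted(bands.items()):
        if all(ch == '?' for ch in data):   # color unused in this strip
            continue
        out.append('#%d%s$' % (color, data))
    if out:
        out[-1] = out[-1][:-1] + '-'        # final band ends the strip
    return ''.join(out)
```

For example, `emit_bands({1: '???', 2: '?A?'})` drops color 1's band entirely and emits only `#2?A?-`.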
@dnkl i've cut out empty bands, and it cut the total output per frame by about 40%. we still emit longer chunks than `mpv`, because `mpv` is using more colors, so it has shorter chunks of the same color. our total line lengths ought be about the same -- i'm looking into this now. either way, a big win in terms of output size, thanks for the heads-up!
erp...damage tracking in this way wouldn't play nicely with moving the image :(
for our first band on the first frame of `fm6.wmv`, i get
#32!362?A!27?A!55?B@!33?_?K[Kkk!15{w!14o!12?F!9?@!5B@!18?@BB@!3B!9F!3B@BB@!4?B!5~nNnF!17B!12F!8B!4?FN!4^!136~B!11nf!13?u!46~!10^!252~$
followed by
#33}!461?G[WG!8WG[!7^RB!3R!15BF!14N^!11NWF!3?BB!3FE!5CE!13F!5BACCE!3C?!8G!3CECCE@??F[!5?!3OG!17C!3?!8G?!8C!4F!142?{!11OW!13~H$
are these the same length?
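One way to answer that is to expand the run-length encoding and count data columns. A throwaway decoder sketch -- the `!<n><char>` repeat introducer and `#<n>` color introducer follow the sixel format itself, not any notcurses internals:

```python
def band_length(band):
    """Pixel-column length of one sixel band, expanding '!<n><char>'
    RLE repeats and skipping '#<n>' color introducers and '$'/'-'."""
    i, length = 0, 0
    while i < len(band):
        ch = band[i]
        if ch in '#!':
            j = i + 1
            while j < len(band) and band[j].isdigit():
                j += 1
            n = int(band[i + 1:j])
            if ch == '!':               # repeat the next data char n times
                length += n
                j += 1                  # consume the repeated char
            i = j
        elif '?' <= ch <= '~':          # one data char = one column
            length += 1
            i += 1
        else:                           # '$', '-', etc.
            i += 1
    return length
```

Running this over the two bands above would settle the question directly; e.g. `band_length('#32!362?A')` is 363.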
ok i don't understand what `mpv` is doing exactly. here's some sample output:
#249!5?H#116G#255???A@#159?@#194??A!27?A?@$#151!255?!255?!80?A
there seem to be bands of very different length here. `#116G` is what, the very first column? really? don't you need '$' between colors? no?
`mpv` is just using `libsixel`, so let's take a look at that.
finding out some very interesting things about libsixel...it seems to emit colors which it never uses? and it definitely repeats colors within a band...very interesting.
here's a big-ass chunk of libsixel's first encoded frame
#103A~Vlj\V|j\vlU|f|J}t^j}V|vmZ|Vmv\vi^uLzmZulV\f|J|vLzulV|Uj|VlyVlztmZvmtnZu^tnYnT~Uj|nQ|E~ShuLzU~TnYnlZvlv\RxiNzmDvU|R~UlraviXSLXv~iLfqvYHt^Tb]TlOlitgtgtlst\vs~d\XoeGThPUHzawfwhpDhITTLoLdHk`TWJPEpLoxDWlplTHThAlUjiTk@U@SdIDwDh?RGbGfGdCbC@GHSDNoMOdO@s@tIhFGTlHuTeHYhQHYJAPcHcXQHcHq`ScHCGB?]`[`ChAj?@?B_`CB?`O?D?dAW@_?sGM_CONoQcP_K?`O_??K`GSi?Rc@AgP?OdIO`?S@IL?SCHO?hQKP?CdOk@W_L?_D?KPCO@CAGA@_???_?OC?S_O?O_?O?__G???aC?CGA??G?_I_?IC??S?O!6?acA_a?O_A_!4?EG@???K?_E???G?CG?q?IG?b?BS@?Q?t@a???O_?O?oGcG__G???gCGC?CGcWCW!4?G???AO?a???O??Q?O??S???C?SO?AG!4?O???D??H?CAgACIGOG_@?s?S@KBO??C!4?BG@Q!4?@GAg@GA?`I@AHAC?SOC??W_CC?C!6?G??O!4?GA?A?E?AOA?ACI?A??@?GP_?@??_?_P??@$
#67T?gQCagASAGQhAWAs@I_S@gAGPcAgPGaGT_HqCPcHQgAWAsAGqCHQgAhSAgQDgQCIPcGPIOcH_IOdOi?hSAOl?x?JSHqCh?iOdOQcGQGaKAT_CPi?hAc?hAGTG@cbOeG?TaWDGdOI_IS`GQd?T?TAT?Q@I_?H?OaCJOtASi?s?\AODQKqS`g_OFoGq@EGdOcXCaLAOf?IOaSaSgQ_OC_OigSgQOgBOEsGeGo?QGQWa[aSbO_D_HQKi@]?dO_T_?Q?GOAcCGe?cWeGUPEgUPeGQHOaXCWp?I_CGQGSgCoCIO_Wa[@eOKOT_KRcH_@WP_?FGXELOHSLYD}@W`ADyCXSk@MwDGShE{@{c_ZAhQCjSC`Ish?G@scHOk@wGpCWdWrkTgsIdItGTgBW`GDiDAlaDQDQdQdO`W`O`S`SDY@SJO@YdIPETGTATG@?TGTIDIPKbChO`SiSHsBOJOHQhQdQPc@KPU@[?g?s`CTAGPiPeHU`E`CP?DIDQlOlO@oH_P?P?P?tGt?tOLQl_FWDiDaLY`Id?d?d?t?P?dGdOd?lODgAS?OcO_O?@O@?`A?OCa?A_?O??ePa@_XcO_OcGtISgQ_CO_Gs?OcW?S_X?DOdOD?Pg@gPkpEtAta\AdAlQDGtGtGpK`S`S`OdGtAc?a?C?cOCPC?C_Y$
#66g
#109???O
#73!4?_
#109!33?_
#66!41?A??_
#102A
#66!15?_??O???G??G??OC??S?G
#102a!4?O?G
#66??A!5?A?GQ?I?G?I?G?AGA?I?_???G??HA???G!5?CAI?G?AC???A?A??O??G??C?G???@?H?PG@?@?@?d?Cg?@?P?BGd?`?@?@?GI?IOE?@??_?OEGAGQc@a@o@?`O@?@?@_?_@??O@??HC?@?A`?@Q`?`?@OH_PC@?@?_O?@?_?O?OAS_C?CoGc?_?_E???_?_COg?@_?A@O_@A?`??@?A@?C_?c`O?HOc@AGdOI@Q?BCAd?GB?E???@???O?A_??cAS_?_OO?g?O??G?KOC??[?G?_??G_?GC??C?_?a?_aSO???__?_?OGOA?@?@???c??c??C??CO??_?_?AS@G?G_G_??CO??SGOGCO_?O!4?C?C?SGc?CG!4?G?C!4?O?_OOG?_?C?WOGOG?GG???O!6?OCHaC@ACHgOGgOC_O@O?GSHi?SHOG?cI?AGA?@???@?@G@GDO@A?A??@!4?gAG_CA?i?c??W?G???_O_??O_!7?g??O?G_?G?hC_XoI@A?qO?H?_$
#245!106?C!4?O!8?G
#8??A
#245!11?c??_G?_
#102?A???A?A
#245??A!8?C@?_!6?@
#8!5?A!9?G_!4?A???_?O?A?A!9?C
#61!13?C
#8???G?G?O
#60!8?_
#245!6?O_
#60???G!4?O???O
#245_
#96G??G???G
#61??C?A??a?C!8?A?A???I?C??G?a??A!6?A?I!6?A_!6?AC
#97!6?O??A
#61!6?A
#96!7?A
#97!5?A!5?A???A!5?A!7?O???A!4?A??G!9?G!6?G?_
#96!4?O???G?A
#97??_?_AO
#96!5?OA!6?_?A?G???A???G?_Q?_G??C@!8?A!4?A?C?A!8?C?G
#97!5?G_!7?G!9?O?G!5?G?G?G?G
#96!6?_!4?A??A?A
#243!13?A
#96???AO!4?a?A?_G!5?GDA?_@??O@?C
#102!5?PA?A???A??O?OA?C!8?G!5?_??G?C!8?OG??C?C??G?GA?AC$!160?A?GA??C_?C?C?ACAG?O??a?AO?Q?A?c?_G?ACA?A???_GAS!4?A?O?i?G?Q?AOIc?S?O?Q??OaCa?_O?@_GASI?I!4?A!7?AC!5?Q?A?C_A?CGA?Oa?C!5?O???CaCG?IO?GAK@iOA?DAC@?@?@I@??@?GA?OG?CG@A?A???G?G?C??G??CGAG???_!4?C!4?GO?A???C??G?OA??G?GO?S?S?C?C??_!10?G??O_???AO??_E!6?_?_???G??G?G@G??@!4?G??AC?O??ACA???G?c???_??C@??S?G?_AS@??CGC???G??C???O!4?cA?O?a??OA???Ic?SA!9?a???G??c?_Ao?oAG?_??QGO?G?_?OGO?Q?@???_?CaC?O!4?_?C@?OGOC???CG`AO_???O
#61?A??G???_???A!5?G??OA?A!8?GO??_!9?G!7?O
#97!4?G?A
#61???G
#243??_$
#8!165?A
#61!21?O
#96???G
#60!17?C
#97??A???A
#60!15?C
#8!25?GC!4?c?C
#60!11?_!7?O?G??C!16?G
#96!6?O!6?A
#60?O???_!7?O??A_?A??_Q?AO?_??Q?GQ?G?A_A!9?_!6?_!4?A?A!6?CO!4?O!4?O?G_G???A?GC?A??G?cA?CA??_G_G!4?C?A_AG?GOCA!4?G!4?G?GC_A!4?G???a??I?`!7?A!4?_!5?_G??A_!4?g!5?G_?c!8?OAG?A???_?CO???A???AC?GQ?EG??G_A@?G@I?H?CAcACQG@G_@Ah?CBGA??G!4?_@I?Q?_?ACOAc??A?dG@EGIc?g!4?O_?C!6?_AG_?A??GA?AOA_E_I?A_ICAOG??COI?A?G_G?@aC?C$
#8!255?!27?O???C???CO?O!6?cO!5?G??_?G!4?_!7?@??A@??O??@?C!6?C!8?O??@
#61!10?Q?A???A?O?A?A
#96!5?_???A?I!4?AG???A?AGA?A?A?A?G
#61?A!4?A!6?_!6?A?G???G
#97!8?GO!5?_??_???O!10?_??S?G?G_G?O??G???G?G
#96!24?_?G?C!7?CA
#61!4?A???A??_A!5?Q?A???_!5?A!7?CA?G!5?_!4?OC???C
#97!24?AGA?A?A!6?G
#96!25?A??A!7?A$
#245!255?!28?_G
#97??A?A!4?G?OC!5?A!7?GA?A!12?A??Q
#8!43?O!4?_??_?@??G@G`?PGBG@?@ID?L?D?D?D?DO@?`?b?@O@QDO@C@O@Q@_@AT?T???@GS?D?D?PC`?H?_S?o@O@O@?@Q@G@A@?O!14?P_HA@?`A@A@ADG@Q@O@A@O@A@?@a@?PA@?`Qh?L_DGD?@OD?`A@A@?DA@?`ADG@?dAd?DO!4?_?_?C@?@?@??GC@!5?_?C@?@?PCO_?C??GS?C?A?cGO?OcO_??h?@OD?@G@?@?PA`?T?T?L?DOd?T?DGP?P?@SH?`OD?d?O@?A@OcPC@CGO@Y$
#61!255?!196?AC!4?A!7?C!4?AG!7?A?A!4?C!8?A?A?O???A?A?A?A?A?A?A?A?A?A_
#97!33?G??_?C??O??O!14?G!4?A
#96!29?_!6?G?G?A???A???A???G
#243!15?A-
Interestingly, the libsixel/mpv output appears to be much larger than our own now.
we seem quite a bit faster now than we were, running e.g. `ncplayer -bpixel -snone ../data/fm6.wmv`. our color selection still sucks ass.
i'm thinking of a new approach for colors (assuming we don't proceed with the idea in #1380):
this is a 64KiB table assuming 4x32-bit components, totally doable.
trying to take this to 32x32x32 would require half a meg, but would let us step in units of 8
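The sizing arithmetic, assuming the four 32-bit components are something like per-channel sums plus a population count (my guess at the layout; the comment doesn't spell it out):

```python
ENTRY_BYTES = 4 * 4   # four 32-bit components per table entry

def table_bytes(steps_per_channel):
    # a cube of steps^3 entries covering RGB space
    return steps_per_channel ** 3 * ENTRY_BYTES

# 16x16x16 (stride 16 per 8-bit channel) -> 65536 bytes, i.e. 64KiB
# 32x32x32 (stride 8 per 8-bit channel)  -> 524288 bytes, i.e. half a meg
```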
I'm staying on my current notcurses version for a little while longer, while doing some sixel optimizations in foot. After that I'll be happy to see how the latest ncplayer is faring in foot :)
i took a look at `matchColor()` in @klamonte's Jexer. she's using an interesting method that makes use of HSL, as opposed to keeping things in RGB -- i'm not very familiar with HSL's place in color theory. is it a more easily searched space? either way, `hslColors` is set up in `makePalette()`. it's set up with what look to be linear gradations based on the palette size.
ahhh, after reading up on HSL, yes, its value here is clear. we basically get our saturation and lightness for free, and then just have to find the closest hue. either way, this is a prepartition-and-match algorithm, very similar to what i'm describing above, except i'm proposing finer partitioning, and backfitting of the determined values. what does libsixel do?
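A rough sketch of the prepartition-and-match idea in those terms -- my own reconstruction, not Jexer's actual `matchColor()`/`makePalette()` code. (Python's stdlib `colorsys` uses HLS ordering rather than HSL.)

```python
import colorsys

def make_palette(n):
    # linear gradations across hue at fixed saturation/lightness --
    # a stand-in for the kind of palette makePalette() builds
    pal = []
    for i in range(n):
        r, g, b = colorsys.hls_to_rgb(i / n, 0.5, 1.0)
        pal.append((int(r * 255), int(g * 255), int(b * 255)))
    return pal

def match_color(rgb, pal):
    # saturation and lightness come nearly "for free"; matching is then
    # dominated by a one-dimensional (circular) search over hue
    h, l, s = colorsys.rgb_to_hls(*(c / 255 for c in rgb))
    def dist(entry):
        eh, el, es = colorsys.rgb_to_hls(*(c / 255 for c in entry))
        dh = min(abs(h - eh), 1 - abs(h - eh))   # hue wraps around
        return dh + abs(l - el) + abs(s - es)
    return min(pal, key=dist)
```

The payoff is structural: two of the three axes are cheap to bucket, so the search space collapses compared to a full RGB nearest-neighbor scan.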
it looks like libsixel uses a wholly static palette to start with (take a look at `pal_xterm256[]`), but then extracts one dynamically with `sixel_quant_make_palette()` using median cut, which is pretty canonical and a strong approach.
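For reference, median cut in miniature -- a generic sketch of the algorithm, not `sixel_quant_make_palette()` itself:

```python
def median_cut(pixels, n_colors):
    """Classic median cut: repeatedly split the box with the widest
    channel range at its median, then average each box into a palette."""
    def spread(box, chan):
        vals = [px[chan] for px in box]
        return max(vals) - min(vals)

    boxes = [list(pixels)]
    while len(boxes) < n_colors:
        box = max(boxes, key=lambda b: max(spread(b, c) for c in range(3)))
        if len(box) < 2:
            break                      # nothing left to split
        chan = max(range(3), key=lambda c: spread(box, c))
        box.sort(key=lambda px: px[chan])
        mid = len(box) // 2
        boxes.remove(box)
        boxes += [box[:mid], box[mid:]]
    return [tuple(sum(px[c] for px in b) // len(b) for c in range(3))
            for b in boxes]
```

E.g. four pixels clustered near black and near white, cut to two colors, yield one dark and one light centroid.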
octree is a valid method.
both the algorithm i proposed, and that used by Jexer, are popularity algorithms. median cut seems a better approach, though perhaps it ought be united with an HSL conversion to get the best of both worlds.
ahhh, libsixel first calculates a histogram to approximate the number of colors, and if they're few enough, doesn't generate an approximate palette. so it gets the exact palette in that case, just like we do.
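That shortcut looks like the following (a sketch, with 256 standing in for the register limit):

```python
from collections import Counter

def palette_for(pixels, maxcolors=256):
    """If the image already fits in the palette, use its exact colors;
    only fall back to an approximate quantizer when it doesn't."""
    hist = Counter(pixels)
    if len(hist) <= maxcolors:
        return list(hist)            # exact palette, zero quantization error
    return None                      # caller runs median cut / etc.
```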
a kohonen neural net might also be a good method
So when evaluating median cut v octrees v Kohonen nets, we desire:
file:///home/dank/Downloads/Kohonen_neural_networks_for_optimal_colour_quantiz.pdf is the best read i've found on Kohonen-based quantization
what if we had a constant MAXCOLOR-size array, and tracked the closest elements as we inserted, and just merged closest when full on a miss? there would be a lot of divisions....
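The merge-on-fill idea, sketched with invented names (hypothetical; the quadratic closest-pair scan below is exactly the cost that makes this painful per-pixel once the table is full):

```python
def add_color(table, rgb, maxcolors=256):
    """Fixed-size palette: on a miss with a full table, merge the two
    closest existing entries to make room for the new color."""
    if rgb in table:
        return
    if len(table) < maxcolors:
        table.append(rgb)
        return
    # O(n^2) scan for the closest pair: fine once, brutal per pixel
    best = None
    for i in range(len(table)):
        for j in range(i + 1, len(table)):
            d = sum((table[i][c] - table[j][c]) ** 2 for c in range(3))
            if best is None or d < best[0]:
                best = (d, i, j)
    _, i, j = best
    merged = tuple((table[i][c] + table[j][c]) // 2 for c in range(3))
    table[i] = merged
    table[j] = rgb
```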
it would appear that the Rust `color_quant` crate uses Kohonen https://docs.rs/crate/color_quant/1.1.0
so earlier i tried merge-on-fill, and it broke down due to the cost of searching the entire table for the best merge, which we had to do for possibly every pixel once the table was full. what if instead we kept the merge range, and thus sorted quickly, usually skipping the walk?
no, attempts to further advance such simplistic algorithms are cowardice. let's push things forward.
Kohonen definitely appears to require a second pass against the computed palette, using a "manhattan distance" :( :( :(
Kohonen experiment: i'm training the neural net and emitting what looks to be a palette. now need to back-substitute and see if it looks any good.
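The shape of the experiment, as a toy: a tiny 1-D self-organizing map over RGB, with made-up parameters and squared (not Manhattan) distance -- nothing like the real NeuQuant-style tuning, just the train-then-back-substitute structure:

```python
def train_som(pixels, n=16, epochs=2, lr=0.3, radius=2):
    # neurons start as a gray ramp; training pulls them toward the
    # pixel distribution
    net = [[i * 255 // (n - 1)] * 3 for i in range(n)]
    for _ in range(epochs):
        for px in pixels:
            # winner: the neuron nearest this pixel
            w = min(range(n),
                    key=lambda i: sum((net[i][c] - px[c]) ** 2 for c in range(3)))
            for i in range(max(0, w - radius), min(n, w + radius + 1)):
                f = lr / (1 + abs(i - w))       # neighbors move less
                for c in range(3):
                    net[i][c] += int(f * (px[c] - net[i][c]))
    return net

def remap(pixels, net):
    # the dreaded second pass: match every pixel against the trained palette
    return [min(net, key=lambda e: sum((e[c] - p[c]) ** 2 for c in range(3)))
            for p in pixels]
```

Note that the second pass is where most of the time goes, exactly as observed below.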
initial Kohonen results include a 4x slowdown and the following garbage:
i'm pretty sure i'm fucking something up in the interchange, so i'll dick around with this for a minute. i'm sure we can improve the speed significantly. note that of the ~0.25s (up from ~0.075s), the majority of that is matching following the training, which is embarrassingly suboptimal even discounting a double pass.
i bet by the end of the day i can have this as fast as our original solution, but generating much better image quality. the goal will then be to get it ~4x faster than our original solution, and thus hit a goal of ~0.020s. tally-fucking-ho!
interestingly enough, `chunli23.png` looks correct (when using `-snone`; using `-sstretch` is another matter entirely). that's somewhat worrying; it suggests i'm not fucking up the interchange, and when we feed < 256 colors, things are right. but in that case, `warmech.bmp` at `-snone` would look correct, which it does not. hrmmm!
`worldmap.png` seems almost fixed up. `aidsrobots.jpeg` looks worse than when we started. `covid19.png` has gone to hell.
I'm afraid I don't have much to contribute with here, but be sure I am following this with great interest :D
i'll be reviewing your PR later! well, i imagine just merging it, but opening up the headroom for it either way. sorry about not getting to it already, just came out of bed determined to fuck up some Kohonen neural nets.
with another bug fixed in the interchange, `worldmap.png` looks just about perfect, and `covid19` is improved. `aidsrobots.jpeg`, the one we had the most problems with in our original solution, still looks like ass.
`-sstretch` with `worldmap.png` looks right. with `chunli23.png` and `warmech.bmp`, it's definitely still wrong, but it's greatly improved from a few minutes ago
`warmech.bmp` natural bias table:
BIAS 0 0 0 SRC 0 0 0
BIAS 1968 1968 1968 SRC 123 123 123
BIAS 4080 1968 1440 SRC 255 123 90
BIAS 4080 4080 4080 SRC 255 255 255
the bias table for `-sstretch` is 34263 entries on 110x83, fuck me in the ass
[schwarzgerat](1) $ ./ncplayer -bpixel ../data/warmech.bmp -k 2>&1 > /dev/null | grep BIAS | sort -u | wc -l
34263
[schwarzgerat](0) $
it drives about as wide as one might expect (random sample):
BIAS 0 0 0 SRC 0 0 0
BIAS 4064 4064 4064 SRC 254 254 254
BIAS 3744 1792 1296 SRC 234 112 81
BIAS 1872 768 496 SRC 117 48 31
BIAS 0 0 0 SRC 0 0 0
BIAS 576 576 576 SRC 36 36 36
BIAS 0 0 0 SRC 0 0 0
BIAS 176 176 176 SRC 11 11 11
BIAS 336 336 336 SRC 21 21 21
BIAS 2400 1168 864 SRC 150 73 54
BIAS 1952 2016 2032 SRC 122 126 127
BIAS 3264 3264 3264 SRC 204 204 204
BIAS 672 672 672 SRC 42 42 42
BIAS 0 0 0 SRC 0 0 0
BIAS 4080 4080 4080 SRC 255 255 255
BIAS 3808 3808 3808 SRC 238 238 238
BIAS 3248 1584 1168 SRC 203 99 73
BIAS 1840 1872 1888 SRC 115 117 118
BIAS 0 0 0 SRC 0 0 0
BIAS 2528 2400 2368 SRC 158 150 148
so anyway, bias tables appear to be getting properly constructed, no crossovers there: BIAS 3840 1808 1296 SRC 240 113 81
etc.
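For what it's worth, every BIAS/SRC row in these dumps satisfies BIAS = SRC * 16, i.e. the 8-bit source components shifted into an apparent 12-bit space:

```python
def bias(src):
    # 8-bit source component into the (apparent) 12-bit bias space
    return src << 4

# spot-check rows pulled verbatim from the dumps above
for b, s in [((4080, 1968, 1440), (255, 123, 90)),
             ((3744, 1792, 1296), (234, 112, 81)),
             ((3840, 1808, 1296), (240, 113, 81))]:
    assert b == tuple(bias(c) for c in s)
```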
now look at our neural table. lots of duplicate entries. also, shouldn't this be sorted throughout? it appears to only be sorted on the g index, and beyond that it's all over the place:
RGB[000]: 12 1 1
RGB[001]: 3 1 1
RGB[002]: 6 1 1
RGB[003]: 2 1 1
RGB[004]: 1 1 1
RGB[005]: 1 1 0
RGB[006]: 1 1 1
RGB[007]: 1 1 1
RGB[008]: 1 1 1
RGB[009]: 1 1 1
RGB[010]: 1 1 2
RGB[011]: 0 1 5
RGB[012]: 1 1 3
RGB[013]: 28 2 1
RGB[014]: 18 2 1
RGB[015]: 3 2 2
RGB[016]: 2 2 2
RGB[017]: 3 2 2
RGB[018]: 2 3 3
RGB[019]: 4 3 3
RGB[020]: 4 3 3
RGB[021]: 39 4 1
RGB[022]: 22 4 1
RGB[023]: 7 4 3
RGB[024]: 4 4 4
RGB[025]: 11 5 3
RGB[026]: 1 5 7
RGB[027]: 5 5 5
RGB[028]: 6 5 5
RGB[029]: 7 6 6
RGB[030]: 54 7 1
RGB[031]: 13 7 6
RGB[032]: 7 7 7
RGB[033]: 1 8 12
RGB[034]: 9 8 8
RGB[035]: 30 9 4
RGB[036]: 16 9 8
RGB[037]: 9 9 9
RGB[038]: 11 10 10
RGB[039]: 11 11 11
can that be right? need to look at `inxsearch()`. man, i hate all those redundant entries =[.
you know, `aidsrobots.jpg` looks good in the details in our original solution. it's just the wide swaths where we look like garbage, due to too wide of a mask. dithering might be able to fix that up well.
i'm not sure how long i can stand to sit here and look at AI haruspicy and Chaldean mysticism. how i fucking hate intelligent methods. a bunch of undebuggable filth. i kind of want to get back to the inversion-expansion algorithm. i think it has a real shot at being the fastest possible method, and we just have to avoid overly aggressive early merges. i think we can do so by using a centroid metric.
man, sixel can suck my dick. give me the kitty protocol any day. oh well, if libsixel can do it, i can certainly do it; let's forge on. i'm going to stash this work and move back to expansion-inversion, which, the more i think about it, is just octrees in reverse, allowing the speed of octrees in O(1) size and the ability to do a single pass. bah to neural nets. doctors and other wizards are forbidden! https://www.youtube.com/watch?v=siwpn14IE7E
`dankamongmen/sixel-quantize` has the Kohonen experiment sitting around, but will hopefully be vaporized soon. i've picked everything out of there that was generally applicable, and applied it to master. onwards with the E-I idea aka "relaxation method".
`dankamongmen/sixel-relaxation` will hold E-I work.
looking at the output of our old method, if i keep running sums and counts, we're getting some very mysterious results that might explain its problems:
RGB: 0 0 0 SUMS: 85 82 79 COUNT: 115993
RGB: 0 0 12 SUMS: 93 96 94 COUNT: 902
RGB: 0 12 0 SUMS: 77 88 106 COUNT: 12804
RGB: 0 12 12 SUMS: 103 87 81 COUNT: 50143
RGB: 0 12 25 SUMS: 89 99 111 COUNT: 9046
RGB: 12 0 0 SUMS: 112 110 107 COUNT: 139576
RGB: 12 0 12 SUMS: 109 99 90 COUNT: 45811
RGB: 12 12 0 SUMS: 110 89 104 COUNT: 2437
RGB: 12 12 12 SUMS: 110 119 137 COUNT: 11947
RGB: 12 12 25 SUMS: 121 131 140 COUNT: 10135
RGB: 12 12 37 SUMS: 145 144 141 COUNT: 134843
RGB: 12 25 12 SUMS: 125 128 126 COUNT: 611
RGB: 12 25 25 SUMS: 142 152 168 COUNT: 8462
RGB: 12 25 37 SUMS: 81 69 57 COUNT: 8240
RGB: 12 25 50 SUMS: 73 55 50 COUNT: 10845
RGB: 25 0 0 SUMS: 47 46 47 COUNT: 29373
RGB: 25 0 12 SUMS: 76 60 68 COUNT: 1350
RGB: 25 12 0 SUMS: 107 76 58 COUNT: 3078
RGB: 25 12 12 SUMS: 114 139 173 COUNT: 6586
RGB: 25 12 25 SUMS: 87 108 139 COUNT: 5525
RGB: 25 25 12 SUMS: 102 124 165 COUNT: 910
RGB: 25 25 25 SUMS: 154 163 169 COUNT: 11309
RGB: 25 25 37 SUMS: 175 185 198 COUNT: 5340
RGB: 25 25 50 SUMS: 185 196 200 COUNT: 10364
those sums don't seem to correlate with the rgb values at all (rgb is scaled to 100; sums are scaled to 255)
here's what we see if we use the true sums. very suspicious!
first go with E-I method is a definite improvement, just based off keeping sums, though it's not quite there:
improved: above; old: below
this basically improves the accuracy but not the fitting. this allows for perfect matches of e.g. `warmech.bmp` despite a high starting mask, though, and thus improves performance + quality. this cuts about 33% off execution time, and gets us close to 25ms / frame amortized. =]
oh yeah, this can play video just fine
ohhhhhhhhhhh yeahhhhhhhhhhhhhh, a very nice win
yeah we're now about as fast as libsixel, and though we don't look quite as good, we look way better than we did. let's go take a look at `ncplayer` vs `mpv -vo sixel` on some media...
so yeah this is looking very promising thus far. we've already picked up some improvement in accuracy, and are approaching performance goals. this is basically median cut with a starting accelerant. we distribute into 0xc0-masked octants, then relax based on some measure (something like lgCount + totalerr / count). when we relax, we go through the sixels and "unzip" them (think DNA anaphase) -- we can do this perfectly, since we already have sixels bound to the cleaved group. yesssss.
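A sketch of that flow, reconstructed from the description with invented names. Groups are keyed by (mask, octant) so a refined group can never collide with a coarse one, and the relaxation metric here is total L1 distance from the centroid (a stand-in for the `lgCount + totalerr / count` measure mentioned above):

```python
def octant(px, mask):
    # quantize each channel by the mask; 0xc0 keeps the top two bits
    return tuple(c & mask for c in px)

def partition(pixels, mask=0xc0):
    groups = {}
    for px in pixels:
        groups.setdefault((mask, octant(px, mask)), []).append(px)
    return groups

def centroid(pxs):
    return tuple(sum(p[c] for p in pxs) // len(pxs) for c in range(3))

def spread(pxs):
    # total L1 distance from the centroid: the relaxation metric
    cent = centroid(pxs)
    return sum(sum(abs(p[c] - cent[c]) for c in range(3)) for p in pxs)

def relax(groups, limit):
    # while there's palette room, "unzip" the worst-fitting group under a
    # finer mask -- its pixels are already bound to it, so the split is exact
    while len(groups) < limit:
        worst = max(groups, key=lambda k: spread(groups[k]))
        mask, _ = worst
        finer = mask | ((mask >> 1) & 0xff)   # 0xc0 -> 0xe0 -> 0xf0 -> ...
        if finer == mask:
            break                             # already at full precision
        for px in groups.pop(worst):
            groups.setdefault((finer, octant(px, finer)), []).append(px)
    return groups
```

Each relaxation either splits a group (growing the palette) or refines its mask, so the loop terminates once every group is tight or the color budget is spent.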
here's a good example of 0xf0 beating 0xc0 (0xc0 is on the right, 0xf0 is on the left):
as you can see, the improvements to our algorithm have done wonders for correctness, but we lose out on coverage for now. once relaxation is complete, i believe we'll be competitive, plus faster.
Just FWIW, with #1401 done, we're plenty fast with full RGB in kitty, and likewise fast enough with Sixel. if we can bring in the better quality with relaxation (which i started on earlier this evening), and keep up the speed, we'll be better than `mplayer -vo sixel` for sure. =]
alright, got a first pass working. definite improvement:
old:
new:
old on left, new on right
i suspect remaining difference to be due to dithering, though i haven't proven that out yet
@dnkl pointed this out in #1380, but it's worth its own issue. Get two xterms up, along with a recent `mpv` build featuring libsixel support. Run it alongside `ncplayer -bpixel`. `mpv`'s superiority is obvious (we beat it with cell output, though). Get to this level of output, in both speed and quality.