dankamongmen / notcurses

blingful character graphics/TUI library. definitely not curses.
https://nick-black.com/dankwiki/index.php/Notcurses
Other
3.59k stars 112 forks source link

1.7.3-pre grid demo runs ~4x slower following mojibake #1012

Open dankamongmen opened 4 years ago

dankamongmen commented 4 years ago

In recent 1.7.3-pre (i.e. since #627's resolution), grid runs much, much more slowly than it used to -- iff the mojibake demo is run before it:

             runtime│ frames│output(B)│rendering│ %r│    FPS│TheoFPS║
══╤════════╤════════╪═══════╪═════════╪═════════╪═══╪═══════╪═══════╣
 0│     zoo│  11.41s│    647│   1.15Mi│ 432.46ms│  3│   56.7│  1.50K║
 1│    grid│   4.10s│    768│  77.94Mi│    3.91s│ 95│  187.4│ 196.29║
══╧════════╧════════╪═══════╪═════════╪═════════╪═══╪═══════╪═══════╝
             runtime│ frames│output(B)│rendering│ %r│    FPS│TheoFPS║
══╤════════╤════════╪═══════╪═════════╪═════════╪═══╪═══════╪═══════╣
 0│mojibake│  20.73s│    910│   4.56Mi│    1.10s│  5│   43.9│ 828.05║
 1│    grid│  33.86s│    768│  77.00Mi│   33.62s│ 99│   22.7│  22.85║
══╧════════╧════════╪═══════╪═════════╪═════════╪═══╪═══════╪═══════╝
             runtime│ frames│output(B)│rendering│ %r│    FPS│TheoFPS║
══╤════════╤════════╪═══════╪═════════╪═════════╪═══╪═══════╪═══════╣
 0│   intro│   3.05s│    225│   3.59Mi│ 636.76ms│ 20│   73.7│ 353.35║
 1│   eagle│   5.43s│    274│  13.43Mi│ 600.52ms│ 11│   50.4│ 456.27║
 2│    grid│   4.31s│    768│  77.91Mi│    4.12s│ 95│  178.2│ 186.27║
══╧════════╧════════╪═══════╪═════════╪═════════╪═══╪═══════╪═══════╝
dankamongmen commented 4 years ago

it's the damndest thing:

             runtime│ frames│output(B)│rendering│ %r│    FPS│TheoFPS║
══╤════════╤════════╪═══════╪═════════╪═════════╪═══╪═══════╪═══════╣
 0│    grid│   4.11s│    768│  76.48Mi│    3.91s│ 95│  186.8│ 196.17║
 1│    grid│   3.91s│    768│  76.21Mi│    3.70s│ 94│  196.6│ 207.41║
 2│    grid│   4.33s│    768│  75.98Mi│    4.12s│ 95│  177.4│ 186.50║
 3│mojibake│  20.69s│    904│   9.51Mi│    2.80s│ 13│   43.7│ 323.06║
 4│    grid│  17.94s│    768│  74.80Mi│   17.66s│ 98│   42.8│  43.50║
 5│    grid│   6.59s│    768│  74.62Mi│    6.32s│ 95│  116.6│ 121.49║
══╧════════╧════════╪═══════╪═════════╪═════════╪═══╪═══════╪═══════╝
              57.56s│   4744│ 387.60Mi│   38.51s│ 66│   82.4│
dankamongmen commented 4 years ago

I reordered them because grid running before mojibake makes the latter look much cooler than it does following box, but we still want to get to the bottom of this for sure.

dankamongmen commented 4 years ago

this behavior doesn't happen using kitty :/. kitty also reproduces the combining EGCs of mojibake without any problem (see #1011) -- could the two possibly be related?

dankamongmen commented 4 years ago

check this out: in xfce4-terminal (where i'm seeing this slowdown), it only happens on an 80-column terminal. make your terminal wider than it is tall, and the problem goes way. WUT.

dankamongmen commented 4 years ago

i'm thinking this calls for git bisect. i look forward to figuring out what's going on here!

dankamongmen commented 4 years ago

As suspected, the move to libunistring correlates with this behavior in performance. Well, that's what I idly suspected, but it still makes not a single bit of sense, ugh.

[grimes](0) $ git bisect good
301354a9ba843929ba8224d7c29bb9ba338b22d6 is the first bad commit
commit 301354a9ba843929ba8224d7c29bb9ba338b22d6
Author: nick black <dankamongmen@gmail.com>
Date:   Thu Sep 10 03:31:23 2020 -0400

    utf8_egc_len: rewrite using uc_is_grapheme_break() #882

 src/lib/egcpool.h | 44 ++++++++++++++++++++++----------------------
 tests/cell.cpp    | 32 ++++++++++++++++----------------
 2 files changed, 38 insertions(+), 38 deletions(-)
[grimes](0) $ 
dankamongmen commented 4 years ago

are we possibly running more loops in utf8_len_egc()? we shouldn't be, and the perf output lays all the blame at SYS_write(), but hrmmm.....

dankamongmen commented 4 years ago

nope, iteration patterns look pretty much the exact same. what the hell?

we're even writing the same scenes, are we not? well, no, we can't be, since one generates a bunch of crap and one doesn't...how is the different EGC lexing leading to that situation?

i suspect if we yank the two problematic planes out of mojibake, this goes away. let's see.

dankamongmen commented 4 years ago

yep.ye

nope, iteration patterns look pretty much the exact same. what the hell?

we're even writing the same scenes, are we not? well, no, we can't be, since one generates a bunch of crap and one doesn't...how is the different EGC lexing leading to that situation?

i suspect if we yank the two problematic planes out of mojibake, this goes away. let's see.

nope, iteration patterns look pretty much the exact same. what the hell?

we're even writing the same scenes, are we not? well, no, we can't be, since one generates a bunch of crap and one doesn't...how is the different EGC lexing leading to that situation?

i suspect if we yank the two problematic planes out of mojibake, this goes away. let's see.

as expected, if we disable person and subdivision-flags, grid following mojibake returns to full speed.

so....why?

dankamongmen commented 4 years ago

I doubt this has anything to do with it, but mbswidth() is surely broken for emoji of this type (combined glyphs closed over ZWJs), since it operates strictly codepoint-by-codepoint...

dankamongmen commented 4 years ago

well, with the change for #1011 in, this behavior can no longer be forced, whatever the demo order. I am sure, however, that it would come roaring back were we to reenable those two offending groups. I am at a total lack for an explanation as to what's causing this. It's like printing those EGCs shocks libVTE (remember, this never happened on alacritty/kitty). With those two removed, allglyphs similarly has its runtime cut by about 50% on VTE. Mysterious! Leaving this open until we get to the bottom of it.

dankamongmen commented 3 years ago

Verified that this still happens if we reenable the "person" and "subdivision-flags" blocks on mojibake:

             runtime│ frames│output(B)│rendering│%r│%w│    FPS│TheoFPS║
══╤════════╤════════╪═══════╪═════════╪═════════╪══╪══╪═══════╪═══════╣
 0│    grid│   6.37s│    768│ 128.28Mi│ 101.21ms│ 1│93│  120.5│  7.59K║
 1│    grid│   6.00s│    768│ 127.91Mi│ 102.66ms│ 1│92│  128.0│  7.48K║
 2│mojibake│  13.75s│    961│  15.53Mi│ 256.52ms│ 1│20│   69.9│  3.75K║
 3│    grid│  45.84s│    293│  49.18Mi│  63.01ms│ 0│99│    6.4│  4.65K║ABORTED
 4│    grid│   0.00s│      0│     0.00│    0.00s│ 0│ 0│    0.0│   0.00║SKIPPED
══╧════════╧════════╪═══════╪═════════╪═════════╪══╪══╪═══════╪═══════╝
              71.96s│   2790│ 320.90Mi│ 523.39ms│ 0│83│   38.8│
[schwarzgerat](0) $ 
WSLUser commented 3 years ago

It may help if you spin up an Ubuntu 18.04 instance and use the new ProcMon from MS for Linux perf gathering. It should hopefully illuminate pain points more easily for you including this one: https://github.com/microsoft/ProcMon-for-Linux

esotericist commented 2 years ago

something something necromancy mumble mumble

but i just experienced this bug on xubuntu 20.04 (albeit in mate terminal, rather than whatever xfce uses), and it was quite astonishing.

particularly since the FIRST time i attempted the demo set (which was just a few days ago, as of my request for updating INSTALL.md), i didn't have this problem. most recently, i rebuilt without libdeflate just for funsies just to see how resilient the cmake stuff is with missing libs that aren't strict requirements (i'm still learning to work with it, and seeing how projects behave is useful)...

so anyway, i ran the full demo set, and when it hit grid and basically halted, i was quite startled. it takes me an alarming 178.9s (give or take a couple seconds, but i did get that value twice) to run grid in a full demo. but, weirdly, running JUST mojibake -> grid doesn't exhibit this. instead, whiteout -> grid is my semi-reliable trigger condition.

no idea on if this is useful, but figured i should provide the data. (it's worth noting that whiteout does have some substantive combining errors for me similar to what i experience with parts of mojibake, although i'm not able to tell what glyphs are the problem)