dankamongmen / notcurses

blingful character graphics/TUI library. definitely not curses.
https://nick-black.com/dankwiki/index.php/Notcurses
Other
3.43k stars 113 forks source link

race condition/hang during initialization on ms-terminal #2694

Open lotheac opened 1 year ago

lotheac commented 1 year ago

Hi, thanks for notcurses;

I'm seeing some hangs in notcurses initialization in terminfo interrogation phase. This doesn't happen every time, and using TRACE logging I'm unable to reproduce it, so I suspect it's a race condition.

Environment:

thread backtraces:

(lldb) process attach --pid 7305
bwarning: This version of LLDB has no plugin for the language "assembler". Inspection of frame variables will be limited.
Process 7305 stopped
* thread #1, name = 'notcurses-info', stop reason = signal SIGSTOP
    frame #0: 0x00007f85e3ef83c1 ld-musl-x86_64.so.1`__cp_end at syscall_cp.s:29
  thread #2, name = 'notcurses-info', stop reason = signal SIGSTOP
    frame #0: 0x00007f85e3ef83c1 ld-musl-x86_64.so.1`__cp_end at syscall_cp.s:29
Executable module set to "/home/lotheac/src/notcurses/build/notcurses-info".
Architecture set to: x86_64-unknown-linux-musl.
(lldb) bt all
* thread #1, name = 'notcurses-info', stop reason = signal SIGSTOP
  * frame #0: 0x00007f85e3ef83c1 ld-musl-x86_64.so.1`__cp_end at syscall_cp.s:29
    frame #1: 0x00007f85e3ef55ba ld-musl-x86_64.so.1`__syscall_cp_c(nr=202, u=<unavailable>, v=<unavailable>, w=<unavailable>, x=<unavailable>, y=<unavailable>, z=0) at pthread_cancel.c:33:6
    frame #2: 0x00007f85e3ef4ba9 ld-musl-x86_64.so.1`__timedwait_cp at __timedwait.c:24:6
    frame #3: 0x00007f85e3ef59c3 ld-musl-x86_64.so.1`__pthread_cond_timedwait(c=0x00007f85de4aa8c0, m=0x00007f85de4aa898, ts=0x0000000000000000) at pthread_cond_timedwait.c:100:9
    frame #4: 0x00007f85e3ef5ce7 ld-musl-x86_64.so.1`pthread_cond_wait(c=<unavailable>, m=<unavailable>) at pthread_cond_wait.c:5:9 [artificial]
    frame #5: 0x00007f85e3e42be3 libnotcurses-core.so.3`inputlayer_get_responses(ictx=0x00007f85de4aa010) at in.c:2800:5
    frame #6: 0x00007f85e3e753ac libnotcurses-core.so.3`interrogate_terminfo at termdesc.c:1156:15
    frame #7: 0x00007f85e3e55978 libnotcurses-core.so.3`notcurses_core_init at notcurses.c:1257:6
    frame #8: 0x0000562d1c512435 notcurses-info`main(argc=2, argv=0x00007ffc31798938) at main.c:499:26
    frame #9: 0x00007f85e3ebc9ca ld-musl-x86_64.so.1`libc_start_main_stage2(main=(notcurses-info`main at main.c:483:38), argc=2, argv=0x00007ffc31798938) at __libc_start_main.c:95:2
    frame #10: 0x0000562d1c513367 notcurses-info`_start + 22
  thread #2, name = 'notcurses-info', stop reason = signal SIGSTOP
    frame #0: 0x00007f85e3ef83c1 ld-musl-x86_64.so.1`__cp_end at syscall_cp.s:29
    frame #1: 0x00007f85e3ef55ba ld-musl-x86_64.so.1`__syscall_cp_c(nr=271, u=<unavailable>, v=<unavailable>, w=<unavailable>, x=<unavailable>, y=<unavailable>, z=0) at pthread_cancel.c:33:6
    frame #2: 0x00007f85e3ec0295 ld-musl-x86_64.so.1`ppoll(fds=<unavailable>, n=<unavailable>, to=<unavailable>, mask=<unavailable>) at ppoll.c:24:9
    frame #3: 0x00007f85e3e40789 libnotcurses-core.so.3`input_thread at in.c:2452:19
    frame #4: 0x00007f85e3ef608b ld-musl-x86_64.so.1`start(p=0x00007f85de474590) at pthread_create.c:203:2
    frame #5: 0x00007f85e3ef838e ld-musl-x86_64.so.1`__clone at clone.s:22

NCLOGLEVEL_VERBOSE output when the problem occurs:

get_tty_fd:482:returning TTY fd 3
send_initial_queries:521:sent 2852B
prep_special_keys:347:no terminfo declaration for kbeg
prep_special_keys:351:invalid escape: kbs (0x110380)
prep_special_keys:347:no terminfo declaration for kf0
prep_special_keys:347:no terminfo declaration for kent
prep_special_keys:347:no terminfo declaration for kclr
prep_special_keys:347:no terminfo declaration for kc1
prep_special_keys:347:no terminfo declaration for kc3
prep_special_keys:347:no terminfo declaration for ka1
prep_special_keys:347:no terminfo declaration for ka3
prep_special_keys:347:no terminfo declaration for kb2
prep_special_keys:347:no terminfo declaration for kbeg
prep_special_keys:347:no terminfo declaration for kcan
prep_special_keys:347:no terminfo declaration for kclo
prep_special_keys:347:no terminfo declaration for kcmd
prep_special_keys:347:no terminfo declaration for kcpy
prep_special_keys:347:no terminfo declaration for kext
prep_special_keys:347:no terminfo declaration for kprt
prep_special_keys:347:no terminfo declaration for krfr
prep_special_keys:347:no terminfo declaration for kBEG
prep_special_keys:347:no terminfo declaration for kBEG3
prep_special_keys:347:no terminfo declaration for kBEG4
prep_special_keys:347:no terminfo declaration for kBEG5
prep_special_keys:347:no terminfo declaration for kBEG6
prep_special_keys:347:no terminfo declaration for kBEG7
init_inputlayer:2541:spun up input thread
get_default_geometry:42:default geometry: 78 rows, 115 columns
prep_windows_special_keys:2044:added all windows special keys
prep_kitty_special_keys:1990:added all kitty special keys
prep_xtmodkeys:174:added all xtmodkeys
build_cflow_automaton:1777:added 0x7f85e3e3e810 via [E
build_cflow_automaton:1777:added 0x7f85e3e41510 via [<\N;\N;\NM
build_cflow_automaton:1777:added 0x7f85e3e414f0 via [<\N;\N;\Nm
build_cflow_automaton:1777:added 0x7f85e3e3d340 via [\N;\N;\Nt
build_cflow_automaton:1777:added 0x7f85e3e3e7d0 via [\Nu
build_cflow_automaton:1777:added 0x7f85e3e3e770 via [\N;\N~
build_cflow_automaton:1777:added 0x7f85e3e3e720 via [\N;\Nu
build_cflow_automaton:1777:added 0x7f85e3e3e6a0 via [\N;\N:\Nu
build_cflow_automaton:1777:added 0x7f85e3e3e830 via [\N;\N;\N~
build_cflow_automaton:1777:added 0x7f85e3e3e610 via [\N;\N:\N~
build_cflow_automaton:1777:added 0x7f85e3e3e5d0 via [1;\NP
build_cflow_automaton:1777:added 0x7f85e3e3e590 via [1;\NQ
build_cflow_automaton:1777:added 0x7f85e3e3e550 via [1;\NS
build_cflow_automaton:1777:added 0x7f85e3e3e510 via [1;\ND
build_cflow_automaton:1777:added 0x7f85e3e3e4d0 via [1;\NC
build_cflow_automaton:1777:added 0x7f85e3e3e490 via [1;\NB
build_cflow_automaton:1777:added 0x7f85e3e3e450 via [1;\NA
build_cflow_automaton:1777:added 0x7f85e3e3e410 via [1;\NE
build_cflow_automaton:1777:added 0x7f85e3e3e3d0 via [1;\NF
build_cflow_automaton:1777:added 0x7f85e3e3e390 via [1;\NH
build_cflow_automaton:1777:added 0x7f85e3e3e330 via [1;\N:\NP
build_cflow_automaton:1777:added 0x7f85e3e3e2d0 via [1;\N:\NQ
build_cflow_automaton:1777:added 0x7f85e3e3e270 via [1;\N:\NR
build_cflow_automaton:1777:added 0x7f85e3e3e210 via [1;\N:\NS
build_cflow_automaton:1777:added 0x7f85e3e3e1b0 via [1;\N:\ND
build_cflow_automaton:1777:added 0x7f85e3e3e150 via [1;\N:\NC
build_cflow_automaton:1777:added 0x7f85e3e3e0f0 via [1;\N:\NB
build_cflow_automaton:1777:added 0x7f85e3e3e090 via [1;\N:\NA
build_cflow_automaton:1777:added 0x7f85e3e3e030 via [1;\N:\NE
build_cflow_automaton:1777:added 0x7f85e3e3dfd0 via [1;\N:\NF
build_cflow_automaton:1777:added 0x7f85e3e3df70 via [1;\N:\NH
build_cflow_automaton:1777:added 0x7f85e3e3d2c0 via [?\Nu
build_cflow_automaton:1777:added 0x7f85e3e3eaf0 via [?1016;\N$y
build_cflow_automaton:1777:added 0x7f85e3e3ea70 via [?2026;\N$y
build_cflow_automaton:1777:added 0x7f85e3e41610 via [\N;\NR
build_cflow_automaton:1777:added 0 via [?1;1S
build_cflow_automaton:1777:added 0 via [?1;2S
build_cflow_automaton:1777:added 0 via [?1;3S
build_cflow_automaton:1777:added 0 via [?1;3;S
build_cflow_automaton:1777:added 0 via [?1;3;0S
build_cflow_automaton:1777:added 0 via [?2;1S
build_cflow_automaton:1777:added 0 via [?2;2S
build_cflow_automaton:1777:added 0 via [?2;3S
build_cflow_automaton:1777:added 0 via [?2;3;S
build_cflow_automaton:1777:added 0 via [?2;3;0S
build_cflow_automaton:1777:added 0x7f85e3e41530 via [?6c
build_cflow_automaton:1777:added 0x7f85e3e40280 via [?7c
build_cflow_automaton:1777:added 0x7f85e3e40280 via [?1;0c
build_cflow_automaton:1777:added 0x7f85e3e40280 via [?1;2c
build_cflow_automaton:1777:added 0x7f85e3e40280 via [?4;6c
build_cflow_automaton:1777:added 0x7f85e3e3f8a0 via [?\N;\Dc
build_cflow_automaton:1777:added 0x7f85e3e3d250 via [?1;0;\NS
build_cflow_automaton:1777:added 0x7f85e3e3d1c0 via [?2;0;\N;\NS
build_cflow_automaton:1777:added 0x7f85e3e3f340 via [>83;\N;0c
build_cflow_automaton:1777:added 0x7f85e3e3f4d0 via [>\N;\N;\Nc
build_cflow_automaton:1777:added 0x7f85e3e3ea20 via [=\Sc
build_cflow_automaton:1777:added 0 via P0+\S
build_cflow_automaton:1777:added 0x7f85e3e3fad0 via P1+r\S
build_cflow_automaton:1777:added 0x7f85e3e3ec90 via P!|\S
build_cflow_automaton:1777:added 0x7f85e3e3f120 via P>|\S
build_cflow_automaton:1777:added 0x7f85e3e3d170 via _G\S
build_cflow_automaton:1777:added 0x7f85e3e3f040 via ]10;rgb:\S
build_cflow_automaton:1777:added 0x7f85e3e3ef60 via ]11;rgb:\S
block_on_input:2462:poll returned 1
block_on_input:2480:got events: tI
read_input_nblock:2091:read 8B from 0 (1016B left)
block_on_input:2382:nonblocking read to check for completion
block_on_input:2462:poll returned 1
block_on_input:2480:got events: tI
read_input_nblock:2091:read 8B from 0 (1015B left)
block_on_input:2462:poll returned 1
block_on_input:2480:got events: tI
read_input_nblock:2091:read 8B from 0 (1016B left)
block_on_input:2462:poll returned 1
block_on_input:2480:got events: tI
read_input_nblock:2091:read 8B from 0 (1016B left)
block_on_input:2462:poll returned 1
block_on_input:2480:got events: tI
read_input_nblock:2091:read 6B from 0 (1018B left)

The input thread is working fine at this point: if I type something on stdin, I can see 1 byte reads in the log. But the other thread seems to be waiting for a condition that never happens.

lnussel commented 1 year ago

same on openSUSE Tumbleweed

notcurses 3.0.9 on XTerm 377 (Linux 6.1.7-1-default) 36 rows (21px) 127 cols (10px) 756x1270 rgb+256 colors gcc-12.2.1 20221020 [revision 0aaef83351473e8f4eb774f8f999bbe87a4866d7] (LE) terminfo 6.4.20230114 libdeflate 1.14 GPM n/a

(gdb) bt

0 __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0,

futex_word=0x556cd998c628) at futex-internal.c:57

1 __futex_abstimed_wait_common (futex_word=futex_word@entry=0x556cd998c628, expected=expected@entry=0,

clockid=clockid@entry=0, abstime=abstime@entry=0x0, private=private@entry=0, cancel=cancel@entry=true) at futex-internal.c:87

2 0x00007f0b1694c08f in __GI___futex_abstimed_wait_cancelable64 (futex_word=futex_word@entry=0x556cd998c628,

expected=expected@entry=0, clockid=clockid@entry=0, abstime=abstime@entry=0x0, private=private@entry=0) at futex-internal.c:139

3 0x00007f0b1694ecd0 in __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x556cd998c5d8,

cond=0x556cd998c600) at pthread_cond_wait.c:503

4 ___pthread_cond_wait (cond=cond@entry=0x556cd998c600, mutex=mutex@entry=0x556cd998c5d8)

at pthread_cond_wait.c:618

5 0x00007f0b16b0f803 in inputlayer_get_responses (ictx=0x556cd9988550)

at /usr/src/debug/notcurses-3.0.9/src/lib/in.c:2800

6 handle_responses (kitty_graphics=, draininput=0, cursor_x=0x7fff06329ef4,

cursor_y=0x7fff06329ef8, tableused=0x7fff06329c50, tablelen=0x7fff06329c58, ti=0x556cd995e4a0) at /usr/src/debug/notcurses-3.0.9/src/lib/termdesc.c:1156

7 interrogate_terminfo (ti=, out=, utf8=1, noaltscreen=,

nocbreak=0, nonewfonts=0, cursor_y=0x7fff06329ef8, cursor_x=0x7fff06329ef4, stats=0x556cd995e230, lmargin=0, tmargin=0, rmargin=0, bmargin=0, draininput=0) at /usr/src/debug/notcurses-3.0.9/src/lib/termdesc.c:1431

8 0x00007f0b16aedcb2 in notcurses_core_init (opts=opts@entry=0x7fff0632a060, outfp=,

outfp@entry=0x0) at /usr/src/debug/notcurses-3.0.9/src/lib/notcurses.c:1257

9 0x00007f0b16b349e6 in notcurses_init (opts=opts@entry=0x7fff0632a060, outfp=outfp@entry=0x0)

at /usr/src/debug/notcurses-3.0.9/src/media/shim.c:13

10 0x0000556cd7aaa16b in main (argc=, argv=0x7fff0632a518)

at /usr/src/debug/notcurses-3.0.9/src/demo/demo.c:541

RickieBlaese commented 1 year ago

Can confirm, same experience here using Windows Terminal with WSL2. The hanging is spotty, but if I run the same executable in xterm, it does not happen. With gdb and backtrace, got the same output as @lnussel.