AlecAivazis / survey

A golang library for building interactive and accessible prompts with full support for windows and posix terminals.
MIT License
4.07k stars 350 forks source link

Intermittent False EOF or panic when run under `screen` #408

Closed mgabeler-lee-6rs closed 2 years ago

mgabeler-lee-6rs commented 2 years ago

What operating system and terminal are you using?

An example that showcases the bug. https://github.com/mgabeler-lee-6rs/survey-screen-eof-demo

What did you expect to see?

What did you see instead?

This is a bit on and off for whether it reproduces, but given a screen session that is "doing it", it becomes 100% reproducible in producing one of the failure modes, though which failure happens seems to be a bit variable run to run.

The demo app linked above is just a trivial call to AskOne with a Confirm, but it seems that pretty much any Ask will do it. This spun out of problems from multiple users with the gh CLI: https://github.com/cli/cli/issues/4739

Failure mode 1: panic

This looks superficially similar to #394, but I'm unsure if it is really the same.

? Does it work? (y/N) panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x4ec8ae]

goroutine 1 [running]:
github.com/AlecAivazis/survey/v2/terminal.Coord.CursorIsAtLineEnd(...)
        /home/mgl/src/go/pkg/mod/github.com/!alec!aivazis/survey/v2@v2.3.2/terminal/cursor.go:157
github.com/AlecAivazis/survey/v2/terminal.(*RuneReader).ReadLineWithDefault.func2(...)
        /home/mgl/src/go/pkg/mod/github.com/!alec!aivazis/survey/v2@v2.3.2/terminal/runereader.go:65
github.com/AlecAivazis/survey/v2/terminal.(*RuneReader).ReadLineWithDefault(0xc0001b26c0, 0xe010, {0x691658, 0x0, 0x0}, {0x0, 0x40caf4, 0x52bd80})
        /home/mgl/src/go/pkg/mod/github.com/!alec!aivazis/survey/v2@v2.3.2/terminal/runereader.go:321 +0x58e
github.com/AlecAivazis/survey/v2/terminal.(*RuneReader).ReadLine(...)
        /home/mgl/src/go/pkg/mod/github.com/!alec!aivazis/survey/v2@v2.3.2/terminal/runereader.go:37
github.com/AlecAivazis/survey/v2.(*Confirm).getBool(0xc000098000, 0x0, 0xc00009a048)
        /home/mgl/src/go/pkg/mod/github.com/!alec!aivazis/survey/v2@v2.3.2/confirm.go:57 +0x186
github.com/AlecAivazis/survey/v2.(*Confirm).Prompt(0xc000098000, 0xc00009a048)
        /home/mgl/src/go/pkg/mod/github.com/!alec!aivazis/survey/v2@v2.3.2/confirm.go:136 +0x105
github.com/AlecAivazis/survey/v2.Ask({0xc0000bbe00, 0x1, 0x1}, {0x526380, 0xc000096000}, {0x0, 0x0, 0x0})
        /home/mgl/src/go/pkg/mod/github.com/!alec!aivazis/survey/v2@v2.3.2/survey.go:308 +0x214
github.com/AlecAivazis/survey/v2.AskOne({0x581798, 0xc000098000}, {0x526380, 0xc000096000}, {0x0, 0x0, 0x4125b1})
        /home/mgl/src/go/pkg/mod/github.com/!alec!aivazis/survey/v2@v2.3.2/survey.go:251 +0x94
main.main()
        /home/mgl/src/survey-screen-eof-demo/main.go:28 +0x1a5

Failure mode 2: TTY garbage: This is a much more common failure mode

? Does it work? (y/N) failed: EOF
^[[67;274R^[[5;23Rmgl@nigripes:~/src/survey-screen-eof-demo$ 7;274R23R

Note that the "gobbledygook" is inserted into my terminal buffer.

Attempted Analysis:

Based on the "TTY garbage" mode, and from attaching a debugger to the demo app, it appears that one or more of the terminal control commands being sent is producing unexpected responses that are treated as user input. Running it under strace -r -v -f unfortunately seems to heisenbug it and produce a third failure mode where it gets the false EOF, but no TTY garbage. In this case, the only read that shows in strace is like so:

915027      0.000037 read(0, "", 4096)  = 0

Which ... does look like an EOF. FWIW, the preceding TCSETS was:

919540      0.000018 ioctl(0, SNDCTL_TMR_START or TCSETS, {c_iflags=0x506, c_oflags=0x1805, c_cflags=0x8bd, c_lflags=0x8a30, c_line=0, c_cc[VMIN]=0, c_cc[VTIME]=0, c_cc="\x03\x1c\x7f\x15\x04\x00\x00\x00\x11\x13\x1a\x00\x12\x0f\x17\x16\x00\x00\x00"}) = 0

(full strace: trace.txt)

Going back to stepping through the debugger, it looks like the "TTY garbage" is coming from the terminal responding to the DSR escape sequence sent during cursor.Cursor.Location(), and I think it may be happening because the second call in this sequence, in runereader.go, is returning an error that is ignored:

    // we get the terminal width and height (if resized after this point the property might become invalid)
    terminalSize, _ := cursor.Size(rr.Buffer())
    // we set the current location of the cursor once
    cursorCurrent, _ := cursor.Location(rr.Buffer())

However, for whatever reason, with the debugger attached, it seems to be usually failing on the first one instead, which produces the "panic" failure mode (I think) if it ends up seeing the actual DSR response as user input, or as the EOF if that somehow doesn't happen in time to prevent it getting an EOF error when it gets here in runereader.go:

        // wait for some input
        r, _, err := rr.ReadRune()
        if err != nil {
            return line, err
        }

Trying to trace down where some of the EOFs are coming from, I see the EOF seeming to come from this bit in stdlib's bufio.go:

    // Read new data: try a limited number of times.
    for i := maxConsecutiveEmptyReads; i > 0; i-- {
        n, err := b.rd.Read(b.buf[b.w:])
        if n < 0 {
            panic(errNegativeRead)
        }
        b.w += n
        if err != nil {
            b.err = err
            return
        }

This brought me back to the TCSETS call, with its VMIN and VTIME values of 0. It definitely doesn't seem like the code is expecting these settings, and is instead expecting the terminal i/o to still be in blocking i/o mode, even when having cleared canonical mode (ref: https://man7.org/linux/man-pages/man3/tcflow.3.html#:~:text=MIN%20%3D%3D%200%2C%20TIME%20%3D%3D%200%20(polling%20read)).

On a working terminal, I get the TCGETS/TCSETS calls having (among the other flags), c_cc[VMIN]=1, c_cc[VTIME]=0.

Which finally tells me how I can submit a PR to fix this :D I don't know why my screen sessions sometimes have VMIN=0, but since this application is configuring the terminal, it should configure it to the way it needs instead of assuming someone else will have :)

mislav commented 2 years ago

Wow, awesome job tracking this down, and thank you for the fix! 🎉