peterh / liner

Pure Go line editor with history, inspired by linenoise
MIT License
1.04k stars 132 forks source link

PromptWithSuggestion(): panic with non-ASCII characters #106

Closed ERnsTL closed 6 years ago

ERnsTL commented 6 years ago

goroutine 1 [running]: github.com/peterh/liner.(State).refreshSingleLine(0xc4200c4640, 0xc4201492d8, 0x7, 0x20, 0xc4201ae000, 0x50, 0x50, 0x52, 0x0, 0xc4201490d0) /home/x/code/go/src/github.com/peterh/liner/line.go:109 +0x632 github.com/peterh/liner.(State).refresh(0xc4200c4640, 0xc4201492d8, 0x7, 0x20, 0xc4201ae000, 0x50, 0x50, 0x52, 0x4ad03c, 0xc4200173a2) /home/x/code/go/src/github.com/peterh/liner/line.go:96 +0x108 github.com/peterh/liner.(*State).PromptWithSuggestion(0xc4200c4640, 0x6f6d00, 0x7, 0xc42006e2a0, 0x52, 0x52, 0x0, 0x0, 0x0, 0x0) /home/x/code/go/src/github.com/peterh/liner/line.go:601 +0x3c49 main.edittaskHandler(0xc4201a2180, 0x4, 0x4) [...]

- Expected behaviour: Prompt displayed with suggested value ready for editing
- Complete sample that reproduces the bug:

package main

import ( "fmt" "io"

"github.com/peterh/liner"

)

func main() { // open terminal line := liner.NewLiner() if !liner.TerminalSupported() { fmt.Println("WARNING: terminal not supported") } defer line.Close()

// configuration
line.SetTabCompletionStyle(liner.TabPrints)
line.SetCtrlCAborts(true) // let library handle interrupt and get it delivered nicely in the main loop instead of hand-rolled signal-handling Goroutine, channels etc.
//line.SetWordCompleter(completeWord)

title := "testprefix äöüß ÄÖÜẞ testsuffix"

title, err := line.PromptWithSuggestion("Title? ", title, len(title))
//title, err := line.PromptWithSuggestion("Title? ", title, len([]rune(title)))
// ^ works! not sure if error on my side or in liner. is this conversion really necessary?
if err != nil {
    if err == io.EOF {
        fmt.Println("unchanged")
        return
        }
    fmt.Println("ERROR: reading line:", err)
    return
}

}



There is also a line which says "works", but I don't usually have to do this conversion with other libraries. Could this be improved inside liner or is this extra conversion really required every time I use `PromptWithSuggestion()`? Could this be optimized? Or maybe is character handling with non-ASCII characters bugged?

Please check :-)

Greetings
peterh commented 6 years ago

It's a cursor position, so length in runes is correct. You can't place the cursor half way though a rune. Actually, it really should ignore combining code points (and maybe even count doublewidth runes twice), but it's too late to fix that now (it would be an incompatible change to the API).

With all that said, it shouldn't crash. Especially since larger-than-string-length is documented to work.

ERnsTL commented 6 years ago

So if I understand correctly: Avoiding the string-to-[]rune conversion would only be possible with a breaking API change?

Thus to avoid breaking the API, the string-to-[]rune conversion is neccessary and cannot be avoided?

peterh commented 6 years ago

While true, that's not quite what I was trying to say. Allow me to try again:

  1. Thank you for the bug report. The bug has been fixed. Now if you pass a pos that is too large, the cursor will start at the end of the line instead of causing a crash.
  2. The pos is in runes. What do you expect to happen if pos is in bytes, and you ask for a cursor position in mid-rune? I can't think of anything reasonable for liner to do with a mid-rune pos, so pos is in runes.
  3. Now that I think of it, pos should be in screen-space (runes, but combining characters don't count). Unfortunately, this would be a breaking change. But if I'd thought of it sooner, pos would have been in screen-space.
  4. If you want the cursor at the end of the line, pass -1 to avoid len([]rune(string)).
  5. If you want the cursor in the middle of the line, and you built the string from a header and a remainder, utf8.RuneCountInString(header) is probably better than len([]rune(header)) because it could avoid an allocation.
ERnsTL commented 6 years ago

Thanks Peter for taking the time for the detailed response and the fix in 8c1271fcf47f341a9e6771872262870e1ad7650c - now everything makes sense :-)

Learned a bit about terminal design (screen-space vs. Unicode).

Also thanks for the -1 special case - I will use that.

I didn't know about utf8.RuneCountInString(str) - thanks for the optimization pointer as well!

Great library of yours 👍 Looking forward to using it more projects. All the best!