Allow for better usability in word-based cursor movement and deletion

ghost commented 9 years ago

Hi again. More word stuff.

The standard unicode.IsSpace based word boundaries seem fine for most usages, but when for example using liner as a language REPL, it'd be nice to be able to tweak what constitutes a word for cursor movement and deletion purposes.

This patch adds the method:

// SetWordBreakers sets runes that will be considered stops for word-based
// cursor movement and deletion without requiring surrounding whitespace.
func (s *State) SetWordBreakers(breakers ...rune)

And updates wordLeft, wordRight and ctrlW accordingly.

It's best to just try those 3 actions in some examples with and without the patch to get a feel for the difference. e.g.

ed.SetWordBreakers('(', ')')
// and editing the line: 
// (def zip (proc (xs ys) (zip-with cons xs ys)))

and

ed.SetWordBreakers('.', '/')
// and editing the line: 
// https://github.com/peterh/liner

What do you think? This gives the means for a good boost in usability when liner is used for language REPLs, without complicating the implementation much.

peterh commented 9 years ago

Sorry for the delay. Somehow this slipped off my TODO list.

To be clear, are you looking for a way to make additional runes be treated as if they were whitespace, or are you looking for a new concept entirely?

If it's "more whitespace", I would prefer if SetWordBreakers took a func (r rune) bool instead of a list of runes. Then the implementation in Liner is far simpler, and the user of SetWordBreakers can omit (some types of) whitespace from the WordBreakers list if it wants to.

Something like (untested):

var isSpace = unicode.IsSpace
func SetWhitespaceFunc(ws func(r rune)bool) {
    isSpace = ws
}

and then s/unicode\.IsSpace/isSpace/g in line.go.

If you want an entirely new concept, I'll have to think about it for a bit. Feature creep in that direction starts to feel like "Parts of regex", and I don't want to be implementing another regex engine when Go already has regexp. I'm undecided, so feel free to convince me that this new concept is so awesome that I want to pull anyway.

ghost commented 9 years ago

In a sense, this is the opposite of alternate whitespace. What I called a breaker (admitedly poor naming) is a rune that has such word-like significance that it is always considered fully distinct from whatever surrounds it (notably that includes other instances of itself).

Talk of programming language REPLs aside, this is basically about allowing the editor's awareness to be optionally increased from two concepts (words, whitespace) to three (words, whitespace, punctuation).

It's no fun to edit most kinds of code without the concept of punctuation.

peterh commented 9 years ago

Checking what bash does, it treats punctuation the same as whitespace for the purposes of cursor movement (to next word, to previous word). It treats punctuation the same as alphanumerics for the purposes of delete word.

I don't like what bash does. A word should be the same group of glyphs for the purposes of all operations.

Have you tried treating punctuation the same as whitespace? I found it felt pretty good when cursoring around in bash, but the test strings I used are not going to be the same as your language.

ghost commented 9 years ago

As you mention, that is decent for left/right movement, but I don't like ctrlW steamrolling "punctuation" like (and along with) with whitespace.

On 3 Sep 2015, at 01:40, Peter Harris notifications@github.com wrote:

Checking what bash does, it treats punctuation the same as whitespace for the purposes of cursor movement (to next word, to previous word). It treats punctuation the same as alphanumerics for the purposes of delete word.

I don't like what bash does. A word should be the same group of glyphs for the purposes of all operations.

Have you tried treating punctuation the same as whitespace? I found it felt pretty good when cursoring around in bash, but the test strings I used are not going to be the same as your language.

— Reply to this email directly or view it on GitHub.

peterh commented 9 years ago

...except ctrlW doesn't steamroll whitespace. It leaves all the trailing whitespace after the previous word.

Unless you mean you want ctrlW to remove only single characters when there is punctuation at the cursor position? It seems the backspace key is sufficient for that use case.

Thinking about this some more, the next feature request is going to be treating CamelCase as multiple words (eg. Netbeans does this). We should make sure whatever system we devise can handle that use case, too.

ghost commented 9 years ago

I guess there's a difference between treating punctuation like-whitespace and like-whitespace-but-distinct-from-whitespace. e.g. (using | to mean cursor):

aaa   ,},|
# ctrlW
aaa|

vs

aaa   ,},|
# ctrlW
aaa   |

With my first stab code, individual instances of punctuation are treated as like-a-word-but-distinct-from-words and so yes, wordLeft/wordRight/ctrlW are single character operations when done into punctuation. Admittedly that does seem unusual and I'm not particularly attached to doing it that way.

CamelCase support seems a bridge too far to me. Even the compiler that someone feeds their camelCase-containing source into, which has perfect knowledge of the line's syntax, still considers the camelCase thing to be a single token.

ghost commented 8 years ago

Hello again. I haven't had a chance to do more on this. I'll close the PR since given there's no pressing demand, the PR needn't be in limbo on the main repo. If the same topic comes up, feel free to ping me to see what I've ended up with.

peterh / liner

Allow for better usability in word-based cursor movement and deletion #49