martanne / vis

A vi-like editor based on Plan 9's structural regular expressions

:y command selects more than expected #925

Open VincentBailly opened 3 years ago

VincentBailly commented 3 years ago

Hi,

I am new to vis and struggle to wrap my head around the :y command. I am not sure whether what I observe is a bug or by design:

1 - I have the following text on a line: visvissvisvis
2 - I create a selection so that the character s is both at the beginning and at the end of the selection: "visvissvisvis"
3 - I type the following command to unselect the s characters: :y/s/
4 - It looks like the result selection is not what I expect (some "s"es are still selected). To be sure I am not tricked by empty selections looking not empty, I type gU
5 - I get the following: viSVIsSVIsVis instead of visVIssVIsvis

The issue I see is that some "s" characters are selected and also the last v character became selected even though it was not part of the initial selection.

7v0lk0v commented 3 years ago

:y doesn't remove anything; it's a selector like :x, only inverted. To better understand, use :x instead and you'll see that it selects all the 's' within the existing selection. Again, :y does the inverse selection, hence everything except the 's'es.

:d is the delete command. Or you can just hit d in normal mode after you have made your selection.

VincentBailly commented 3 years ago

Thank you @3dc1d3 for the quick reply. I think that you misunderstood my issue. I re-worded it so hopefully now it is clearer.

ninewise commented 3 years ago

This does indeed sound like a bug (or at least it does not display the expected behaviour). It seems to me that empty substrings (before the first match if it touches the start, between two adjacent matches, and after the last match if it touches the end) become single-character selections. The 1/5 (primary selection is the first of five selections) at the bottom makes this clear. So vi[svissvis]vis with :y/s/ should become vi[]s[vi]s[]s[vi]s[]vis but instead becomes vi[s][vi]s[s][vi]s[v]is.
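For illustration, here is a minimal Python sketch of the ranges this analysis describes. It is not vis code: `y_ranges` and `render` are hypothetical helpers, ranges are half-open `(begin, end)` pairs, and Python's `re` module stands in for the regex engine vis uses. It only shows where the empty ranges come from.

```python
import re

def y_ranges(text, start, end, pattern):
    """Return the half-open ranges of text[start:end] that pattern does NOT match.

    Empty ranges are kept on purpose: they are the ones the analysis is about.
    """
    ranges = []
    pos = start
    for m in re.compile(pattern).finditer(text, start, end):
        ranges.append((pos, m.start()))  # gap before this match (may be empty)
        pos = m.end()
    ranges.append((pos, end))            # gap after the last match (may be empty)
    return ranges

def render(text, ranges):
    """Mark each range with [ ] in the text, in the notation used in this thread."""
    out = []
    pos = 0
    for begin, end in ranges:
        out.append(text[pos:begin])
        out.append("[" + text[begin:end] + "]")
        pos = end
    out.append(text[pos:])
    return "".join(out)

text = "visvissvisvis"
ranges = y_ranges(text, 2, 10, "s")  # selection vi[svissvis]vis
print(ranges)
print(render(text, ranges))
```

Three of the five ranges are empty: one before the leading s, one between the two adjacent s characters, and one after the trailing s.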

VincentBailly commented 3 years ago

If it can help, the bug can also be reproduced by using :x/[^s]*/ instead of :y/s/. With the difference that in this case, the last v is not selected.

martanne commented 3 years ago

@ninewise's analysis is correct and I agree that for cases like this the current behavior isn't intuitive.

In vis a selection is always a non-empty range, i.e. there are no cursors between two characters, only selections on certain characters. This is both due to terminal restrictions (typically there is only one "real" cursor) and because it makes sense. However, it doesn't cope well with the empty matches produced by certain structural regex commands. When entering visual mode they are all right-extended, i.e. vi[]s becomes vi[s].
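The right-extension can be sketched in a few lines of Python. This is only a model of the behaviour described here, not the vis implementation: `right_extend` and `upper_in` are hypothetical names, ranges are half-open `(begin, end)` pairs, and clamping at the end of the text is an assumption. The input ranges are the five from the vi[]s[vi]s[]s[vi]s[]vis example earlier in the thread.

```python
def right_extend(ranges, text_len):
    """Turn every empty range (p, p) into (p, p + 1); non-empty ranges are unchanged.

    Clamping to text_len is an assumption made for this sketch.
    """
    extended = []
    for begin, end in ranges:
        if begin == end:
            end = min(begin + 1, text_len)
        extended.append((begin, end))
    return extended

def upper_in(text, ranges):
    """Upper-case the characters covered by the ranges, similar in effect to gU."""
    chars = list(text)
    for begin, end in ranges:
        for i in range(begin, end):
            chars[i] = chars[i].upper()
    return "".join(chars)

text = "visvissvisvis"
ideal = [(2, 2), (3, 5), (6, 6), (7, 9), (10, 10)]  # empty ranges kept as-is
extended = right_extend(ideal, len(text))

print(upper_in(text, ideal))     # the result the reporter expected
print(upper_in(text, extended))  # the result the reporter observed
```

With the empty ranges left alone the output is visVIssVIsvis; after right-extension it is viSVIsSVIsVis, which matches the report, including the trailing v that was outside the original selection.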

This also breaks one of the ideas behind vis, namely that you should be able to split any sam pipeline into multiple individual commands and get the same result. For now I am not sure how to best deal with it. We could start adding heuristics when such empty matches should be ignored e.g. when they are followed by an (implied) print command. However, there are legitimate use cases, such as x/^/ to create new selections at the start of every line, which produce empty matches.
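To see why simply dropping empty matches is not an option, consider the x/^/ case. The sketch below uses Python's `re` module as a stand-in for the regex engine (an assumption; edge cases differ between engines), and `line_start_matches` is a hypothetical helper.

```python
import re

def line_start_matches(text):
    """Return the (begin, end) span of every match of ^ in multi-line mode."""
    return [(m.start(), m.end()) for m in re.finditer(r"^", text, re.MULTILINE)]

text = "foo\nbar\nbaz"
matches = line_start_matches(text)
non_empty = [(b, e) for b, e in matches if b != e]

print(matches)    # one zero-length match per line start
print(non_empty)  # nothing is left once empty matches are discarded
```

Every match here is zero-length, so a blanket rule that ignores empty matches would leave x/^/ with no selections at all.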

Comments and suggestions welcome.

Your second example involving negated character classes is a distinct issue. There was indeed a functional difference compared to sam. This particular case should be fixed by 65e7bcc2cbd62e217f500ea4ee542269a0d4b2f5, though the involved code is a bit sketchy and there might be other faulty corner cases.

martanne commented 3 years ago

I have been thinking a bit about the issue with zero length regex matches, but haven't really come up with a satisfactory solution.

To summarize, they occur with:

@ninewise suggested to just ignore zero length matches for the y command. But that would introduce an imbalance.

I also looked into the original sam behavior and for some cases it was different. For example x/[a-z]+/ y/-?/ matches every character individually whereas we would produce empty matches before each character. Commit 367764b679ae5addcab6f968149ea25cc98663b9 introduces the sam behavior.