martanne / vis

A vi-like editor based on Plan 9's structural regular expressions

Sam x command bug? #774

Closed sh4r1k7 closed 4 years ago

sh4r1k7 commented 4 years ago

Summary:

Extracting org-mode sections skips every second one. (Could very well be that my command is incorrect but I can't see where...)


Command:

x/^\*(.*\n)(^[^\*]*\n)+

Test data:

* title1
sentence

paragraph

* title2
sentence

paragraph

* title3
sentence
paragraph

p2
* title4
sen.

aksr commented 4 years ago

Nice find. I think it's a bug.

ninewise commented 4 years ago

I think the command you're looking for is x/\*.*\n([^*].*\n+)+/. I'm not sure your use of ^ inside the regex is valid: the manpage says "The anchors ^ and $ match the beginning / end of the range they are applied to."

It looks like the ^, in your command, also matches the beginning of a line. I'm thinking the bug here is that the beginning of the line is already included in the previous match, and therefore isn't scanned.
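
As a rough cross-check outside vis, both patterns can be run through Python's `re` module on the test data from the report. This is only a sketch: Python's engine is not the one vis uses, `re.MULTILINE` is an assumption standing in for a line-anchored `^`, and the test data is assumed to end with a newline. The names `TEST_DATA`, `SUGGESTED`, `ORIGINAL` and `section_titles` are made up for this illustration.

```python
import re

# Test data from the report; a trailing newline after "sen." is assumed.
TEST_DATA = (
    "* title1\nsentence\n\nparagraph\n\n"
    "* title2\nsentence\n\nparagraph\n\n"
    "* title3\nsentence\nparagraph\n\np2\n"
    "* title4\nsen.\n"
)

# Pattern suggested in this comment, without any ^ anchors.
SUGGESTED = re.compile(r"\*.*\n([^*].*\n+)+")

# Pattern from the original report; MULTILINE lets ^ match after each "\n".
ORIGINAL = re.compile(r"^\*(.*\n)(^[^\*]*\n)+", re.MULTILINE)


def section_titles(pattern, text):
    """Return the first line of every section the pattern selects."""
    return [m.group(0).split("\n", 1)[0] for m in pattern.finditer(text)]


print(section_titles(SUGGESTED, TEST_DATA))
print(section_titles(ORIGINAL, TEST_DATA))
```

Under Python's semantics both patterns pick up all four sections, which is consistent with the suspicion that the skipping comes from how the search resumes after a match rather than from the pattern itself.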

aksr commented 4 years ago

@ninewise: Still, it should work.

The ^ at the beginning (,x/^../) should work as expected; the second one is (I think) ignored. The second ^ should also have worked as he expected, since it comes right after \n.

ninewise commented 4 years ago

You would indeed expect the first to work. This issue is more easily reproduced, btw, by calling :x/^.*\n/ on any file: observe how only half of the lines are selected. Using :x/^.*$/ works as expected. I'm fairly sure the ending \n eats the ^ of the next line.
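
The "ending \n eats the ^ of the next line" guess can be modelled in a few lines of Python. This is a hypothetical model of the suspected behaviour, not code taken from vis: the function `select_lines` and its flag `resume_counts_as_line_start` are invented here, and the only thing the flag changes is whether the position right after a match is allowed to count as a line start.

```python
def select_lines(text, resume_counts_as_line_start):
    """Model repeated matching of 'line start, rest of line, newline'."""
    selections = []
    pos = 0
    at_line_start = True  # the start of the range always counts as a line start
    while pos < len(text):
        if not at_line_start:
            # ^ may not match here, so skip to just past the next newline.
            newline = text.find("\n", pos)
            if newline == -1:
                break
            pos = newline + 1
        end = text.find("\n", pos)
        if end == -1:
            break  # no terminating newline, so the pattern cannot match
        selections.append(text[pos:end + 1])
        pos = end + 1
        # Suspected bug: the resume position is not treated as a line start.
        at_line_start = resume_counts_as_line_start
    return selections


sample = "a\nb\nc\nd\n"
print(select_lines(sample, True))   # every line is selected
print(select_lines(sample, False))  # every second line is skipped
```

With the flag off, the model selects only every other line, the same pattern seen with the org-mode sections in the original report.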

sh4r1k7 commented 4 years ago

Thanks guys!

Yep, the second ^ in my example is unnecessary in any case - so much for incremental development - but yea, you'd still expect it to work.

@ninewise your simplified test code led me to try x/.*/, which extracts each line in full with its own cursor at the last char before \n - shouldn't this command select the entire file with a single cursor instead, since sam/vis doesn't operate on a line basis? And why is it not selecting the \n of each line - except on empty lines, in which case, I guess it's selecting a null char?


Either the bug is there, or at least related, or I'm completely lost now...

aksr commented 4 years ago

@geburashka: .* matches every character except \n. Pike wanted the .* idiom to behave as it does in ed, since that is what users who wanted the ed subset (,x s/../../) in sam expected. That's why, to select one or more lines, you would use (.*\n)+. BTW, earlier sam versions had a symbol (@) that matched any character including a newline, but it wasn't necessary since it wasn't hard to write .*\n.
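
The difference between .* and (.*\n)+ can be seen in any engine where . excludes the newline. The sketch below uses Python's `re` module as a stand-in (an assumption; it is not the engine sam or vis uses, it merely shares this convention by default). The variable names are illustrative only.

```python
import re

text = "first line\nsecond line\n"

# "." stops at the newline, so ".*" covers only the first line's text.
one_line = re.match(r".*", text).group(0)

# "(.*\n)+" consumes whole lines, newlines included.
all_lines = re.match(r"(.*\n)+", text).group(0)

print(repr(one_line))
print(repr(all_lines))
```

Here `one_line` ends just before the first newline, while `all_lines` spans both lines and their newlines.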

sh4r1k7 commented 4 years ago

Thanks @aksr, I do vaguely recall reading that somewhere sometime but I guess it didn't quite sink in. Can you summarise where vis is not line-oriented then? Apart from the ^ and $ operators, this behavior of .* seems to make vis very line-oriented indeed, given that .* is the most common regex construct (not that I'm complaining or that it matters much, just trying to understand it better). Maybe I should read Pike's paper again.

aksr commented 4 years ago

It's not that .* makes vis (or sam for that matter) line-oriented, it's just that . doesn't include newlines. If you want that, just use the construct: ,x .. (notice the obligatory space after x). Pike intentionally didn't want to make sam more radical, since that would risk its adoption. All this is explained in his papers, read them again:

http://doc.cat-v.org/bell_labs/structural_regexps/
http://doc.cat-v.org/plan_9/4th_edition/papers/sam/
http://doc.cat-v.org/bell_labs/sam_lang_tutorial/