Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/

Feature idea: s///g{} limit for g #20857

Open jidanni opened 1 year ago

jidanni commented 1 year ago

Problem: We know s/x/y/ replaces only the first match, and s/x/y/g replaces every match, which could be, e.g., 100 of them. But what if we only want to replace, e.g., the first two?

Syntax I'm proposing: Maybe /g{} off the top of my head. I'll leave it up to the experts.

Benefits:

#We want to add one space on both sides of the first two numbers.
$ echo x1xxx1xxx1.1xxx|perl -pwle 's/\d/ $& /;'
x 1 xxx1xxx1.1xxx #OK, that's the first one. Now,
$ echo x1xxx1xxx1.1xxx|perl -pwle 's/\d/ $& /g;'
x 1 xxx 1 xxx 1 . 1 xxx #No, we overdid it.
$ echo x1xxx1xxx1.1xxx|perl -pwle 's/\d/ $& /; s/\d/ $& /;'
x  1  xxx1xxx1.1xxx #No, no good either!

So that's why perl needs e.g., s/x/y/g{n}, so we could do:
$ echo x1xxx1xxx1.1xxx|perl -pwle 's/\d/ $& /g{2};'
x 1 xxx 1 xxx1.1xxx #Perfecto!

Sure, I know what you are all going to say. You are all going to say, "Well, if you rewrite the pattern like ... then we in Perl Towers wouldn't need to face the fact that /g could keep a little counter, if requested via g{} (or however you want to write it), and quit after a certain count." But that counter could make some people's lives a whole lot easier!

Yes, surely my above example is easy to rewrite, but one day you will encounter one that ... just begs for the /g{} counter concept!

Note I am not talking about s/x{,}/y/!

Potential problems: none!

Sorry, I followed the instructions at https://github.com/Perl/RFCs/blob/main/README.md Alas, got From: MAILER-DAEMON@lists-nntp.develooper.com Subject: failure notice ... This is a permanent error

Leont commented 1 year ago

I like this idea, and am a bit surprised it hasn't been suggested before

demerphq commented 1 year ago

On Mon, 27 Feb 2023, 01:02 Leon Timmermans wrote:

> I like this idea, and am a bit surprised it hasn't been suggested before

I like it also, at least the idea behind the request. The syntax not so much.

I think @leonerd suggested something like this.

Having said that I'll repeat what I said to leonerd: I'm disinclined to shoehorn anything more into the modifiers for m// and s///. It's a serious pita to deal with at multiple levels.

I'd much prefer we figure out how to turn both into functions where we have the ability to supply a list of options. Or at least a better way to supply options to the operators. So the proposed syntax for me is problematic. If we support something hashlike as part of the modifiers it should be a hash of options, not something restricted to a single use case.

Yves

jidanni commented 1 year ago

Yes, I literally used only 0.2 seconds to think up the syntax, so please pick something better.

Should be super simple to implement:

jidanni commented 1 year ago

Wait! I got it even more bonus better idea too! Well you know that {,} syntax...

Well /g{5..9,37} could mean

Anyways, you guys figure out your favorite syntax. And while you're at it, figure out how you're going to code it, because, guys, I never coded anything! See, we here in the ideas department don't mess with that stuff.

What's the use case for that 37 on the back you're probably asking.

Well it's for precision computer aided hair cuts.

Tux commented 1 year ago

So, $a =~ s{x}{y}g{,4} does it max 4 times and $a =~ s{x}{y}g{4,} only does it if it can do it 4 times or more? (that last one looks pretty daunting to implement)

demerphq commented 1 year ago

> (that last one looks pretty daunting to implement)

Yeah, good point.

jidanni commented 1 year ago

I'm telling you guys, this is kid stuff. g{2,5} would work like:

$ seq 7 | perl -pwle 's/$/*/ if $. >= 2 && $. <= 5 ;'
1
2*
3*
4*
5*
6
7

Except of course horizontally not vertically :-) For g{2,} just don't set the second term above. For g{,5} just don't set the first term above.

demerphq commented 1 year ago

On Fri, 3 Mar 2023 at 01:30, 積丹尼 Dan Jacobson wrote:

> I'm telling you guys, this is kid stuff. g{2,5} would work like:
>
> $ seq 7 | perl -pwle 's/$/*/ if $. >= 2 && $. <= 5 ;'
> 1
> 2*
> 3*
> 4*
> 5*
> 6
> 7
>
> Except of course horizontally not vertically :-)

Umm, now you mean something different than Tux and I thought you meant. If this was comparable to numeric quantifier syntax then {5,6} would mean "update only if you can find 5 to 6 places to update". So for instance:

$_="a1234b"; s/\d/*/{5,6};

would NOT be expected to change anything, as there are only 4 digits in the string. However:

$_="a1234b"; s/\d/*/{2,3};

would be expected to leave $_ as "a***4b".

What you just said is different, you want {2,5} to replace only the second to the 5th match. Which is not the same thing.

Also, please be aware that basically NOTHING in this proposal is "kids play" in a practical sense. The related infrastructure is really awkward and difficult, and nothing about it was fun. Karl and I are the last people to mess with it at all, and it wasn't pretty, and that was just to add boolean flags to the behavior. You are asking for something a couple of orders of magnitude more complicated.

Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

jidanni commented 1 year ago

Sorry! In my 'seq' example I was just trying to give an example of adding a counter filter to a loop. I should have stayed away from the 's' operator entirely, in order not to confuse things. In fact I should never have posted it.

rsFalse commented 1 year ago

Good idea. Currently one can use a clumsy '(??{ ... })' inside the regex:

$ echo x1xxx1xxx1.1xxx|perl -pwle 's/\d/ $& /g{2};'
-->
$ echo x1xxx1xxx1.1xxx|perl -pwle 's/\d(??{ ++$i > 2 ? "(*FAIL)" : "" })/ $& /g;'
x 1 xxx 1 xxx1.1xxx

But I suppose, we would also like if after the second match, search would terminate.