r-lib / rex

Friendly regular expressions for R.
https://rex.r-lib.org

Add match and substitute functions #3

Closed jimhester closed 10 years ago

jimhester commented 10 years ago

Add two functions to return matches and perform substitutions from a regex.

I'm still using a pull request because I want your input on how the match function finds matches and what it returns. (We may be able to use regmatches instead for part of this.)

As implemented, the match() function returns different things depending on the type of pattern:

  1. If there are no capture groups, the function returns a logical for each vector in data.
  2. If there are capture groups, it returns a list with one entry for each capture group.
  3. If there are named capture groups, it returns a named list.
  4. If the g option is passed, gregexpr is used and a list of lists is returned with the matches.
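
For illustration, the four cases above could be sketched in base R roughly as follows. This is not the code in this pull request: `sketch_match` and its `global` argument are hypothetical names, and the global case is simplified to return one character vector of matches per element rather than a list of lists.

```r
# Hypothetical sketch in base R; not the code in this pull request.
sketch_match <- function(data, pattern, global = FALSE) {
  if (global) {
    # Case 4: gregexpr finds every match; regmatches extracts them,
    # giving one character vector of matches per element of data.
    return(regmatches(data, gregexpr(pattern, data, perl = TRUE)))
  }
  m <- regexpr(pattern, data, perl = TRUE)
  matched <- as.vector(m) != -1L
  starts <- attr(m, "capture.start")
  if (is.null(starts)) {
    # Case 1: no capture groups, so report only whether each element matched.
    return(matched)
  }
  lens <- attr(m, "capture.length")
  # Cases 2 and 3: one character vector per capture group.
  groups <- lapply(seq_len(ncol(starts)), function(i) {
    out <- substring(data, starts[, i], starts[, i] + lens[, i] - 1L)
    out[!matched] <- NA_character_  # elements with no match
    out
  })
  # Case 3: keep the group names when the pattern defines any.
  nms <- attr(m, "capture.names")
  if (any(nzchar(nms))) names(groups) <- nms
  groups
}

x <- c("key=val", "none")
sketch_match(x, "=")                      # logical vector
sketch_match(x, "(\\w+)=(\\w+)")          # unnamed list, one entry per group
sketch_match(x, "(?<k>\\w+)=(?<v>\\w+)")  # list named k and v
```

Whether a single function should change its return type like this is exactly the open question; the sketch only makes the four shapes concrete.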

The other thing this function does is automatically prepend any additional options, (?ims) etc., to the regular expression. We may want to make those options just a function. I originally had them as a match argument because that is how they are specified in Perl, but R's implementation is actually slightly more flexible: you can turn the options on and off within a regex.
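
As a rough sketch of the option handling described above: the `with_options` helper below is a hypothetical name, not part of the pull request. It shows options being prepended as an inline flag group, and a flag being switched off partway through a pattern using base R with `perl = TRUE`.

```r
# Hypothetical helper: prepend inline option flags such as "i", "m" or "s".
with_options <- function(pattern, options = "") {
  if (nzchar(options)) paste0("(?", options, ")", pattern) else pattern
}

# (?i) at the front makes the whole pattern case-insensitive.
insensitive <- grepl(with_options("abc", "i"), "ABC", perl = TRUE)
sensitive <- grepl(with_options("abc"), "ABC", perl = TRUE)

# Options can also be toggled inside the pattern: here "a" is matched
# case-insensitively, then (?-i) makes "b" case-sensitive again.
toggled <- grepl("(?i)a(?-i)b", c("Ab", "AB"), perl = TRUE)
```

Here `insensitive` is TRUE, `sensitive` is FALSE, and `toggled` is TRUE for "Ab" but FALSE for "AB".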

Please let me know your comments, or modify the function as you see fit!

jimhester commented 10 years ago

One issue with this is that substitute is obviously a terrible name, as you can't use base::substitute then. It was simply s in perler.

kevinushey commented 10 years ago

Maybe call them matches and substitutions? Or re_matches and re_substitutions, just to avoid name collisions?

kevinushey commented 10 years ago

Can you follow the style conventions in r-pkgs? (I have been a bit lazy here too, but would prefer if we adopted this coding style throughout.)

jimhester commented 10 years ago

I tend to not worry about collisions (you can always use pkg::function syntax if you have a collision) rather than making the function name longer, but I will defer if you prefer them.

I'll go back over the code and try to fix the places where formatting differs from r-pkgs.
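
To make the collision concrete, here is a minimal sketch. The `substitute` defined below is a stand-in for a package export of that name, not the function from this pull request.

```r
# Stand-in for a package function that happens to be called substitute.
substitute <- function(...) "masked"

# The unqualified name now resolves to the new definition...
masked_result <- substitute(x + y)

# ...while the base version stays reachable through the pkg::function syntax.
base_result <- base::substitute(x + y)

# Remove the stand-in so the base function is visible again.
rm(substitute)
```

After this runs, `masked_result` holds the string "masked", while `base_result` holds the unevaluated call `x + y`.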

kevinushey commented 10 years ago

That's true, but this is still not a universally adopted practice, and avoiding collisions when possible is the 'nice' thing to do (especially with functions in base).

On Thu, Sep 25, 2014 at 10:54 AM, Jim Hester notifications@github.com wrote:

> I tend to not worry about collisions (you can always use pkg::function syntax if you have a collision) rather than making the function name longer, but I will defer if you prefer them.
>
> I'll go back over the code and try to fix the places where formatting differs from r-pkgs

jimhester commented 10 years ago

Fair enough, re_matches and re_substitutions it is!