slevithan / xregexp

Extended JavaScript regular expressions
http://xregexp.com/
MIT License

Mixing named and unnamed captures #9

Closed hetsch closed 12 years ago

hetsch commented 12 years ago

Hi slevithan,

First, thanks for your great library. I really appreciate your work!

I don't know if I'm doing something wrong, but consider the following:

var url = 'page/edit/en/4f55fbbab51bda0df1000001/unnamed';
var re = XRegExp('^page/edit/(?<language>[a-z]{2})/(?<entityId>[a-z0-9]{24})/(?:.*)$');
var match = XRegExp.exec(url, re);
console.log(match);
/**
output:

0: "page/edit/en/4f55fbbab51bda0df1000001/unnamed",
1: "en",
2: "4f55fbbab51bda0df1000001",
entityId: "4f55fbbab51bda0df1000001",
index: 0,
input: "page/edit/en/4f55fbbab51bda0df1000001/unnamed",
language: "en"
*/

My problem is that I have to capture the "unnamed" part of the URL, (?:.*). The match object doesn't contain the value "unnamed".

Is XRegExp not able to mix named and unnamed captures, or am I missing something?

Thanks!

slevithan commented 12 years ago

Thanks...I'm glad XRegExp is useful to you!

Unless I'm missing something, though...

(?: ) is not an "unnamed capture"; it is explicitly a noncapturing group. That's the whole point of using the (?: ) syntax instead of simply ( ).

XRegExp supports mixing named and unnamed captures just fine. ;-) Unnamed capturing groups of the form ( ) must be referred to by number. Noncapturing groups of the form (?: ) do not produce backreferences at all. This is standard behavior with nearly all regex flavors, including the native JavaScript RegExp syntax.
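
To make the difference concrete, here is a minimal sketch. It assumes an engine with native named groups (ES2018 or later) and uses plain RegExp rather than XRegExp so that it runs without any library; the variable names are only illustrative. The same group syntax is what the patterns in this thread pass to XRegExp.

```javascript
const url = 'page/edit/en/4f55fbbab51bda0df1000001/unnamed';

// (?:.*) groups without capturing, so the trailing segment is not stored.
const nonCapturing = /^page\/edit\/(?<language>[a-z]{2})\/(?<entityId>[a-z0-9]{24})\/(?:.*)$/;

// (.*) is an unnamed capturing group; counting from the left it is capture 3.
const capturing = /^page\/edit\/(?<language>[a-z]{2})\/(?<entityId>[a-z0-9]{24})\/(.*)$/;

const a = nonCapturing.exec(url);
const b = capturing.exec(url);

console.log(a.length);          // 3: the full match plus two captures
console.log(b.length);          // 4: the full match plus three captures
console.log(b[3]);              // "unnamed"
console.log(b.groups.language); // "en"
```

Swapping (?:.*) for (.*) is the only change needed to get the trailing segment into the result.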

hetsch commented 12 years ago

Mind-blowing - I'm a complete idiot :-)
I think I got a little too excited about named captures in JavaScript and forgot all the other basics, hehe.

Thanks for your tutorial and help! :-)

hetsch commented 12 years ago

Once again,

I know this isn't a mailing list, but I'm not sure how else to contact you. I need to capture some parts of a URL. It works fine, except when two or more parameters have the same value. I would need to know that the "language" parameter has index 3 and not index 2 in the result object, so that I can extract all named parameters and see what's left in the result (the unnamed ones).

var re = new XRegExp('page/edit/(.*)/(.*)/(?<language>[a-z]{2})/(?<entityId>[a-z0-9]{24})/(.*)');
var url = 'http://localhost:5000/admin/page/edit/unnamed1/en/en/4f55fbbab51bda0df1000001/unnamed2';
console.log(XRegExp.exec(url, re));

/**
0 "page/edit/unnamed1/en/e...1bda0df1000001/unnamed2"   
1 "unnamed1"    
2 "en"
3 "en"  
4 "4f55fbbab51bda0df1000001"    
5 "unnamed2"
entityId "4f55fbbab51bda0df1000001"
index 28
input "http://localhost:5000/a...1bda0df1000001/unnamed2"
language "en"
*/

Is there some trick for that kind of problem? Sorry for my poor English, and again, thanks.

Maybe this gist explains it a little better: https://gist.github.com/2035325

slevithan commented 12 years ago

Backreference numbers are assigned from left to right. Your language capturing group is the third from the left, so result.language and result[3] are equivalent.

Even though mixing named and unnamed capturing groups works fine, I do not recommend doing so because it can be confusing.
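
Because numbers are assigned purely by position, one way to separate named from unnamed captures without comparing values is to work out each group's position from the pattern source. The following is only a sketch under stated assumptions: `captureNames` is a hypothetical helper (not part of XRegExp or of native RegExp), it uses native named groups (ES2018 or later) so it runs without a library, and the URL has `en` in both the second and third segments, matching the output posted above.

```javascript
// Hypothetical helper: list capture groups in left-to-right order.
// Entry i describes capture number i + 1; null marks an unnamed group.
function captureNames(source) {
  const names = [];
  let inClass = false;
  for (let i = 0; i < source.length; i++) {
    const ch = source[i];
    if (ch === '\\') { i++; continue; }              // skip the escaped character
    if (inClass) { if (ch === ']') inClass = false; continue; }
    if (ch === '[') { inClass = true; continue; }    // parentheses in [...] are literal
    if (ch !== '(') continue;
    if (source[i + 1] !== '?') { names.push(null); continue; } // plain ( )
    const named = /^\(\?<([A-Za-z_$][\w$]*)>/.exec(source.slice(i));
    if (named) names.push(named[1]);                 // (?<name> )
    // Anything else, such as (?: ) or a lookaround, does not capture.
  }
  return names;
}

const re = new RegExp('page/edit/(.*)/(.*)/(?<language>[a-z]{2})/(?<entityId>[a-z0-9]{24})/(.*)');
const url = 'http://localhost:5000/admin/page/edit/unnamed1/en/en/4f55fbbab51bda0df1000001/unnamed2';

const match = re.exec(url);
const names = captureNames(re.source);

// Keep only the captures whose position has no name.
const unnamed = match.slice(1).filter((_, i) => names[i] === null);

console.log(names.indexOf('language') + 1); // 3
console.log(unnamed);                       // [ 'unnamed1', 'en', 'unnamed2' ]
```

The scanner only handles the syntax used in this thread (escapes, character classes, and the common group forms), so treat it as an illustration of left-to-right numbering rather than a complete pattern parser.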