tc39 / proposal-reversible-string-split

String.prototype.splitn for ECMAScript
MIT License
82 stars 4 forks source link

Will a RegExp be allowed as the first argument? #5

Open brad4d opened 2 years ago

brad4d commented 2 years ago

The current split method accepts a RegExp as the first argument. In that case matched text for defined groups are included in the returned array.

EDIT: This is the original example which was incorrect as pointed out below.

const a = 'a|b|c|d';
a.split(/|/);    // ['a', '|', 'b', '|', 'c', '|', 'd']
a.split(/|/, 2); // ['a', '|']

EDIT: This is a valid example.

const a = 'a|b|c|d';
a.split(/(\|)/);    // ['a', '|', 'b', '|', 'c', '|', 'd']
a.split(/(\|)/, 2); // ['a', '|']

Will splitn allow RegExp arguments? If so, how will it treat them?

nicolo-ribaudo commented 2 years ago

The matched text is not included: you need to escape | in your regular expressions.

brad4d commented 2 years ago

Doh!

nevermind sorry.

brad4d commented 2 years ago

Well, it still needs to be clarified whether RegExps are allowed, right?

brad4d commented 2 years ago

Wait!, I'm not crazy after all

The "include the separators" behavior is triggered by the expression including groups.

a.split(/(\|)/);
(7) ['a', '|', 'b', '|', 'c', '|', 'd']
nicolo-ribaudo commented 2 years ago

Oh well, I didn't know about that! Then the initial question still holds completely.

brad4d commented 2 years ago

Yikes. Every group is included.

const b = 'a:sep1:b:sep2:c:sep3:d:sep4:e';
b.split(/((:)(\w+)(:))/); // ['a', ':sep1:', ':', 'sep1', ':', 'b', ':sep2:', ':', 'sep2', ':', 'c', ':sep3:', ':', 'sep3', ':', 'd', ':sep4:', ':', 'sep4', ':', 'e']

This definitely breaks the goal of having split be the opposite of join.

lucacasonato commented 2 years ago

Will include this. Yes, it should support regexp, just like perl does for example.

Edit: I.e not add the separators into the returned array.

hax commented 2 years ago

I believe it depends on the method name we finally choose.

If it is the name "splitn" or "splitN" I think we should support regexp. If choose another name like "explode", supporting regexp is not a must.

Precedent: String.prototype.replaceAll, the early version of it only support string for the best perf and the main pain point (s.replace("foo") doesn't work as expect ) in usage, but committee require it must support regexp unless it change to some other name like "substitute" .