slevithan / xregexp

Extended JavaScript regular expressions
http://xregexp.com/
MIT License

Feature request: Character set from string array #292

Closed miloshavlicek closed 4 years ago

miloshavlicek commented 4 years ago

Feature request (for discussion): It would be useful to be able to create a new character set from a string array of allowed chars.

At the moment I use my own static function that escapes and joins the chars from a string array, but I think this could be useful to others as well.
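For illustration, a minimal sketch of what such a helper might look like in plain JavaScript. The function name `charsetFromArray` mirrors the proposed name and is not part of XRegExp; the sketch assumes every array entry is a single character and escapes regex syntax characters so that each entry is matched literally.

```javascript
// Hypothetical helper (not an XRegExp API): builds a character class
// such as [aeiou\+] from an array of single characters.
function charsetFromArray(chars) {
    const escaped = chars.map((ch) => {
        if ([...ch].length !== 1) {
            throw new TypeError(`Expected a single character, got "${ch}"`);
        }
        // Escape regex syntax characters so every entry is matched literally.
        return ch.replace(/[\\\]\[^$.*+?(){}|\/-]/, '\\$&');
    });
    return `[${escaped.join('')}]`;
}

const pattern = charsetFromArray(['a', 'e', 'i', 'o', 'u', '+']); // [aeiou\+]
const matches = 'heeeja+aola'.match(new RegExp(`${pattern}+`, 'g'));
// matches: ['eee', 'a+ao', 'a']
```

Rejecting multi-character entries keeps the result a true character class; a set of longer strings would need alternation instead.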

Example usage of the proposed API:

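const matchingChars = ['a','e','i','o','u', '+'];
const result = [];
// XRegExp.charsetFromArray is the proposed helper; it does not exist yet.
const re = XRegExp(`${XRegExp.charsetFromArray(matchingChars)}+`); // expected translation: [aeiou\+]+
let pos = 0;
let match;
// Pass a start position; otherwise exec would keep returning the first match.
while ((match = XRegExp.exec('heeeja+aola', re, pos))) {
    result.push(match[0]);
    pos = match.index + match[0].length;
}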
// expected output: ['eee', 'a+ao', 'a']

slevithan commented 4 years ago

How about:

const chars = 'aeiou+'; // or: ['a','e','i','o','u','+'].join('');
const re = XRegExp('[' + XRegExp.escape(chars) + ']+');
const result = XRegExp.match('heeeja+aola', re, 'all'); // ['eee','a+ao','a']

If it weren’t for the bug described in #192 (interpolation isn’t currently allowed inside character classes), you could replace the second line with const re = XRegExp.tag()`[${chars}]+`;.

mathiasbynens commented 4 years ago

Note that for usage with plain JS regular expressions (i.e. without XRegExp), regenerate solves this problem. Given a set of characters, it produces a valid & ASCII-safe JS regular expression pattern that matches only those characters.

regenerate('a','e','i','o','u', '+').toString();
// --> '[\\+aeiou]'
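To illustrate what "ASCII-safe" means here, a rough sketch in plain JavaScript that does not use regenerate. The helper name `asciiSafeCharClass` is hypothetical, and its output format differs from regenerate's: it writes every non-alphanumeric character as a `\u{…}` escape, so it assumes the pattern is compiled with the `u` flag and that each array entry is a single character.

```javascript
// Hypothetical helper, not part of regenerate: builds a character class
// whose source text contains only ASCII characters.
function asciiSafeCharClass(chars) {
    // Deduplicate and sort by code point for stable output.
    const codePoints = [...new Set(chars.map((ch) => ch.codePointAt(0)))]
        .sort((a, b) => a - b);
    const parts = codePoints.map((cp) => {
        const ch = String.fromCodePoint(cp);
        if (/[A-Za-z0-9]/.test(ch)) return ch;           // safe as-is
        return `\\u{${cp.toString(16).toUpperCase()}}`;  // escape everything else
    });
    return `[${parts.join('')}]`;
}

const source = asciiSafeCharClass(['a', 'e', 'i', 'o', 'u', '+', '\u00E9']);
// source: [\u{2B}aeiou\u{E9}]
const found = 'h\u00E9eeja+aola'.match(new RegExp(`${source}+`, 'gu'));
```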

https://repl.it/repls/BackThirstySoftwareengineer